Discussion: Patch: dumping tables data in multiple chunks in pg_dump


Patch: dumping tables data in multiple chunks in pg_dump

From: Hannu Krosing
Date:
Attached is a patch that adds the ability to dump table data in multiple chunks.

Looking for feedback at this point:
 1) What have I missed?
 2) Should I implement something to avoid single-page chunks?

The new flag --huge-table-chunk-pages tells the directory-format dump
to dump any table whose main fork has more pages than this threshold
in multiple chunks of the given number of pages.

The main use case is speeding up parallel dumps in case of one or a
small number of HUGE tables so parts of these can be dumped in
parallel.

It will also help when the target file system has file size limits
(4GB for FAT, 5TB for GCS).
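
For illustration, a parallel directory-format dump using 1GB chunks
(131072 pages of 8kB) could be invoked roughly like this; the host is
omitted and the output path and database name are placeholders:

$ pg_dump --format=directory -j 8 --huge-table-chunk-pages=131072 \
    -f /path/to/dumpdir mydb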

Currently no tests are included in the patch, and there is no extra
documentation beyond what is printed by pg_dump --help. Also, the
pg_log_warning lines containing "CHUNKING" are there for debugging and
need to be removed before committing.

As implemented, no changes are needed in pg_restore, as all chunks are
already associated with the table in the .toc and are therefore
restored into that table.

The attached README shows how I verified that it works, as well as the
textual file created from the directory-format dump in the last step
there.

--
Hannu

Attachments

Re: Patch: dumping tables data in multiple chunks in pg_dump

From: Ashutosh Bapat
Date:
Hi Hannu,

On Tue, Nov 11, 2025 at 9:00 PM Hannu Krosing <hannuk@google.com> wrote:
> The main use case is speeding up parallel dumps in case of one or a
> small number of HUGE tables so parts of these can be dumped in
> parallel.

Have you measured speed up? Can you please share the numbers?

--
Best Wishes,
Ashutosh Bapat



Re: Patch: dumping tables data in multiple chunks in pg_dump

From: Hannu Krosing
Date:
I just ran a test by generating a 408GB table and then dumping it both ways:

$ time pg_dump --format=directory -h 10.58.80.2 -U postgres -f
/tmp/plain.dump largedb

real    39m54.968s
user    37m21.557s
sys     2m32.422s

$ time ./pg_dump --format=directory -h 10.58.80.2 -U postgres
--huge-table-chunk-pages=131072 -j 8 -f /tmp/parallel8.dump largedb

real    5m52.965s
user    40m27.284s
sys     3m53.339s

So a parallel dump with 8 workers using 1GB (128k-page) chunks runs
almost 7 times faster than the sequential dump.

This was a table that had no TOAST part. I will run some more tests
with TOASTed tables next and expect similar or better improvements.



On Wed, Nov 12, 2025 at 1:59 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
> Have you measured speed up? Can you please share the numbers?
>
> --
> Best Wishes,
> Ashutosh Bapat



Re: Patch: dumping tables data in multiple chunks in pg_dump

From: Hannu Krosing
Date:
Going up to 16 workers did not improve performance, but this is
expected, as the disk behind the database can only do 4TB/hour of
reads, which is now the bottleneck (408 GB / 352 s * 3600 ≈ 4172 GB/h).

$ time ./pg_dump --format=directory -h 10.58.80.2 -U postgres
--huge-table-chunk-pages=131072 -j 16 -f /tmp/parallel16.dump largedb
real    5m44.900s
user    53m50.491s
sys     5m47.602s

And 4 workers showed a near-linear speedup over a single worker:

hannuk@pgn2:~/work/postgres/src/bin/pg_dump$ time ./pg_dump
--format=directory -h 10.58.80.2 -U postgres
--huge-table-chunk-pages=131072 -j 4 -f /tmp/parallel4.dump largedb
real    10m32.074s
user    38m54.436s
sys     2m58.216s

The database runs on a 64 vCPU VM with 128GB RAM, so most of the 408GB
table has to be read from disk rather than served from cache.






Re: Patch: dumping tables data in multiple chunks in pg_dump

From: Hannu Krosing
Date:
Ran another test with a 53GB database where most of the data is in TOAST

CREATE TABLE just_toasted(
  id serial primary key,
  toasted1 char(2200) STORAGE EXTERNAL,
  toasted2 char(2200) STORAGE EXTERNAL,
  toasted3 char(2200) STORAGE EXTERNAL,
  toasted4 char(2200) STORAGE EXTERNAL
);

The toasted fields were added in a somewhat randomised order.
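
For reference, a minimal sketch of how such data could be generated;
this is an assumption for illustration, not the exact script used:

INSERT INTO just_toasted (toasted1, toasted2, toasted3, toasted4)
SELECT md5(random()::text), md5(random()::text),
       md5(random()::text), md5(random()::text)
  FROM generate_series(1, 5625234);
-- char(2200) pads each value to 2200 characters; with STORAGE EXTERNAL
-- these wide values are stored out of line uncompressed, so nearly all
-- of the data ends up in the TOAST table.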

Here the results are as follows:

Parallelism | chunk size (pages) | time (sec)
          1 |                  - |        240
          2 |               1000 |        129
          4 |               1000 |         64
          8 |               1000 |         36
         16 |               1000 |         30

          4 |               9095 |         78
          8 |               9095 |         42
         16 |               9095 |         42

The reason larger chunk sizes performed worse was that they often had
one or two straggler chunks left running at the end, while the other
workers had already finished and sat idle.

Detailed run results below:

hannuk@pgn2:~/work/postgres/src/bin/pg_dump$ time ./pg_dump
--format=directory -h 10.58.80.2 -U postgres -f
/tmp/ltoastdb-1-plain.dump largetoastdb
real    3m59.465s
user    3m43.304s
sys     0m15.844s

hannuk@pgn2:~/work/postgres/src/bin/pg_dump$ time ./pg_dump
--format=directory -h 10.58.80.2 -U postgres
--huge-table-chunk-pages=9095 -j 4 -f /tmp/ltoastdb-4.dump
largetoastdb
real    1m18.320s
user    3m49.236s
sys     0m19.422s

hannuk@pgn2:~/work/postgres/src/bin/pg_dump$ time ./pg_dump
--format=directory -h 10.58.80.2 -U postgres
--huge-table-chunk-pages=9095 -j 8 -f /tmp/ltoastdb-8.dump
largetoastdb
real    0m42.028s
user    3m55.299s
sys     0m24.657s

hannuk@pgn2:~/work/postgres/src/bin/pg_dump$ time ./pg_dump
--format=directory -h 10.58.80.2 -U postgres
--huge-table-chunk-pages=9095 -j 16 -f /tmp/ltoastdb-16.dump
largetoastdb
real    0m42.575s
user    4m11.011s
sys     0m26.110s

hannuk@pgn2:~/work/postgres/src/bin/pg_dump$ time ./pg_dump
--format=directory -h 10.58.80.2 -U postgres
--huge-table-chunk-pages=1000 -j 16 -f /tmp/ltoastdb-16-1kpages.dump
largetoastdb
real    0m29.641s
user    6m16.321s
sys     0m49.345s

hannuk@pgn2:~/work/postgres/src/bin/pg_dump$ time ./pg_dump
--format=directory -h 10.58.80.2 -U postgres
--huge-table-chunk-pages=1000 -j 8 -f /tmp/ltoastdb-8-1kpages.dump
largetoastdb
real    0m35.685s
user    3m58.528s
sys     0m26.729s

hannuk@pgn2:~/work/postgres/src/bin/pg_dump$ time ./pg_dump
--format=directory -h 10.58.80.2 -U postgres
--huge-table-chunk-pages=1000 -j 4 -f /tmp/ltoastdb-4-1kpages.dump
largetoastdb
real    1m3.737s
user    3m50.251s
sys     0m18.507s

hannuk@pgn2:~/work/postgres/src/bin/pg_dump$ time ./pg_dump
--format=directory -h 10.58.80.2 -U postgres
--huge-table-chunk-pages=1000 -j 2 -f /tmp/ltoastdb-2-1kpages.dump
largetoastdb
real    2m8.708s
user    3m57.018s
sys     0m18.499s




Re: Patch: dumping tables data in multiple chunks in pg_dump

From: Hannu Krosing
Date:
The reason for the small chunk sizes is that chunks are determined by
the main heap table, and that was just over 1GB:

largetoastdb=> SELECT format('%I.%I', t.schemaname, t.relname) as table_name,
       pg_table_size(t.relid) AS table_size,
       sum(pg_relation_size(i.indexrelid)) AS total_index_size,
       pg_relation_size(t.relid) AS main_table_size,
       pg_relation_size(c.reltoastrelid) AS toast_table_size,
       pg_relation_size(oi.indexrelid) AS toast_index_size,
       t.n_live_tup AS row_count,
       count(*) AS index_count,
       array_to_json(array_agg(json_build_object(i.indexrelid::regclass,
pg_relation_size(i.indexrelid))), true) AS index_info
  FROM pg_stat_user_tables t
  JOIN pg_stat_user_indexes i ON i.relid = t.relid
  JOIN pg_class c ON c.oid = t.relid
  LEFT JOIN pg_stat_sys_indexes AS oi ON oi.relid = c.reltoastrelid
 GROUP BY 1, 2, 4, 5, 6, 7
 ORDER BY 2 DESC, 7 DESC
 LIMIT 25;
┌─[ RECORD 1 ]─────┬─────────────────────────────────────┐
│ table_name       │ public.just_toasted                 │
│ table_size       │ 56718835712                         │
│ total_index_size │ 230064128                           │
│ main_table_size  │ 1191559168                          │
│ toast_table_size │ 54613336064                         │
│ toast_index_size │ 898465792                           │
│ row_count        │ 5625234                             │
│ index_count      │ 1                                   │
│ index_info       │ [{"just_toasted_pkey" : 230064128}] │
└──────────────────┴─────────────────────────────────────┘
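
As a back-of-the-envelope check (assuming the default 8kB block size),
the 1191559168-byte main fork is 145454 pages, so 9095-page chunks give
exactly 16 chunks while 1000-page chunks give 146, which is why the
smaller chunks balanced better across 16 workers:

SELECT 1191559168 / 8192        AS main_fork_pages,   -- 145454
       ceil(145454 / 9095.0)    AS chunks_at_9095,    -- 16
       ceil(145454 / 1000.0)    AS chunks_at_1000;    -- 146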




Re: Patch: dumping tables data in multiple chunks in pg_dump

From: Hannu Krosing
Date:
Added to https://commitfest.postgresql.org/patch/6219/
