Re: [PING] [PATCH v2] parallel pg_restore: avoid disk seeks when jumping short distance forward
From | Tom Lane |
---|---|
Subject | Re: [PING] [PATCH v2] parallel pg_restore: avoid disk seeks when jumping short distance forward |
Date | |
Msg-id | 863353.1760409885@sss.pgh.pa.us |
In response to | Re: [PING] [PATCH v2] parallel pg_restore: avoid disk seeks when jumping short distance forward (Chao Li <li.evan.chao@gmail.com>) |
Responses | Re: [PING] [PATCH v2] parallel pg_restore: avoid disk seeks when jumping short distance forward |
List | pgsql-hackers |
Chao Li <li.evan.chao@gmail.com> writes:
>> On Oct 14, 2025, at 08:36, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> The thing we are really interested in here is how fast pg_restore
>> can skip over unwanted table data in a large archive file, and that
>> I believe should be pretty sensitive to block size.

> Not sure if I did something wrong, but I still don’t see much difference between buffer size 4K and 128K with your suggested test.
>
> % time pg_dump -Fc -f db.dump evantest

This won't show the effect, because pg_dump will be able to go back
and insert data offsets into the dump's TOC, so pg_restore can just
seek to where the data is.  See upthread discussion about what's
needed to provoke Dimitrios' problem.

I tried this very tiny (relatively speaking) test case:

regression=# create database d1;
CREATE DATABASE
regression=# \c d1
You are now connected to database "d1" as user "postgres".
d1=# create table alpha as select repeat(random()::text, 1000) from generate_series(1,1000000);
SELECT 1000000
d1=# create table omega as select 42 as x;
SELECT 1
d1=# \q

Then

$ pg_dump -Fc d1 | cat >d1.dump
$ time pg_restore -f /dev/null -t omega d1.dump

The point of the pipe-to-cat is to reproduce Dimitrios' problem case
with no data offsets in the TOC.  Then the restore is doing about the
simplest thing I can think of to make it skip over most of the archive
file.  Also, I'm intentionally using the default choice of gzip
because that already responds to DEFAULT_IO_BUFFER_SIZE properly.
(This test is with current HEAD, no patches except adjusting
DEFAULT_IO_BUFFER_SIZE.)

I got these timings:

DEFAULT_IO_BUFFER_SIZE = 1K

real    0m0.020s
user    0m0.002s
sys     0m0.017s

DEFAULT_IO_BUFFER_SIZE = 4K

real    0m0.014s
user    0m0.003s
sys     0m0.011s

DEFAULT_IO_BUFFER_SIZE = 128K

real    0m0.002s
user    0m0.000s
sys     0m0.002s

This test case has only about 50MB worth of compressed data, so of
course the times are very small; scaling it up to gigabytes would
yield more impressive results.  But the effect is clearly visible.

			regards, tom lane
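
As a rough illustration of why the buffer size matters in the test above: when the TOC carries no data offsets, the unwanted table data has to be read through and discarded rather than jumped over with a seek, so the number of read calls grows as the buffer shrinks. The sketch below is not pg_restore's code; `skip_forward` is a hypothetical helper, an in-memory stream stands in for the archive file, and it ignores the gzip layer entirely. It only counts how many reads are needed to discard a given number of bytes at the three buffer sizes tried in the message.

```python
import io


def skip_forward(stream, nbytes, buf_size):
    """Discard nbytes from a stream by reading it in buf_size chunks.

    Hypothetical helper for illustration only. Returns the number of
    read() calls issued, including a final empty read on premature EOF.
    """
    calls = 0
    remaining = nbytes
    while remaining > 0:
        chunk = stream.read(min(buf_size, remaining))
        calls += 1
        if not chunk:  # stream ended before nbytes were consumed
            break
        remaining -= len(chunk)
    return calls


if __name__ == "__main__":
    size = 8 * 1024 * 1024  # stand-in for the unwanted table data
    for buf_size in (1024, 4096, 128 * 1024):
        calls = skip_forward(io.BytesIO(bytes(size)), size, buf_size)
        print(f"{buf_size:>7} bytes per read: {calls} reads")
```

Each read in the real program is a system call, which is presumably why the "sys" component dominates the timings at the smaller buffer sizes.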