Re: [PoC] Improve dead tuple storage for lazy vacuum
From: Masahiko Sawada
Subject: Re: [PoC] Improve dead tuple storage for lazy vacuum
Date:
Msg-id: CAD21AoDAKmUG6LFvbzWqEgvXZBgYoap2yL74N8QkVgzg1cW-Tg@mail.gmail.com
In reply to: Re: [PoC] Improve dead tuple storage for lazy vacuum (John Naylor <john.naylor@enterprisedb.com>)
Responses: Re: [PoC] Improve dead tuple storage for lazy vacuum (John Naylor <john.naylor@enterprisedb.com>)
List: pgsql-hackers
On Thu, Feb 23, 2023 at 6:41 PM John Naylor <john.naylor@enterprisedb.com> wrote:
>
> I ran a couple "in situ" tests on server hardware using UUID columns, since they are common in the real world and have bad correlation to heap order, so are a challenge for index vacuum.

Thank you for the test!

>
> === test 1, delete everything from a small table, with very small maintenance_work_mem:
>
> alter system set shared_buffers = '4GB';
> alter system set max_wal_size = '10GB';
> alter system set checkpoint_timeout = '30 min';
> alter system set autovacuum = off;
>
> -- unrealistically low
> alter system set maintenance_work_mem = '32MB';
>
> create table if not exists test (x uuid);
> truncate table test;
> insert into test (x) select gen_random_uuid() from generate_series(1,50*1000*1000);
> create index on test (x);
>
> delete from test;
> vacuum (verbose, truncate off) test;
> --
>
> master:
> INFO:  finished vacuuming "john.naylor.public.test": index scans: 9
> system usage: CPU: user: 70.04 s, system: 19.85 s, elapsed: 802.06 s
>
> v29 patch:
> INFO:  finished vacuuming "john.naylor.public.test": index scans: 1
> system usage: CPU: user: 9.80 s, system: 2.62 s, elapsed: 36.68 s
>
> This is a bit artificial, but it's easy to construct cases where the array leads to multiple index scans but the new tidstore can fit everything without breaking a sweat. I didn't save the progress reporting, but v29 was using about 11MB for tid storage.

Cool.
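As a rough sanity check, assuming master stores one 6-byte ItemPointerData per dead tuple in a flat array capped by maintenance_work_mem (consistent with the vac_cmp_itemptr/bsearch profile further down), the nine index scans follow from simple arithmetic:

#include <stdio.h>

int main(void)
{
    /* Assumption: one 6-byte ItemPointerData per dead tuple, stored in a
     * flat array whose size is bounded by maintenance_work_mem. */
    const long itemptr_size = 6;                 /* sizeof(ItemPointerData) */
    const long m_w_m = 32L * 1024 * 1024;        /* maintenance_work_mem = '32MB' */
    const long dead_tuples = 50L * 1000 * 1000;  /* test 1 deletes every row */

    long tids_per_pass = m_w_m / itemptr_size;   /* ~5.6 million TIDs per pass */
    long index_scans = (dead_tuples + tids_per_pass - 1) / tids_per_pass;

    printf("TIDs per pass: %ld\n", tids_per_pass);
    printf("index scans:   %ld\n", index_scans); /* prints 9, matching master's log */
    return 0;
}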
>
> === test 2: try to stress tid lookup with production maintenance_work_mem:
> 1. use unlogged table to reduce noise
> 2. vacuum freeze first to reduce heap scan time
> 3. delete some records at the beginning and end of heap to defeat binary search's pre-check
>
> alter system set shared_buffers = '4GB';
> alter system set max_wal_size = '10GB';
> alter system set checkpoint_timeout = '30 min';
> alter system set autovacuum = off;
>
> alter system set maintenance_work_mem = '1GB';
>
> create unlogged table if not exists test (x uuid);
> truncate table test;
> insert into test (x) select gen_random_uuid() from generate_series(1,1000*1000*1000);
> vacuum freeze test;
>
> select pg_size_pretty(pg_table_size('test'));
>  pg_size_pretty
> ----------------
>  41 GB
>
> create index on test (x);
>
> select pg_size_pretty(pg_total_relation_size('test'));
>  pg_size_pretty
> ----------------
>  71 GB
>
> select max(ctid) from test;
>      max
> --------------
>  (5405405,75)
>
> delete from test where ctid < '(100000,0)'::tid;
> delete from test where ctid > '(5300000,0)'::tid;
>
> vacuum (verbose, truncate off) test;
>
> both:
> INFO:  vacuuming "john.naylor.public.test"
> INFO:  finished vacuuming "john.naylor.public.test": index scans: 1
> index scan needed: 205406 pages from table (3.80% of total) had 38000000 dead item identifiers removed
>
> --
> master:
> system usage: CPU: user: 134.32 s, system: 19.24 s, elapsed: 286.14 s
>
> v29 patch:
> system usage: CPU: user: 97.71 s, system: 45.78 s, elapsed: 573.94 s

In v29, vacuum took twice as long (286 s vs. 573 s)?

> The entire vacuum took 25% less wall clock time. Reminder that this is without wal logging, and also unscientific because only one run.
>
> --
> I took 10 seconds of perf data while index vacuuming was going on (showing calls > 2%):
>
> master:
>   40.59%  postgres  postgres           [.] vac_cmp_itemptr
>   24.97%  postgres  libc-2.17.so       [.] bsearch
>    6.67%  postgres  postgres           [.] btvacuumpage
>    4.61%  postgres  [kernel.kallsyms]  [k] copy_user_enhanced_fast_string
>    3.48%  postgres  postgres           [.] PageIndexMultiDelete
>    2.67%  postgres  postgres           [.] vac_tid_reaped
>    2.03%  postgres  postgres           [.] compactify_tuples
>    2.01%  postgres  libc-2.17.so       [.] __memcpy_ssse3_back
>
> v29 patch:
>   29.22%  postgres  postgres           [.] TidStoreIsMember
>    9.30%  postgres  postgres           [.] btvacuumpage
>    7.76%  postgres  postgres           [.] PageIndexMultiDelete
>    6.31%  postgres  [kernel.kallsyms]  [k] copy_user_enhanced_fast_string
>    5.60%  postgres  postgres           [.] compactify_tuples
>    4.26%  postgres  libc-2.17.so       [.] __memcpy_ssse3_back
>    4.12%  postgres  postgres           [.] hash_search_with_hash_value
>
> --
> master:
> psql -c "select phase, heap_blks_total, heap_blks_scanned, max_dead_tuples, num_dead_tuples from pg_stat_progress_vacuum"
>        phase       | heap_blks_total | heap_blks_scanned | max_dead_tuples | num_dead_tuples
> -------------------+-----------------+-------------------+-----------------+-----------------
>  vacuuming indexes |         5405406 |           5405406 |       178956969 |        38000000
>
> v29 patch:
> psql -c "select phase, heap_blks_total, heap_blks_scanned, max_dead_tuple_bytes, dead_tuple_bytes from pg_stat_progress_vacuum"
>        phase       | heap_blks_total | heap_blks_scanned | max_dead_tuple_bytes | dead_tuple_bytes
> -------------------+-----------------+-------------------+----------------------+------------------
>  vacuuming indexes |         5405406 |           5405406 |           1073670144 |          8678064
>
> Here, the old array pessimistically needs 1GB allocated (as for any table > ~5GB), but only fills 228MB for tid lookup. The patch reports 8.7MB. Tables that only fit, say, 30-50 tuples per page will have less extreme differences in memory use. Same for the case where only a couple of dead items occur per page, with many uninteresting pages in between. Even so, the allocation will be much more accurately sized in the patch, especially in non-parallel vacuum.

Agreed.
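Under the same 6-bytes-per-TID assumption, the pg_stat_progress_vacuum numbers above line up as a sketch:

#include <stdio.h>

int main(void)
{
    /* Assumption: 6 bytes per dead-item TID in master's flat array. */
    const long m_w_m = 1024L * 1024 * 1024;     /* maintenance_work_mem = '1GB' */
    const long dead_items = 38L * 1000 * 1000;  /* dead item identifiers removed */

    /* Worst-case allocation, sized up front: within one entry of the
     * reported max_dead_tuples = 178956969. */
    printf("array capacity: %ld TIDs\n", m_w_m / 6);

    /* Actually used: ~228MB of the 1GB allocation, vs. the patch's
     * dead_tuple_bytes = 8678064 (~8.7MB) for the same 38 million TIDs. */
    printf("array filled:   %ld bytes\n", dead_items * 6);
    return 0;
}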
Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com