Re: [PoC] Improve dead tuple storage for lazy vacuum

From: Masahiko Sawada
Subject: Re: [PoC] Improve dead tuple storage for lazy vacuum
Msg-id: CAD21AoDAKmUG6LFvbzWqEgvXZBgYoap2yL74N8QkVgzg1cW-Tg@mail.gmail.com
In reply to: Re: [PoC] Improve dead tuple storage for lazy vacuum  (John Naylor <john.naylor@enterprisedb.com>)
Responses: Re: [PoC] Improve dead tuple storage for lazy vacuum  (John Naylor <john.naylor@enterprisedb.com>)
List: pgsql-hackers
On Thu, Feb 23, 2023 at 6:41 PM John Naylor
<john.naylor@enterprisedb.com> wrote:
>
> I ran a couple "in situ" tests on server hardware using UUID columns, since they are common in the real world and have bad correlation to heap order, so they are a challenge for index vacuum.

Thank you for the test!

>
> === test 1, delete everything from a small table, with very small maintenance_work_mem:
>
> alter system set shared_buffers ='4GB';
> alter system set max_wal_size ='10GB';
> alter system set checkpoint_timeout ='30 min';
> alter system set autovacuum =off;
>
> -- unrealistically low
> alter system set maintenance_work_mem = '32MB';
>
> create table if not exists test (x uuid);
> truncate table test;
> insert into test (x) select gen_random_uuid() from generate_series(1,50*1000*1000);
> create index on test (x);
>
> delete from test;
> vacuum (verbose, truncate off) test;
> --
>
> master:
> INFO:  finished vacuuming "john.naylor.public.test": index scans: 9
> system usage: CPU: user: 70.04 s, system: 19.85 s, elapsed: 802.06 s
>
> v29 patch:
> INFO:  finished vacuuming "john.naylor.public.test": index scans: 1
> system usage: CPU: user: 9.80 s, system: 2.62 s, elapsed: 36.68 s
>
> This is a bit artificial, but it's easy to construct cases where the array leads to multiple index scans but the new tidstore can fit everything without breaking a sweat. I didn't save the progress reporting, but v29 was using about 11MB for tid storage.

Cool.
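As a side note, the pass count reported for master lines up with simple arithmetic, assuming the old array representation spends one 6-byte ItemPointerData per dead tuple (a back-of-the-envelope check, not a claim about the exact allocator behavior):

```python
import math

# Why master needs 9 index scans in test 1: the dead-TID array is capped
# by maintenance_work_mem, and each dead tuple costs 6 bytes there.
dead_tuples = 50_000_000
tid_bytes = 6  # sizeof(ItemPointerData)
maintenance_work_mem = 32 * 1024 * 1024  # 32MB

tids_per_pass = maintenance_work_mem // tid_bytes  # ~5.6 million TIDs fit
index_scans = math.ceil(dead_tuples / tids_per_pass)
print(index_scans)  # 9, matching the reported "index scans: 9"
```

With only ~5.6 million TIDs fitting per pass, 50 million dead tuples force 9 rounds of index vacuuming, which is where most of the 802 s goes.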

>
>
> === test 2: try to stress tid lookup with production maintenance_work_mem:
> 1. use unlogged table to reduce noise
> 2. vacuum freeze first to reduce heap scan time
> 3. delete some records at the beginning and end of heap to defeat binary search's pre-check
>
> alter system set shared_buffers ='4GB';
> alter system set max_wal_size ='10GB';
> alter system set checkpoint_timeout ='30 min';
> alter system set autovacuum =off;
>
> alter system set maintenance_work_mem = '1GB';
>
> create unlogged table if not exists test (x uuid);
> truncate table test;
> insert into test (x) select gen_random_uuid() from generate_series(1,1000*1000*1000);
> vacuum (freeze) test;
>
> select pg_size_pretty(pg_table_size('test'));
>  pg_size_pretty
> ----------------
>  41 GB
>
> create index on test (x);
>
> select pg_size_pretty(pg_total_relation_size('test'));
>  pg_size_pretty
> ----------------
>  71 GB
>
> select max(ctid) from test;
>      max
> --------------
>  (5405405,75)
>
> delete from test where ctid <  '(100000,0)'::tid;
> delete from test where ctid > '(5300000,0)'::tid;
>
> vacuum (verbose, truncate off) test;
>
> both:
> INFO:  vacuuming "john.naylor.public.test"
> INFO:  finished vacuuming "john.naylor.public.test": index scans: 1
> index scan needed: 205406 pages from table (3.80% of total) had 38000000 dead item identifiers removed
>
> --
> master:
> system usage: CPU: user: 134.32 s, system: 19.24 s, elapsed: 286.14 s
>
> v29 patch:
> system usage: CPU: user:  97.71 s, system: 45.78 s, elapsed: 573.94 s

So with v29, the vacuum took twice as long (286 s vs. 573 s)?
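On the "binary search's pre-check" mentioned in the test 2 setup: presumably this is a cheap bounds comparison against the first and last entries of the sorted dead-TID array before the full bsearch. A rough sketch of the idea (illustrative names, not the actual PostgreSQL code):

```python
import bisect

# Hypothetical sketch: the dead-item array is sorted by TID, so a lookup can
# bail out early when the target falls outside [first, last].
def tid_is_dead(dead_tids, tid):
    if not dead_tids:
        return False
    # Fast pre-check: two comparisons before the O(log n) binary search.
    if tid < dead_tids[0] or tid > dead_tids[-1]:
        return False
    i = bisect.bisect_left(dead_tids, tid)
    return i < len(dead_tids) and dead_tids[i] == tid

# Deleting rows at both the beginning AND end of the heap (as in test 2)
# makes [first, last] span nearly the whole table, so the pre-check almost
# never fires and every index tuple pays for the full binary search.
dead = [(blk, off) for blk in range(0, 3) for off in range(1, 4)] + \
       [(blk, off) for blk in range(100, 103) for off in range(1, 4)]
print(tid_is_dead(dead, (1, 2)))   # True: a deleted TID near the start
print(tid_is_dead(dead, (50, 1)))  # False, but only after the full search
```

That would explain why vac_cmp_itemptr and bsearch dominate master's profile below.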

>
> The entire vacuum took 25% less wall clock time. Reminder that this is without wal logging, and also unscientific because it was only one run.
>
> --
> I took 10 seconds of perf data while index vacuuming was going on (showing calls > 2%):
>
> master:
>   40.59%  postgres  postgres            [.] vac_cmp_itemptr
>   24.97%  postgres  libc-2.17.so        [.] bsearch
>    6.67%  postgres  postgres            [.] btvacuumpage
>    4.61%  postgres  [kernel.kallsyms]   [k] copy_user_enhanced_fast_string
>    3.48%  postgres  postgres            [.] PageIndexMultiDelete
>    2.67%  postgres  postgres            [.] vac_tid_reaped
>    2.03%  postgres  postgres            [.] compactify_tuples
>    2.01%  postgres  libc-2.17.so        [.] __memcpy_ssse3_back
>
> v29 patch:
>
>   29.22%  postgres  postgres            [.] TidStoreIsMember
>    9.30%  postgres  postgres            [.] btvacuumpage
>    7.76%  postgres  postgres            [.] PageIndexMultiDelete
>    6.31%  postgres  [kernel.kallsyms]   [k] copy_user_enhanced_fast_string
>    5.60%  postgres  postgres            [.] compactify_tuples
>    4.26%  postgres  libc-2.17.so        [.] __memcpy_ssse3_back
>    4.12%  postgres  postgres            [.] hash_search_with_hash_value
>
> --
> master:
> psql -c "select phase, heap_blks_total, heap_blks_scanned, max_dead_tuples, num_dead_tuples from pg_stat_progress_vacuum"
>        phase       | heap_blks_total | heap_blks_scanned | max_dead_tuples | num_dead_tuples
> -------------------+-----------------+-------------------+-----------------+-----------------
>  vacuuming indexes |         5405406 |           5405406 |       178956969 |        38000000
>
> v29 patch:
> psql -c "select phase, heap_blks_total, heap_blks_scanned, max_dead_tuple_bytes, dead_tuple_bytes from pg_stat_progress_vacuum"
>        phase       | heap_blks_total | heap_blks_scanned | max_dead_tuple_bytes | dead_tuple_bytes
> -------------------+-----------------+-------------------+----------------------+------------------
>  vacuuming indexes |         5405406 |           5405406 |           1073670144 |          8678064
>
> Here, the old array pessimistically needs 1GB allocated (as for any table > ~5GB), but only fills 228MB for tid lookup. The patch reports 8.7MB. Tables that only fit, say, 30-50 tuples per page will have less extreme differences in memory use. Same for the case where only a couple dead items occur per page, with many uninteresting pages in between. Even so, the allocation will be much more accurately sized in the patch, especially in non-parallel vacuum.

Agreed.
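The reported figures are internally consistent, again assuming 6 bytes per dead TID in the array representation (a sanity check on the numbers above, not on the patch itself):

```python
# Cross-checking the pg_stat_progress_vacuum numbers from test 2.
tid_bytes = 6  # sizeof(ItemPointerData)

array_alloc = 178_956_969 * tid_bytes  # master's max_dead_tuples capacity
array_used = 38_000_000 * tid_bytes    # num_dead_tuples actually stored
patch_used = 8_678_064                 # v29's reported dead_tuple_bytes

print(array_alloc)              # 1_073_741_814 bytes: the ~1GB allocation
print(array_used)               # 228_000_000 bytes: the ~228MB actually filled
print(array_used / patch_used)  # the patch's 8.7MB is roughly 26x smaller
```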

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com


