Re: Slow standby snapshot

From Kirill Reshke
Subject Re: Slow standby snapshot
Date
Msg-id CALdSSPj5Be=y3gjvJ7FDrnR1_chejZCO-7EE0f40BvxfbAWCQg@mail.gmail.com
In reply to Slow standby snapshot  (Кирилл Решке <reshkekirill@gmail.com>)
Responses Re: Slow standby snapshot  (Michail Nikolaev <michail.nikolaev@gmail.com>)
List pgsql-hackers
Sorry, I forgot to attach the patch to the previous message.


On Thu, 20 May 2021 at 13:52, Кирилл Решке <reshkekirill@gmail.com> wrote:
Hi,
I recently ran into a problem in one of our production PostgreSQL clusters. I noticed lock contention on the procarray lock on the standby, which causes WAL replay lag to grow.
To reproduce this, you can do the following:

1) set max_connections to a large number, such as 100000
2) begin a transaction on the primary
3) start a pgbench workload on the primary and on the standby

After a while, perf top will show KnownAssignedXidsGetAndSetXmin consuming about 75% of CPU.

%%
  PerfTop:    1060 irqs/sec  kernel: 0.0%  exact:  0.0% [4000Hz cycles:u],  (target_pid: 273361)
-------------------------------------------------------------------------------

    73.92%  postgres       [.] KnownAssignedXidsGetAndSetXmin
     1.40%  postgres       [.] base_yyparse
     0.96%  postgres       [.] LWLockAttemptLock
     0.84%  postgres       [.] hash_search_with_hash_value
     0.84%  postgres       [.] AtEOXact_GUC
     0.72%  postgres       [.] ResetAllOptions
     0.70%  postgres       [.] AllocSetAlloc
     0.60%  postgres       [.] _bt_compare
     0.55%  postgres       [.] core_yylex
     0.42%  libc-2.27.so   [.] __strlen_avx2
     0.23%  postgres       [.] LWLockRelease
     0.19%  postgres       [.] MemoryContextAllocZeroAligned
     0.18%  postgres       [.] expression_tree_walker.part.3
     0.18%  libc-2.27.so   [.] __memmove_avx_unaligned_erms
     0.17%  postgres       [.] PostgresMain
     0.17%  postgres       [.] palloc
     0.17%  libc-2.27.so   [.] _int_malloc
     0.17%  postgres       [.] set_config_option
     0.17%  postgres       [.] ScanKeywordLookup
     0.16%  postgres       [.] _bt_checkpage

%%


We tried to fix this by using a Bitmapset instead of the boolean array KnownAssignedXidsValid, but this did not help much.
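For context, here is a minimal sketch of why the scan is expensive. It is a simplified stand-alone model, not the PostgreSQL source: the names xid_t, KnownXidsModel and scan_valid_xids are hypothetical, and locking and xmin/xmax handling are left out. It only shows that the loop visits every slot between tail and head, however few of them are still valid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t xid_t;         /* hypothetical stand-in for TransactionId */

/*
 * Simplified model of the known-assigned-XIDs array on a standby:
 * slots [tail, head) hold XIDs, and valid[i] says whether slot i still
 * refers to a running transaction.
 */
typedef struct KnownXidsModel
{
    const xid_t *xids;
    const bool  *valid;
    size_t       tail;
    size_t       head;
} KnownXidsModel;

/*
 * Copy every valid XID into out[] and return how many were copied.
 * *visited receives the number of slots inspected, which is always
 * head - tail, no matter how many entries are still valid.
 */
static size_t
scan_valid_xids(const KnownXidsModel *m, xid_t *out, size_t *visited)
{
    size_t      count = 0;

    assert(m->tail <= m->head);
    *visited = 0;

    for (size_t i = m->tail; i < m->head; i++)
    {
        (*visited)++;
        if (m->valid[i])        /* skip gaps left by finished transactions */
            out[count++] = m->xids[i];
    }
    return count;
}
```

In this model, replacing the boolean array with a bitmap makes the validity data smaller, but the loop still runs from tail to head.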

Using a doubly linked list instead helps a little more: we got +1000 tps on the pgbench workload with patched PostgreSQL. The general idea of the patch is that, instead of recording which elements of KnownAssignedXids are valid, we maintain a doubly linked list of them. This solution works in exactly the same way, except that taking a snapshot on the replica is now O(running transactions) instead of O(head - tail), which is significantly faster under some workloads. The patch reduces the CPU usage of KnownAssignedXidsGetAndSetXmin from ~74% to ~48%, but does not eliminate it from perf top.
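To illustrate the idea, here is a minimal stand-alone sketch of an index-based doubly linked list over array slots. It is not the attached patch: XidList, the xl_* functions and the fixed capacity are hypothetical, and locking and compaction are omitted. It only shows that removal is O(1) and that collecting XIDs visits valid entries only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t xid_t;         /* hypothetical stand-in for TransactionId */

#define XL_CAPACITY 16          /* hypothetical fixed capacity */
#define XL_NONE     (-1)        /* "no neighbour" marker */

typedef struct XidList
{
    xid_t       xids[XL_CAPACITY];
    int         prev[XL_CAPACITY];  /* previous valid slot, or XL_NONE */
    int         next[XL_CAPACITY];  /* next valid slot, or XL_NONE */
    int         first;              /* oldest valid slot, or XL_NONE */
    int         last;               /* newest valid slot, or XL_NONE */
    int         head;               /* next free slot in xids[] */
} XidList;

static void
xl_init(XidList *l)
{
    l->first = l->last = XL_NONE;
    l->head = 0;
}

/* Append an XID at the head and link it after the newest valid slot. */
static int
xl_add(XidList *l, xid_t xid)
{
    int         slot = l->head;

    assert(slot < XL_CAPACITY);
    l->head++;
    l->xids[slot] = xid;
    l->prev[slot] = l->last;
    l->next[slot] = XL_NONE;
    if (l->last != XL_NONE)
        l->next[l->last] = slot;
    else
        l->first = slot;
    l->last = slot;
    return slot;
}

/* Unlink a slot in O(1); must be called at most once per linked slot. */
static void
xl_remove(XidList *l, int slot)
{
    int         p = l->prev[slot];
    int         n = l->next[slot];

    if (p != XL_NONE)
        l->next[p] = n;
    else
        l->first = n;
    if (n != XL_NONE)
        l->prev[n] = p;
    else
        l->last = p;
}

/* Collect valid XIDs; *visited counts list nodes touched, not slots. */
static size_t
xl_collect(const XidList *l, xid_t *out, size_t *visited)
{
    size_t      count = 0;

    *visited = 0;
    for (int i = l->first; i != XL_NONE; i = l->next[i])
    {
        (*visited)++;
        out[count++] = l->xids[i];
    }
    return count;
}
```

With eight XIDs added and the six middle ones removed, xl_collect touches two nodes, whereas a tail-to-head scan over the same slots would touch all eight. The cost is two extra integers per slot and extra link maintenance on every add and remove.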

The problem reproduces more readily on PG13, since PG14 includes some snapshot optimizations.

Thanks!

Best regards, reshke