Re: Slow standby snapshot


From	Kirill Reshke
Subject	Re: Slow standby snapshot
Date	20 May 2021, 12:16:39
Msg-id	CALdSSPj5Be=y3gjvJ7FDrnR1_chejZCO-7EE0f40BvxfbAWCQg@mail.gmail.com
In reply to	Slow standby snapshot (Кирилл Решке <reshkekirill@gmail.com>)
Replies	Re: Slow standby snapshot
List	pgsql-hackers


Sorry, I forgot to attach the patch to the previous message.

On Thu, 20 May 2021 at 13:52, Кирилл Решке <reshkekirill@gmail.com> wrote:

Hi,
I recently ran into a problem in one of our production PostgreSQL clusters. I noticed lock contention on the procarray lock (ProcArrayLock) on the standby, which causes WAL replay lag to grow.
To reproduce this, you can do the following:

1) set max_connections to a large number, like 100000
2) begin a transaction on the primary and leave it open
3) start a pgbench workload on the primary and on the standby

After a while, KnownAssignedXidsGetAndSetXmin shows up in perf top consuming about 75% of CPU.

%%
PerfTop: 1060 irqs/sec kernel: 0.0% exact: 0.0% [4000Hz cycles:u], (target_pid: 273361)
-------------------------------------------------------------------------------

73.92% postgres [.] KnownAssignedXidsGetAndSetXmin
1.40% postgres [.] base_yyparse
0.96% postgres [.] LWLockAttemptLock
0.84% postgres [.] hash_search_with_hash_value
0.84% postgres [.] AtEOXact_GUC
0.72% postgres [.] ResetAllOptions
0.70% postgres [.] AllocSetAlloc
0.60% postgres [.] _bt_compare
0.55% postgres [.] core_yylex
0.42% libc-2.27.so [.] __strlen_avx2
0.23% postgres [.] LWLockRelease
0.19% postgres [.] MemoryContextAllocZeroAligned
0.18% postgres [.] expression_tree_walker.part.3
0.18% libc-2.27.so [.] __memmove_avx_unaligned_erms
0.17% postgres [.] PostgresMain
0.17% postgres [.] palloc
0.17% libc-2.27.so [.] _int_malloc
0.17% postgres [.] set_config_option
0.17% postgres [.] ScanKeywordLookup
0.16% postgres [.] _bt_checkpage

%%

We tried to fix this by using a Bitmapset instead of the boolean array KnownAssignedXidsValid, but this does not help much.

Using a doubly linked list instead helps a little more: we got +1000 tps on the pgbench workload with patched PostgreSQL. The general idea of this patch is that, instead of remembering which elements in KnownAssignedXids are valid, we maintain a doubly linked list of them. This solution works in exactly the same way, except that taking a snapshot on the replica is now O(running transactions) instead of O(head - tail), which is significantly faster under some workloads. The patch reduces CPU usage of KnownAssignedXidsGetAndSetXmin to ~48% instead of ~74%, but does not eliminate it from perf top.
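To illustrate the difference in snapshot cost, here is a toy Python model. It is not the PostgreSQL code and not the attached patch; the class names FlagArray and LinkedArray and the visited counter are made up for illustration. It only assumes what is described above: xids live in an array between tail and head, and a long-running transaction keeps the oldest slot valid so tail cannot advance.

```python
class FlagArray:
    """Toy model: xids in an array plus a parallel validity flag array."""

    def __init__(self):
        self.xids = []
        self.valid = []
        self.tail = 0  # oldest slot that may still be valid

    def add(self, xid):
        self.xids.append(xid)
        self.valid.append(True)
        return len(self.xids) - 1  # slot index, acts as "head - 1"

    def remove(self, slot):
        self.valid[slot] = False
        # tail can only move past invalid slots
        while self.tail < len(self.xids) and not self.valid[self.tail]:
            self.tail += 1

    def snapshot(self):
        # Visits every slot in [tail, head), valid or not.
        result, visited = [], 0
        for i in range(self.tail, len(self.xids)):
            visited += 1
            if self.valid[i]:
                result.append(self.xids[i])
        return result, visited


class LinkedArray:
    """Toy model: same array, but valid slots are chained by prev/next indexes."""

    def __init__(self):
        self.xids, self.prev, self.next = [], [], []
        self.first = self.last = -1  # -1 means "no slot"

    def add(self, xid):
        slot = len(self.xids)
        self.xids.append(xid)
        self.prev.append(self.last)
        self.next.append(-1)
        if self.last != -1:
            self.next[self.last] = slot
        else:
            self.first = slot
        self.last = slot
        return slot

    def remove(self, slot):
        # Unlink the slot in O(1).
        p, n = self.prev[slot], self.next[slot]
        if p != -1:
            self.next[p] = n
        else:
            self.first = n
        if n != -1:
            self.prev[n] = p
        else:
            self.last = p

    def snapshot(self):
        # Visits only slots that are still linked, i.e. still running.
        result, visited = [], 0
        i = self.first
        while i != -1:
            visited += 1
            result.append(self.xids[i])
            i = self.next[i]
        return result, visited


# 1000 xids; the oldest one (slot 0) stays open, as in step 2 of the repro.
flags, links = FlagArray(), LinkedArray()
for xid in range(100, 1100):
    flags.add(xid)
    links.add(xid)
for slot in range(1, 998):  # everything between the oldest and the newest two ends
    flags.remove(slot)
    links.remove(slot)

flag_result, flag_visited = flags.snapshot()
link_result, link_visited = links.snapshot()
print(flag_result == link_result, flag_visited, link_visited)
```

Both structures return the same three xids, but the flag array walks all 1000 slots while the linked version walks 3. The real code has additional concerns (shared memory, locking, array compression) that this sketch ignores.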

The problem is easier to reproduce on PG13, since PG14 has some snapshot optimizations.

Thanks!

Best regards, reshke

Attachments

UseDoublyLinkedListInKnowAssingedXods.patch
