Discussion: Startup process on a hot standby crashes with an error "invalid memory alloc request size 1073741824" while replaying "Standby/LOCK" records

Greetings!

One of our clients experienced a crash of the startup process with the error "invalid memory alloc request size 1073741824" on a hot standby, which ended in a replica reinit.

According to the logs, the startup process crashed while trying to replay a "Standby/LOCK" record with a huge list of locks (see attached replicalog_tail.tgz):

FATAL:  XX000: invalid memory alloc request size 1073741824
CONTEXT:  WAL redo at 7/327F9248 for Standby/LOCK: xid 1638575 db 7550635 rel 8500880 xid 1638575 db 7550635 rel 10324499...
LOCATION:  repalloc, mcxt.c:1075
BACKTRACE:  
        postgres: startup recovering 000000010000000700000033(repalloc+0x61) [0x8d7611]
        postgres: startup recovering 000000010000000700000033() [0x691c29]
        postgres: startup recovering 000000010000000700000033() [0x691c74]
        postgres: startup recovering 000000010000000700000033(lappend+0x16) [0x691e76]
        postgres: startup recovering 000000010000000700000033(StandbyAcquireAccessExclusiveLock+0xdd) [0x7786bd]
        postgres: startup recovering 000000010000000700000033(standby_redo+0x5d) [0x7789ed]
        postgres: startup recovering 000000010000000700000033(StartupXLOG+0x1055) [0x51d7c5]
        postgres: startup recovering 000000010000000700000033(StartupProcessMain+0xcd) [0x71d65d]
        postgres: startup recovering 000000010000000700000033(AuxiliaryProcessMain+0x40c) [0x52c7cc]
        postgres: startup recovering 000000010000000700000033() [0x71a62e]
        postgres: startup recovering 000000010000000700000033(PostmasterMain+0xe74) [0x71cf74]
        postgres: startup recovering 000000010000000700000033(main+0x70d) [0x4891ad]
        /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f0be2db3555]
LOG:  00000: startup process (PID 2650) exited with exit code 1

It looks like the startup process at some point hits the MaxAllocSize limit (memutils.h), which forbids allocation requests of more than 1 GB minus 1 byte.
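
For reference, the limit and the check that produce this error look roughly like this (trimmed from src/include/utils/memutils.h and src/backend/utils/mmgr/mcxt.c; the exact code varies a little between versions):

        /* memutils.h */
        #define MaxAllocSize    ((Size) 0x3fffffff)     /* 1 gigabyte - 1 */
        #define AllocSizeIsValid(size)  ((Size) (size) <= MaxAllocSize)

        /* mcxt.c */
        void *
        repalloc(void *pointer, Size size)
        {
            if (!AllocSizeIsValid(size))
                elog(ERROR, "invalid memory alloc request size %zu", size);
            /* ... actually resize the chunk ... */
        }

So a single repalloc() request of exactly 1073741824 bytes (1 GB) is already over the limit.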

Judging by the pg_waldump output, there was a long-running transaction on the primary that sequentially locked and modified a lot of tables. Right before the crash there were about 85k exclusively locked tables.

Trying to reproduce the issue, I found out that the problem is not so much the number of locks as the duration of the transaction. While replaying "Standby/LOCK" records, the startup process eventually crashes with the mentioned error if a long transaction holds a large number of locks for long enough.

I managed to reproduce the situation on 13.7, 14.4, 15beta3 and master using the following steps:
1) set up a primary and a replica with the following settings: max_locks_per_transaction = '10000' and max_connections = '1000'
2) create 950k tables
3) lock them in AccessExclusive mode in a transaction and leave it in the "idle in transaction" state
4) generate some side activity with pgbench (pgbench -R 100 -P 5 -T 7200 -c 1)
In about 20-30 minutes the startup process crashes with the same error.

As far as I understand, there is a fixed number of AccessExclusive locks in this scenario: 950k exclusive locks acquired by the "long running" transaction and no additional exclusive locks held by pgbench. But the startup process consumes more and more memory while replaying records that contain exactly the same list of locks. Could it be a memory leak? If not, is there any way to improve this behavior?

If you're going to reproduce it, set up a primary and a replica with enough RAM and simultaneously run on the primary:
$ pgbench -i && pgbench -R 100 -P 5 -T 7200 -c 1    (in one terminal)
$ psql -f 950k_locks.sql                            (in another terminal)
and observe the startup process's memory usage and the replica's logs.

Best regards,
Dmitry Kuzmin
On Mon, 5 Sept 2022 at 22:38, Dmitriy Kuzmin <kuzmin.db4@gmail.com> wrote:
> One of our clients experienced a crash of the startup process with the error "invalid memory alloc request size 1073741824" on a hot standby, which ended in a replica reinit.
>
> According to the logs, the startup process crashed while trying to replay a "Standby/LOCK" record with a huge list of locks (see attached replicalog_tail.tgz):
>
> FATAL:  XX000: invalid memory alloc request size 1073741824
> CONTEXT:  WAL redo at 7/327F9248 for Standby/LOCK: xid 1638575 db 7550635 rel 8500880 xid 1638575 db 7550635 rel 10324499...
> LOCATION:  repalloc, mcxt.c:1075
> BACKTRACE:
>         postgres: startup recovering 000000010000000700000033(repalloc+0x61) [0x8d7611]
>         postgres: startup recovering 000000010000000700000033() [0x691c29]
>         postgres: startup recovering 000000010000000700000033() [0x691c74]
>         postgres: startup recovering 000000010000000700000033(lappend+0x16) [0x691e76]

This must be the repalloc() in enlarge_list(). 1073741824 / 8 is
134,217,728 (2^27). That's quite a bit more than 1 lock per your 950k
tables.
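
For context, a List keeps its cells in one contiguous array that is doubled with repalloc() when it fills up, so the failed request size is the new capacity times sizeof(ListCell). A simplified sketch of enlarge_list() from src/backend/nodes/list.c (13+; not the exact code):

        static void
        enlarge_list(List *list, int min_size)
        {
            /* grow to the next power of two that fits min_size cells */
            int     new_max_len = pg_nextpower2_32(Max(16, min_size));

            /* 2^27 cells * 8-byte ListCell = 1073741824 bytes, the failing request */
            list->elements = (ListCell *)
                repalloc(list->elements, new_max_len * sizeof(ListCell));
            list->max_length = new_max_len;
        }

So to trigger that doubling, the locks list must already have held just over 2^26 (~67 million) entries, roughly 70 times the 950k locks actually taken.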

I wonder why the RecoveryLockListsEntry.locks list is getting so long.

From the file you attached, I see:
$ cat replicalog_tail | grep -Eio "rel\s([0-9]+)" | wc -l
950000

So that confirms there were 950k relations in the xl_standby_locks.
The contents of that message seem to be produced by standby_desc().
That should be the same WAL record that's processed by standby_redo()
which adds the 950k locks to the RecoveryLockListsEntry.
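
For reference, that redo path looks roughly like this (trimmed from src/backend/storage/ipc/standby.c as of 13-15; not exact):

        void
        standby_redo(XLogReaderState *record)
        {
            uint8   info = XLogRecGetInfo(record) & ~XLR_INFO_MASK;

            if (info == XLOG_STANDBY_LOCK)
            {
                xl_standby_locks *xlrec = (xl_standby_locks *) XLogRecGetData(record);

                for (int i = 0; i < xlrec->nlocks; i++)
                    StandbyAcquireAccessExclusiveLock(xlrec->locks[i].xid,
                                                      xlrec->locks[i].dbOid,
                                                      xlrec->locks[i].relOid);
            }
            /* ... other standby record types ... */
        }

        void
        StandbyAcquireAccessExclusiveLock(TransactionId xid, Oid dbOid, Oid relOid)
        {
            RecoveryLockListsEntry *entry;
            xl_standby_lock *newlock;
            bool        found;

            /* find or create the per-xid entry, then append this lock to its list */
            entry = hash_search(RecoveryLockLists, &xid, HASH_ENTER, &found);
            if (!found)
            {
                entry->xid = xid;
                entry->locks = NIL;
            }

            newlock = palloc(sizeof(xl_standby_lock));
            newlock->xid = xid;
            newlock->dbOid = dbOid;
            newlock->relOid = relOid;
            entry->locks = lappend(entry->locks, newlock);  /* the lappend() in the backtrace */

            /* ... SET_LOCKTAG_RELATION() + LockAcquire(..., AccessExclusiveLock, ...) ... */
        }

Note that, as far as I can see, the append does not check whether an identical xid/dbOid/relOid entry is already in the list.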

I'm not seeing why 950k becomes 134m.

David



Thanks, David!

Let me know if there's any additional information I could provide.

Best regards,
Dmitry Kuzmin
