Re: refactoring basebackup.c
От | tushar |
---|---|
Тема | Re: refactoring basebackup.c |
Дата | |
Msg-id | 6c3f1558-1e56-9946-78a2-c59340da1dbf@enterprisedb.com обсуждение исходный текст |
Ответ на | Re: refactoring basebackup.c (Jeevan Ladhe <jeevan.ladhe@enterprisedb.com>) |
Список | pgsql-hackers |
On 1/18/22 8:12 PM, Jeevan Ladhe wrote: > Similar to LZ4 server-side compression, I have also tried to add a ZSTD > server-side compression in the attached patch. Thanks Jeevan. while testing found one scenario where the server is getting crash while performing pg_basebackup against server-compression=zstd for a huge data second time Steps to reproduce --PG sources ( apply v11-0001,v11-0001,v9-0001,v9-0002 , configure --with-lz4,--with-zstd, make/install, initdb, start server) --insert huge data (./pgbench -i -s 2000 postgres) --restart the server (./pg_ctl -D data restart) --pg_basebackup ( ./pg_basebackup -t server:/tmp/yc1 --server-compression=zstd -R -Xnone -n -N -l 'ccc' --no-estimate-size -v) --insert huge data (./pgbench -i -s 1000 postgres) --restart the server (./pg_ctl -D data restart) --run pg_basebackup again (./pg_basebackup -t server:/tmp/yc11 --server-compression=zstd -v -Xnone ) [edb@centos7tushar bin]$ ./pg_basebackup -t server:/tmp/yc11 --server-compression=zstd -v -Xnone pg_basebackup: initiating base backup, waiting for checkpoint to complete 2022-01-19 21:23:26.508 IST [30219] LOG: checkpoint starting: force wait 2022-01-19 21:23:26.608 IST [30219] LOG: checkpoint complete: wrote 0 buffers (0.0%); 0 WAL file(s) added, 1 removed, 0 recycled; write=0.001 s, sync=0.001 s, total=0.101 s; sync files=0, longest=0.000 s, average=0.000 s; distance=16369 kB, estimate=16369 kB pg_basebackup: checkpoint completed TRAP: FailedAssertion("len > 0 && len <= sink->bbs_buffer_length", File: "../../../src/include/replication/basebackup_sink.h", Line: 208, PID: 30226) postgres: walsender edb [local] sending backup "pg_basebackup base backup"(ExceptionalCondition+0x7a)[0x94ceca] postgres: walsender edb [local] sending backup "pg_basebackup base backup"[0x7b9a08] postgres: walsender edb [local] sending backup "pg_basebackup base backup"[0x7b9be2] postgres: walsender edb [local] sending backup "pg_basebackup base backup"[0x7b5b30] postgres: walsender edb [local] sending backup "pg_basebackup base backup"(SendBaseBackup+0x563)[0x7b7053] postgres: walsender edb [local] sending backup "pg_basebackup base backup"(exec_replication_command+0x961)[0x7c9a41] postgres: walsender edb [local] sending backup "pg_basebackup base backup"(PostgresMain+0x92f)[0x81ca3f] postgres: walsender edb [local] sending backup "pg_basebackup base backup"[0x48e430] postgres: walsender edb [local] sending backup "pg_basebackup base backup"(PostmasterMain+0xfd2)[0x785702] postgres: walsender edb [local] sending backup "pg_basebackup base backup"(main+0x1c6)[0x48fb96] /lib64/libc.so.6(__libc_start_main+0xf5)[0x7f63642c8555] postgres: walsender edb [local] sending backup "pg_basebackup base backup"[0x48feb5] pg_basebackup: error: could not read COPY data: server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. 2022-01-19 21:25:34.485 IST [30205] LOG: server process (PID 30226) was terminated by signal 6: Aborted 2022-01-19 21:25:34.485 IST [30205] DETAIL: Failed process was running: BASE_BACKUP ( LABEL 'pg_basebackup base backup', PROGRESS, MANIFEST 'yes', TABLESPACE_MAP, TARGET 'server', TARGET_DETAIL '/tmp/yc11', COMPRESSION 'zstd') 2022-01-19 21:25:34.485 IST [30205] LOG: terminating any other active server processes [edb@centos7tushar bin]$ 2022-01-19 21:25:34.489 IST [30205] LOG: all server processes terminated; reinitializing 2022-01-19 21:25:34.536 IST [30228] LOG: database system was interrupted; last known up at 2022-01-19 21:23:26 IST 2022-01-19 21:25:34.669 IST [30228] LOG: database system was not properly shut down; automatic recovery in progress 2022-01-19 21:25:34.671 IST [30228] LOG: redo starts at 9/7000028 2022-01-19 21:25:34.671 IST [30228] LOG: invalid record length at 9/7000148: wanted 24, got 0 2022-01-19 21:25:34.671 IST [30228] LOG: redo done at 9/7000110 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s 2022-01-19 21:25:34.673 IST [30229] LOG: checkpoint starting: end-of-recovery immediate wait 2022-01-19 21:25:34.713 IST [30229] LOG: checkpoint complete: wrote 3 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.003 s, sync=0.001 s, total=0.041 s; sync files=2, longest=0.001 s, average=0.001 s; distance=0 kB, estimate=0 kB 2022-01-19 21:25:34.718 IST [30205] LOG: database system is ready to accept connections Observation - if we change server-compression method to lz4 from zstd then it is NOT happening. [edb@centos7tushar bin]$ ./pg_basebackup -t server:/tmp/ycc1 --server-compression=lz4 -v -Xnone pg_basebackup: initiating base backup, waiting for checkpoint to complete 2022-01-19 21:27:51.642 IST [30229] LOG: checkpoint starting: force wait 2022-01-19 21:27:51.687 IST [30229] LOG: checkpoint complete: wrote 0 buffers (0.0%); 0 WAL file(s) added, 1 removed, 0 recycled; write=0.001 s, sync=0.001 s, total=0.046 s; sync files=0, longest=0.000 s, average=0.000 s; distance=16383 kB, estimate=16383 kB pg_basebackup: checkpoint completed NOTICE: WAL archiving is not enabled; you must ensure that all required WAL segments are copied through other means to complete the backup pg_basebackup: base backup completed [edb@centos7tushar bin]$ -- regards,tushar EnterpriseDB https://www.enterprisedb.com/ The Enterprise PostgreSQL Company
В списке pgsql-hackers по дате отправления:
Предыдущее
От: Tom LaneДата:
Сообщение: Re: Replace uses of deprecated Python module distutils.sysconfig