RE: Perform streaming logical transactions by background workers and parallel apply
From | Zhijie Hou (Fujitsu) |
---|---|
Subject | RE: Perform streaming logical transactions by background workers and parallel apply |
Date | |
Msg-id | OS0PR01MB57164DF9FC5366024A1952D594659@OS0PR01MB5716.jpnprd01.prod.outlook.com |
In response to | Re: Perform streaming logical transactions by background workers and parallel apply (Alexander Lakhin <exclusion@gmail.com>) |
Responses |
Re: Perform streaming logical transactions by background workers and parallel apply
(Amit Kapila <amit.kapila16@gmail.com>)
Re: Perform streaming logical transactions by background workers and parallel apply (Amit Kapila <amit.kapila16@gmail.com>) |
List | pgsql-hackers |
On Wednesday, April 26, 2023 5:00 PM Alexander Lakhin <exclusion@gmail.com> wrote:
> Please look at a new anomaly that can be observed starting from 216a7848.
>
> The following script:
> echo "CREATE SUBSCRIPTION testsub CONNECTION 'dbname=nodb'
> PUBLICATION testpub WITH (connect = false);
> ALTER SUBSCRIPTION testsub ENABLE;" | psql
>
> sleep 1
> rm $PGINST/lib/libpqwalreceiver.so
> sleep 15
> pg_ctl -D "$PGDB" stop -m immediate
> grep 'TRAP:' server.log
>
> Leads to multiple assertion failures:
> CREATE SUBSCRIPTION
> ALTER SUBSCRIPTION
> waiting for server to shut down.... done
> server stopped
> TRAP: failed Assert("MyProc->backendId != InvalidBackendId"), File: "lock.c", Line: 4439, PID: 2899323
> TRAP: failed Assert("MyProc->backendId != InvalidBackendId"), File: "lock.c", Line: 4439, PID: 2899416
> TRAP: failed Assert("MyProc->backendId != InvalidBackendId"), File: "lock.c", Line: 4439, PID: 2899427
> TRAP: failed Assert("MyProc->backendId != InvalidBackendId"), File: "lock.c", Line: 4439, PID: 2899439
> TRAP: failed Assert("MyProc->backendId != InvalidBackendId"), File: "lock.c", Line: 4439, PID: 2899538
> TRAP: failed Assert("MyProc->backendId != InvalidBackendId"), File: "lock.c", Line: 4439, PID: 2899547
>
> server.log contains:
> 2023-04-26 11:00:58.797 MSK [2899300] LOG: database system is ready to accept connections
> 2023-04-26 11:00:58.821 MSK [2899416] ERROR: could not access file "libpqwalreceiver": No such file or directory
> TRAP: failed Assert("MyProc->backendId != InvalidBackendId"), File: "lock.c", Line: 4439, PID: 2899416
> postgres: logical replication apply worker for subscription 16385 (ExceptionalCondition+0x69)[0x558b2ac06d41]
> postgres: logical replication apply worker for subscription 16385 (VirtualXactLockTableCleanup+0xa4)[0x558b2aa9fd74]
> postgres: logical replication apply worker for subscription 16385 (LockReleaseAll+0xbb)[0x558b2aa9fe7d]
> postgres: logical replication apply worker for subscription 16385 (+0x4588c6)[0x558b2aa2a8c6]
> postgres: logical replication apply worker for subscription 16385 (shmem_exit+0x6c)[0x558b2aa87eb1]
> postgres: logical replication apply worker for subscription 16385 (+0x4b5faa)[0x558b2aa87faa]
> postgres: logical replication apply worker for subscription 16385 (proc_exit+0xc)[0x558b2aa88031]
> postgres: logical replication apply worker for subscription 16385 (StartBackgroundWorker+0x147)[0x558b2aa0b4d9]
> postgres: logical replication apply worker for subscription 16385 (+0x43fdc1)[0x558b2aa11dc1]
> postgres: logical replication apply worker for subscription 16385 (+0x43ff3d)[0x558b2aa11f3d]
> postgres: logical replication apply worker for subscription 16385 (+0x440866)[0x558b2aa12866]
> postgres: logical replication apply worker for subscription 16385 (+0x440e12)[0x558b2aa12e12]
> postgres: logical replication apply worker for subscription 16385 (BackgroundWorkerInitializeConnection+0x0)[0x558b2aa14396]
> postgres: logical replication apply worker for subscription 16385 (main+0x21a)[0x558b2a932e21]
>
> I understand, that removing libpqwalreceiver.so (or whole pginst/) is not
> what happens in a production environment every day, but nonetheless it's a
> new failure mode and it can produce many coredumps when testing.
>
> IIUC, that assert will fail in case of any error raised between
> ApplyWorkerMain()->logicalrep_worker_attach()->before_shmem_exit() and
> ApplyWorkerMain()->InitializeApplyWorker()->BackgroundWorkerInitializeConnectionByOid()->InitPostgres().

Thanks for reporting the issue.

I think the problem is that it tried to release locks in
logicalrep_worker_onexit() before the initialization of the process is
complete because this callback function was registered before the init phase.
So I think we can add a conditional statement before releasing locks. Please
find an attached patch.

Best Regards,
Hou zj
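For readers without the attachment, the idea of guarding the lock release can be illustrated with a small standalone model. This is only a sketch and not the attached patch: `worker_initialized`, `release_all_locks()`, `worker_on_exit()` and `run_worker()` are hypothetical names that imitate the ordering problem described above (an exit callback registered before initialization completes) and do not use PostgreSQL's real structures or APIs.

```c
#include <assert.h>
#include <stdbool.h>

#define INVALID_BACKEND_ID (-1)

/* Toy stand-ins for per-process state; these are NOT PostgreSQL's structures. */
int  backend_id = INVALID_BACKEND_ID;
bool worker_initialized = false;   /* hypothetical guard flag */
int  locks_released = 0;           /* counts how often cleanup actually ran */

/* Models lock cleanup, which assumes the process is fully initialized. */
void release_all_locks(void)
{
    assert(backend_id != INVALID_BACKEND_ID);
    locks_released++;
}

/* Models an exit callback that was registered before the init phase. */
void worker_on_exit(void)
{
    /*
     * The conditional statement: a worker that never finished initializing
     * cannot hold any locks, so the release is skipped in that case.
     */
    if (worker_initialized)
        release_all_locks();
}

/*
 * Models worker startup. Returns false when an error is raised after the
 * callback is registered but before initialization completes.
 */
bool run_worker(bool fail_before_init)
{
    backend_id = INVALID_BACKEND_ID;
    worker_initialized = false;

    /* At this point the exit callback is already registered. */
    if (fail_before_init)
    {
        worker_on_exit();          /* error path: process exits early */
        return false;
    }

    backend_id = 1;                /* initialization completes */
    worker_initialized = true;

    worker_on_exit();              /* normal exit path */
    return true;
}
```

In this model, `run_worker(true)` returns without touching `release_all_locks()`, so the assertion on `backend_id` cannot fire, while `run_worker(false)` still releases locks on the normal exit path. Without the `if (worker_initialized)` check, the early-error path would hit the assertion, which mirrors the reported TRAP.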