Re: Why is src/test/modules/committs/t/002_standby.pl flaky?
От | Alexander Lakhin |
---|---|
Тема | Re: Why is src/test/modules/committs/t/002_standby.pl flaky? |
Дата | |
Msg-id | d6633d6c-e547-52d1-0798-a7848583b501@gmail.com обсуждение исходный текст |
Ответ на | Re: Why is src/test/modules/committs/t/002_standby.pl flaky? (Tom Lane <tgl@sss.pgh.pa.us>) |
Ответы |
Re: Why is src/test/modules/committs/t/002_standby.pl flaky?
(Thomas Munro <thomas.munro@gmail.com>)
|
Список | pgsql-hackers |
> 11.01.2022 23:16, Tom Lane wrote: >> Thomas Munro <thomas.munro@gmail.com> writes: >>> Ouch. I think our options at this point are: >>> 1. Revert 6051857fc (and put it back when we have a working >>> long-lived WES as I showed). This is not very satisfying, now that we >>> understand the bug, because even without that change I guess you must >>> be able to reach the hanging condition by using Windows postgres_fdw >>> to talk to a non-Windows server (ie a normal TCP stack with graceful >>> shutdown/linger on process exit). >> It'd be worth checking, perhaps. One thing I've been wondering all >> along is how much of this behavior is specific to the local-loopback >> case where Windows can see both ends of the connection. You'd think >> that they couldn't long get away with such blatant violations of the >> TCP specs when talking to external servers, because the failures >> would be visible to everyone with a web browser. I've split my test (both parts attached) and run it on two virtual machines with clean builds from master (ac7c8075) on both (just the debugging output added to connection.c). I provide probably redundant info (also see attached screenshot) just to make sure that I didn't make a mistake. The excerpt from 001_disconnection1_local.log: ... 2022-01-12 09:29:48.099 MSK|Administrator|postgres|61de755a.a54|LOG: pgfdw_get_result: before WaitLatchOrSocket 2022-01-12 09:29:48.099 MSK|Administrator|postgres|61de755a.a54|STATEMENT: SELECT * FROM large WHERE a = fx2(a) 2022-01-12 09:29:48.100 MSK|Administrator|postgres|61de755a.a54|LOG: pgfdw_get_result: after WaitLatchOrSocket 2022-01-12 09:29:48.100 MSK|Administrator|postgres|61de755a.a54|STATEMENT: SELECT * FROM large WHERE a = fx2(a) 2022-01-12 09:29:48.100 MSK|Administrator|postgres|61de755a.a54|LOG: pgfdw_get_result: before WaitLatchOrSocket 2022-01-12 09:29:48.100 MSK|Administrator|postgres|61de755a.a54|STATEMENT: SELECT * FROM large WHERE a = fx2(a) 2022-01-12 09:29:48.100 MSK|Administrator|postgres|61de755a.a54|LOG: pgfdw_get_result: after WaitLatchOrSocket 2022-01-12 09:29:48.100 MSK|Administrator|postgres|61de755a.a54|STATEMENT: SELECT * FROM large WHERE a = fx2(a) 2022-01-12 09:29:48.100 MSK|Administrator|postgres|61de755a.a54|ERROR: FATAL: terminating connection due to administrator command server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. 2022-01-12 09:29:48.100 MSK|Administrator|postgres|61de755a.a54|CONTEXT: remote SQL command: FETCH 100 FROM c1 2022-01-12 09:29:48.100 MSK|Administrator|postgres|61de755a.a54|WARNING: no connection to the server 2022-01-12 09:29:48.100 MSK|Administrator|postgres|61de755a.a54|CONTEXT: remote SQL command: ABORT TRANSACTION 2022-01-12 09:29:48.107 MSK|Administrator|postgres|61de755a.a54|LOG: disconnection: session time: 0:00:01.577 user=Administrator database=postgres host=127.0.0.1 port=49752 2022-01-12 09:29:48.257 MSK|[unknown]|[unknown]|61de755c.a4c|LOG: connection received: host=127.0.0.1 port=49754 2022-01-12 09:29:48.261 MSK|Administrator|postgres|61de755c.a4c|LOG: connection authenticated: identity="WIN-FCPSOVMM1JC\Administrator" method=sspi (C:/src/postgrespro/contrib/postgres_fdw/tmp_check/t_001_disconnection1_local_data/pgdata/pg_hba.conf:2) 2022-01-12 09:29:48.261 MSK|Administrator|postgres|61de755c.a4c|LOG: connection authorized: user=Administrator database=postgres application_name=001_disconnection1.pl 2022-01-12 09:29:48.263 MSK|Administrator|postgres|61de755c.a4c|LOG: statement: SELECT * FROM large WHERE a = fx2(a) 2022-01-12 09:29:48.285 MSK|Administrator|postgres|61de755c.a4c|LOG: pgfdw_get_result: before WaitLatchOrSocket 2022-01-12 09:29:48.285 MSK|Administrator|postgres|61de755c.a4c|STATEMENT: SELECT * FROM large WHERE a = fx2(a) ... By the look of things, you are right and this is the localhost-only issue. I've rechecked that the test 001_disconnection.pl (local-loopback version) hangs on both machines while 001_disconnection1.pl performs successfully in both directions. I'm not sure whether the Windows client and non-Windows server or reverse combinations are of interest in light of the above. Best regards, Alexander
Вложения
В списке pgsql-hackers по дате отправления:
Предыдущее
От: Tom LaneДата:
Сообщение: Re: cfbot wrangling (was Re: Add checkpoint and redo LSN to LogCheckpointEnd log message)
Следующее
От: Julien RouhaudДата:
Сообщение: Re: Proposal: allow database-specific role memberships