Discussion: BF mamba failure
Hi,

Peter Smith has recently reported a BF failure [1]. AFAICS, the call
stack of failure [2] is as follows:

0x1e66644 <ExceptionalCondition+0x8c> at postgres
0x1d0143c <pgstat_release_entry_ref+0x4c0> at postgres
0x1d02534 <pgstat_get_entry_ref+0x780> at postgres
0x1cfb120 <pgstat_prep_pending_entry+0x8c> at postgres
0x1cfd590 <pgstat_report_disconnect+0x34> at postgres
0x1cfbfc0 <pgstat_shutdown_hook+0xd4> at postgres
0x1ca7b08 <shmem_exit+0x7c> at postgres
0x1ca7c74 <proc_exit_prepare+0x70> at postgres
0x1ca7d2c <proc_exit+0x18> at postgres
0x1cdf060 <PostgresMain+0x584> at postgres
0x1c203f4 <ServerLoop.isra.0+0x1e88> at postgres
0x1c2161c <PostmasterMain+0xfa4> at postgres
0x1edcf94 <main+0x254> at postgres

I couldn't correlate it to the recent commits. Any thoughts?

[1] - https://www.postgresql.org/message-id/CAHut%2BPsHdWFjU43VEX%2BR-8de6dFQ-_JWrsqs%3DvWek1hULexP4Q%40mail.gmail.com
[2] - https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mamba&dt=2023-03-17%2005%3A36%3A10

--
With Regards,
Amit Kapila.
Amit Kapila <amit.kapila16@gmail.com> writes:
> Peter Smith has recently reported a BF failure [1]. AFAICS, the call
> stack of failure [2] is as follows:

Note the assertion report a few lines further up:

TRAP: failed Assert("pg_atomic_read_u32(&entry_ref->shared_entry->refcount) == 0"), File: "pgstat_shmem.c", Line: 560, PID: 25004

			regards, tom lane
Hi,
18.03.2023 07:26, Tom Lane wrote:
> Amit Kapila <amit.kapila16@gmail.com> writes:
>> Peter Smith has recently reported a BF failure [1]. AFAICS, the call
>> stack of failure [2] is as follows:
>
> Note the assertion report a few lines further up:
>
> TRAP: failed Assert("pg_atomic_read_u32(&entry_ref->shared_entry->refcount) == 0"), File: "pgstat_shmem.c", Line: 560, PID: 25004
This assertion failure can be reproduced easily with the attached patch:
============== running regression test queries ==============
test oldest_xmin ... ok 55 ms
test oldest_xmin ... FAILED (test process exited with exit code 1) 107 ms
test oldest_xmin ... FAILED (test process exited with exit code 1) 8 ms
============== shutting down postmaster ==============
contrib/test_decoding/output_iso/log/postmaster.log contains:
TRAP: failed Assert("pg_atomic_read_u32(&entry_ref->shared_entry->refcount) == 0"), File: "pgstat_shmem.c", Line: 561, PID: 456844
With the sleep placed above Assert(entry_ref->shared_entry->dropped) this Assert fails too.
Best regards,
Alexander
Attachments
On Sun, Mar 19, 2023 at 2:00 AM Alexander Lakhin <exclusion@gmail.com> wrote:
>
> Hi,
>
> 18.03.2023 07:26, Tom Lane wrote:
> > Amit Kapila <amit.kapila16@gmail.com> writes:
> > Peter Smith has recently reported a BF failure [1]. AFAICS, the call
> > stack of failure [2] is as follows:
> >
> > Note the assertion report a few lines further up:
> >
> > TRAP: failed Assert("pg_atomic_read_u32(&entry_ref->shared_entry->refcount) == 0"), File: "pgstat_shmem.c", Line: 560, PID: 25004
>
> This assertion failure can be reproduced easily with the attached patch:
>
> ============== running regression test queries ==============
> test oldest_xmin ... ok 55 ms
> test oldest_xmin ... FAILED (test process exited with exit code 1) 107 ms
> test oldest_xmin ... FAILED (test process exited with exit code 1) 8 ms
> ============== shutting down postmaster ==============
>
> contrib/test_decoding/output_iso/log/postmaster.log contains:
> TRAP: failed Assert("pg_atomic_read_u32(&entry_ref->shared_entry->refcount) == 0"), File: "pgstat_shmem.c", Line: 561, PID: 456844
>
> With the sleep placed above Assert(entry_ref->shared_entry->dropped) this Assert fails too.
>
> Best regards,
> Alexander

I used a slightly modified* patch of Alexander's [1] applied to the
latest HEAD code (but with my "toptxn" patch reverted).

--- the patch was modified in that I injected 'sleep' both above and
below the Assert(entry_ref->shared_entry->dropped).

Using this I was also able to reproduce the problem. But test failures
were rare. The make check-world seemed OK, and indeed the
test_decoding tests would also appear to PASS around 14 out of 15
times.

============== running regression test queries ==============
test oldest_xmin ... ok 342 ms
test oldest_xmin ... ok 121 ms
test oldest_xmin ... ok 283 ms
============== shutting down postmaster ==============
============== removing temporary instance ==============

=====================
 All 3 tests passed.
=====================

~~

Often (but not always) despite the test_decoding reporting PASS with
all 3 tests as "ok", I still observed there was a TRAP in the logfile
(contrib/test_decoding/output_iso/log/postmaster.log).

TRAP: failed Assert("entry_ref->shared_entry->dropped")

~~

Occasionally (about 1 in 15 test runs) the test would fail the same
way as described by Alexander [1], with the accompanying TRAP.

TRAP: failed Assert("pg_atomic_read_u32(&entry_ref->shared_entry->refcount) == 0"), File: "pgstat_shmem.c", Line: 562, PID: 32013

============== running regression test queries ==============
test oldest_xmin ... ok 331 ms
test oldest_xmin ... ok 91 ms
test oldest_xmin ... FAILED 702 ms
============== shutting down postmaster ==============

======================
 1 of 3 tests failed.
======================

~~

FWIW, the "toptxn" patch, whose push coincided with the build-farm
error I first reported [2], turns out to be an innocent party in this
TRAP. We know this because all of the above results were produced
using HEAD code but with that "toptxn" patch reverted.

------
[1] https://www.postgresql.org/message-id/1941b7e2-be7c-9c4c-8505-c0fd05910e9a%40gmail.com
[2] https://www.postgresql.org/message-id/CAHut%2BPsHdWFjU43VEX%2BR-8de6dFQ-_JWrsqs%3DvWek1hULexP4Q%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia
Hello hackers,

20.03.2023 09:10, Peter Smith wrote:
>
> Using this I was also able to reproduce the problem. But test failures
> were rare. The make check-world seemed OK, and indeed the
> test_decoding tests would also appear to PASS around 14 out of 15
> times.

I've stumbled upon this assertion failure again during testing
following cd312adc5.

This time I've simplified the reproducer to the attached modification.
With this patch applied, `make -s check -C contrib/test_decoding`
fails on master as below:

ok 1 - pgstat_rc_1 14 ms
not ok 2 - pgstat_rc_2 1351 ms

contrib/test_decoding/output_iso/log/postmaster.log contains:
TRAP: failed Assert("pg_atomic_read_u32(&entry_ref->shared_entry->refcount) == 0"), File: "pgstat_shmem.c", Line: 562, PID: 1130928

With extra logging added, I see the following events happening:

1) pgstat_rc_1.setup calls pgstat_create_replslot(), gets
ReplicationSlotIndex(slot) = 0 and calls
pgstat_get_entry_ref_locked(PGSTAT_KIND_REPLSLOT, InvalidOid, 0, 0).

2) pgstat_rc_1.s0_get_changes executes pg_logical_slot_get_changes(...)
and then calls pgstat_gc_entry_refs on shmem_exit() ->
pgstat_shutdown_hook() ...;
with the sleep added inside pgstat_release_entry_ref, this backend
waits after decreasing entry_ref->shared_entry->refcount to 0.

3) pgstat_rc_1.stop removes the replication slot.

4) pgstat_rc_2.setup calls pgstat_create_replslot(), gets
ReplicationSlotIndex(slot) = 0 and calls
pgstat_get_entry_ref_locked(PGSTAT_KIND_REPLSLOT, InvalidOid, 0, 0),
which leads to the call pgstat_reinit_entry(), which increases refcount
for the same shared_entry as in (1) and (2), and then to the call
pgstat_acquire_entry_ref(), which increases refcount once more.

5) the backend from (2) reaches
Assert(pg_atomic_read_u32(&entry_ref->shared_entry->refcount) == 0),
which fails due to refcount = 2.

Best regards,
Alexander
Attachments
On Wed, Jun 12, 2024 at 08:00:00PM +0300, Alexander Lakhin wrote:
> With extra logging added, I see the following events happening:
> 1) pgstat_rc_1.setup calls pgstat_create_replslot(), gets
> ReplicationSlotIndex(slot) = 0 and calls
> pgstat_get_entry_ref_locked(PGSTAT_KIND_REPLSLOT, InvalidOid, 0, 0).
>
> 2) pgstat_rc_1.s0_get_changes executes pg_logical_slot_get_changes(...)
> and then calls pgstat_gc_entry_refs on shmem_exit() ->
> pgstat_shutdown_hook() ...;
> with the sleep added inside pgstat_release_entry_ref, this backend waits
> after decreasing entry_ref->shared_entry->refcount to 0.
>
> 3) pgstat_rc_1.stop removes the replication slot.
>
> 4) pgstat_rc_2.setup calls pgstat_create_replslot(), gets
> ReplicationSlotIndex(slot) = 0 and calls
> pgstat_get_entry_ref_locked(PGSTAT_KIND_REPLSLOT, InvalidOid, 0, 0),
> which leads to the call pgstat_reinit_entry(), which increases refcount
> for the same shared_entry as in (1) and (2), and then to the call
> pgstat_acquire_entry_ref(), which increases refcount once more.
>
> 5) the backend from (2) reaches
> Assert(pg_atomic_read_u32(&entry_ref->shared_entry->refcount) == 0),
> which fails due to refcount = 2.

Thanks for the details. So this comes down to the point that we offer
no guarantee that the stats entry a backend sees at shutdown is the
same as the one it wants to clean up. That's the same problem as what
Floris has reported here, with an OID wraparound and tables getting
the same hash key. That can happen for all stats kinds:
https://www.postgresql.org/message-id/b14ae28029f64757bb64613be2549a74@Optiver.com

I don't think that this is going to fly far except if we introduce a
concept of "generation" or "age" in the stats entries.
The idea is simple: when a stats entry is reinitialized because of a
drop&create, increment a counter to tell that this is a new
generation, and keep track of it in *both* PgStat_EntryRef (local
backend reference to the shmem stats entry) *and*
PgStatShared_HashEntry (the central one). When releasing an entry, if
we know that the shared entry we are pointing at is not of the same
generation as the local reference, it means that the entry has been
reused for something else with the same hash key, so give up.

It should not be that invasive, still it means ABI breakage in the two
pgstats internal structures I am mentioning, which is OK for a
backpatch as this stuff is internal. On top of that, this problem
means that we can silently and randomly lose stats, which is not cool.

Note that Noah has been working on a better integration of injection
points with isolation tests. We could reuse that here to have a test
case, with an injection point waiting around the pg_usleep() you have
hardcoded:
https://www.postgresql.org/message-id/20240614003549.c2.nmisch@google.com

I'll try to give it a go on Monday.
--
Michael
Attachments
On Fri, Jun 14, 2024 at 02:31:37PM +0900, Michael Paquier wrote:
> I don't think that this is going to fly far except if we introduce a
> concept of "generation" or "age" in the stats entries. The idea is
> simple: when a stats entry is reinitialized because of a drop&create,
> increment a counter to tell that this is a new generation, and keep
> track of it in *both* PgStat_EntryRef (local backend reference to the
> shmem stats entry) *and* PgStatShared_HashEntry (the central one).
> When releasing an entry, if we know that the shared entry we are
> pointing at is not of the same generation as the local reference, it
> means that the entry has been reused for something else with the same
> hash key, so give up. It should not be that invasive, still it means
> ABI breakage in the two pgstats internal structures I am mentioning,
> which is OK for a backpatch as this stuff is internal. On top of
> that, this problem means that we can silently and randomly lose stats,
> which is not cool.
>
> I'll try to give it a go on Monday.

Here you go. The patch introduces what I've named an "age" counter
attached to the shared entry references, and copied over to the local
references. The counter is initialized at 0 and incremented each time
an entry is reused, then when attempting to drop an entry we
cross-check the version held locally with the shared one.
While looking at the whole, this is close to a concept patch sent
previously, where a counter is used in the shared entry with a
cross-check done with the local reference, that was posted here
(noticed that today):
https://www.postgresql.org/message-id/20230603.063418.767871221863527769.horikyota.ntt@gmail.com

The logic is different though, as we don't need to care about the
contents of the local cache when cross-checking the "age" count when
retrieving the contents, just the case where a backend would attempt
to drop an entry it thinks is OK to operate on, that got reused
because of the effect of other backends doing creates and drops with
the same hash key.

This idea needs more eyes, so I am adding that to the next CF for now.
I've played with it for a few hours and concurrent replication slot
drops/creates, without breaking it. I have not implemented an
isolation test for this case, as it depends on where we are going with
their integration with injection points.
--
Michael