Re: TRAP: FailedAssertion("HaveRegisteredOrActiveSnapshot()", File: "toast_internals.c", Line: 670, PID: 19403)

Поиск
Список
Период
Сортировка
Hi,

On 2022-04-14 12:16:45 -0400, Robert Haas wrote:
> I got curious and looked at the underlying problem here and I am
> wondering whether HaveRegisteredOrActiveSnapshot() is just buggy. It
> seems to me that the code is always going to return true if there are
> any active snapshots, and the rest of the function is intended to test
> whether there is a registered snapshot other than the catalog
> snapshot. But I don't think that's what this code does:
> 
>      if (pairingheap_is_empty(&RegisteredSnapshots) ||
>          !pairingheap_is_singular(&RegisteredSnapshots))
>          return false;
> 
>      return CatalogSnapshot == NULL;

Certainly looks off...


> I find that 'make check-world' passes with this change, which is
> disturbing, because it also passes without this change. That means we
> don't have any tests that reach HaveRegisteredOrActiveSnapshot() with
> more than one registered snapshot.

Part of that is because right now the assertion is placed "too deep" -
it should be much higher up, so it's reached even if there's not
actually a toast datum. But there's of other bugs preventing that :(. A
lot of bugs have been hidden by the existence of CatalogSnapshot (which
of course isn't something one actually can rely on).


> Also, unless we have plans to use HaveRegisteredOrActiveSnapshot() in
> more places,

I think we should, but there's the other bugs that need to be fixed
first :(. Namely that we have plenty places doing catalog accesses
without an active or registered snapshot :(.


> I think we should consider revising the whole approach here. The way
> init_toast_snapshot() is coded, we basically have some code that tests
> for active and registered snapshots and finds the oldest one. We error
> out if there isn't one. And then we get to this assertion, which
> checks the same stuff a second time but with an additional check to
> see whether we ended up with the catalog snapshot.  Wouldn't it make
> more sense if GetOldestSnapshot() just refused to return the catalog
> snapshot in the first place?

I'm worried that that could cause additional bugs. Consider code using
GetOldestSnapshot() to check if tuples need to be preserved or such.

Greetings,

Andres Freund



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Robert Haas
Дата:
Сообщение: Re: fix cost subqueryscan wrong parallel cost
Следующее
От: Andres Freund
Дата:
Сообщение: Re: TRAP: FailedAssertion("HaveRegisteredOrActiveSnapshot()", File: "toast_internals.c", Line: 670, PID: 19403)