Hi,
On 2020-03-28 12:30:40 -0700, Andres Freund wrote:
> On 2020-02-28 22:10:52 -0800, Andres Freund wrote:
> > On 2020-02-28 21:24:59 -0800, Andres Freund wrote:
> > So, um. What happens is that doDeletion() does a catalog scan, which
> > sets a snapshot. The results of that catalog scan are then used to
> > perform modifications. But at that point there's no guarantee that we
> > still hold *any* snapshot, as e.g. invalidations can trigger the catalog
> > snapshot being released.
> >
> > I don't see how that's safe. Without ->xmin preventing that,
> > intermediate row versions that we did look up could just get vacuumed
> > away, and replaced with a different row. That does seem like a serious
> > issue?
> >
> > I think there's likely a lot of places that can hit this? What makes it
> > safe for InvalidateCatalogSnapshot()->SnapshotResetXmin() to release
> > ->xmin when there previously has been *any* catalog access? Because in
> > contrast to normal table modifications, there's currently nothing at all
> > forcing us to hold a snapshot between catalog lookups an their
> > modifications?
> >
> > Am I missing something? Or is this a fairly significant hole in our
> > arrangements?
> >
> > The easiest way to fix this would probably be to have inval.c call a
> > version of InvalidateCatalogSnapshot() that leaves the oldest catalog
> > snapshot around, but sets up things so that GetCatalogSnapshot() will
> > return a freshly taken snapshot? ISTM that pretty much every
> > InvalidateCatalogSnapshot() call within a transaction needs that behaviour?
>
> I'd still like to get some input here.
Attached is a one patch that adds assertions to detect this, and one
that puts enough workarounds in place to make the tests pass. I don't
like this much, but I thought it'd be useful for others to understand
the problem.
Greetings,
Andres Freund