Andres Freund wrote:
> On 2014-03-31 09:19:12 -0300, Alvaro Herrera wrote:
> > Andres Freund wrote:
> > > On 2014-03-31 08:54:53 -0300, Alvaro Herrera wrote:
> > > > My conclusion here is that some part of the code is failing to examine
> > > > XMAX_INVALID before looking at the value stored in xmax itself. There
> > > > ought to be a short-circuit. Fortunately, this bug should be pretty
> > > > harmless.
> > > >
> > > > .. and after looking, I'm fairly sure the bug is in
> > > > heap_tuple_needs_freeze.
> > >
> > > heap_tuple_needs_freeze() isn't *allowed* to look at
> > > XMAX_INVALID. Otherwise it could miss freezing something still visible
> > > on a standby or after an eventual crash.
> >
> > Ah, you're right. It even says so on the comment at the top (no
> > caffeine yet.) But what it's doing is still buggy, per this report, so
> > we need to do *something* ...
>
> Are you sure needs_freeze() is the problem here?
>
> IIRC it already does some checks for allow_old? Why is the check for
> that not sufficient?

GetMultiXactIdMembers has this:

	if (MultiXactIdPrecedes(multi, oldestMXact))
	{
		ereport(allow_old ? DEBUG1 : ERROR,
				(errcode(ERRCODE_INTERNAL_ERROR),
				 errmsg("MultiXactId %u does no longer exist -- apparent wraparound",
						multi)));
		return -1;
	}

	if (!MultiXactIdPrecedes(multi, nextMXact))
		ereport(ERROR,
				(errcode(ERRCODE_INTERNAL_ERROR),
				 errmsg("MultiXactId %u has not been created yet -- apparent wraparound",
						multi)));

I guess I wasn't expecting that too-old values would last longer than a
full wraparound cycle. Maybe the right fix is just to make the second
check conditional on allow_old as well.

Anyway, it's not clear to me why this database has a multixact value of
6 million when the next multixact value is barely above one million.
Stephen said a wraparound here is not likely.
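
To make the arithmetic concrete, here is a rough standalone sketch of how
the two checks classify a value. It is not the actual multixact.c code:
multixact_precedes() mirrors the modular comparison that
MultiXactIdPrecedes does, while classify_multi(), the MxactCheck enum and
the sample numbers are hypothetical names and values made up for
illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t MultiXactId;

/*
 * Modular (wraparound-aware) comparison: a precedes b when the signed
 * 32-bit difference is negative. Same idea as MultiXactIdPrecedes.
 */
static bool
multixact_precedes(MultiXactId a, MultiXactId b)
{
	return (int32_t) (a - b) < 0;
}

/* Hypothetical result codes, for this sketch only. */
typedef enum
{
	MXACT_OK,
	MXACT_TOO_OLD,			/* first check: softened by allow_old */
	MXACT_NOT_YET_CREATED	/* second check: always ERROR today */
} MxactCheck;

/* Apply the two range checks in the same order as the code quoted above. */
static MxactCheck
classify_multi(MultiXactId multi, MultiXactId oldest, MultiXactId next)
{
	if (multixact_precedes(multi, oldest))
		return MXACT_TOO_OLD;
	if (!multixact_precedes(multi, next))
		return MXACT_NOT_YET_CREATED;
	return MXACT_OK;
}
```

Assuming an oldest value of 1 and a next value of 1000001 (both invented
here), classify_multi(6000000, 1, 1000001) returns MXACT_NOT_YET_CREATED:
6 million does not precede the oldest value, so the allow_old branch is
never reached, and it does not precede the next value either, so the
unconditional ERROR fires.
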
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services