Re: HeapTupleSatisfiesToast() busted? (was atomic pin/unpin causing errors)

From Andres Freund
Subject Re: HeapTupleSatisfiesToast() busted? (was atomic pin/unpin causing errors)
Msg-id 20160510230259.a2fojqtv6d76arbn@alap3.anarazel.de
In reply to Re: HeapTupleSatisfiesToast() busted? (was atomic pin/unpin causing errors)  (Jeff Janes <jeff.janes@gmail.com>)
Responses Re: HeapTupleSatisfiesToast() busted? (was atomic pin/unpin causing errors)  (Jeff Janes <jeff.janes@gmail.com>)
List pgsql-hackers
On 2016-05-10 15:53:38 -0700, Jeff Janes wrote:
> On Tue, May 10, 2016 at 2:00 PM, Andres Freund <andres@anarazel.de> wrote:
> > I think that's to blame here.  Looking at the relevant WAL record shows:
> >
> > pg_xlogdump -p /data/freund/jj -s 2/12004018 -e 2/1327EA28|grep -E 'CHECKPOINT|NEXTOID'
> > rmgr: XLOG        len (rec/tot):      4/    30, tx:          0, lsn: 2/12004018, prev 2/12003288, desc: NEXTOID 4302693
> > rmgr: XLOG        len (rec/tot):     80/   106, tx:          0, lsn: 2/12023C38, prev 2/12023C00, desc: CHECKPOINT_ONLINE redo 2/12000120; /* ... */ oid 4294501; /* ... */ online
>
> By my understanding, this is the point at which the crash occurred.

Right.

> > rmgr: XLOG        len (rec/tot):     80/   106, tx:          0, lsn: 2/1327A798, prev 2/1327A768, desc: CHECKPOINT_SHUTDOWN redo 2/1327A798; /* ... */ oid 4294501; /* ... */ shutdown
> > rmgr: XLOG        len (rec/tot):      4/    30, tx:          0, lsn: 2/1327EA08, prev 2/1327DC60, desc: NEXTOID 4302693
> >
> > (note that end-of-recovery checkpoints are logged as shutdown
> > checkpoints, pretty annoying imo)
> >
> > So I think the issue is that the 2/12023C38 checkpoint *starts* before
> > the first NEXTOID, and thus gets an earlier nextoid.
>
>
> But isn't CreateCheckPoint called at the end of the checkpoint, not
> the start of it?

No, CreateCheckPoint() does it all.


CreateCheckPoint(int flags)
{
...
    /* 1) determine redo pointer */
    WALInsertLockAcquireExclusive();
    curInsert = XLogBytePosToRecPtr(Insert->CurrBytePos);
    prevPtr = XLogBytePosToRecPtr(Insert->PrevBytePos);
    WALInsertLockRelease();
...
    /* 2) determine oid */
    LWLockAcquire(OidGenLock, LW_SHARED);
    checkPoint.nextOid = ShmemVariableCache->nextOid;
    if (!shutdown)
        checkPoint.nextOid += ShmemVariableCache->oidCount;
    LWLockRelease(OidGenLock);
...
    /* 3) actually checkpoint shared_buffers et al. */
    CheckPointGuts(checkPoint.redo, flags);
...
    /* 4) log the checkpoint */
    recptr = XLogInsert(RM_XLOG_ID,
                        shutdown ? XLOG_CHECKPOINT_SHUTDOWN :
                        XLOG_CHECKPOINT_ONLINE);
...
}


> I don't understand how it could be out-of-date at that point.  But
> obviously it is.

A checkpoint logically "starts" at 1) in the above abbreviated
CreateCheckPoint(); that's where recovery starts when starting up from
that checkpoint. But in between 1) and 4) all other backends can continue
to insert WAL, and it'll be replayed *before* the checkpoint's record
itself.  That means that if some backend generates a NEXTOID record
between 2) and 4) (with larger checkpoints we're looking at minutes to
an hour of time), its effects will temporarily apply (as in
ShmemVariableCache->nextOid is updated), but XLOG_CHECKPOINT_ONLINE's
replay will overwrite it unconditionally:
void
xlog_redo(XLogReaderState *record)
{
    else if (info == XLOG_CHECKPOINT_ONLINE)
    {
...
        /* ... but still treat OID counter as exact */
        LWLockAcquire(OidGenLock, LW_EXCLUSIVE);
        ShmemVariableCache->nextOid = checkPoint.nextOid;
        ShmemVariableCache->oidCount = 0;
        LWLockRelease(OidGenLock);
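
To make the ordering concrete, here is a toy model of the replay, not PostgreSQL code. The WalRecord type, the RecKind enum and the functions replay() and replay_example() are hypothetical names made up for this sketch; only the two OID values and LSNs are taken from the pg_xlogdump output quoted above. It assumes replay simply assigns the value carried by each record, which is what the xlog_redo() excerpt does for the online checkpoint.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: these types and names are hypothetical, not PostgreSQL's. */
typedef enum { REC_NEXTOID, REC_CHECKPOINT_ONLINE } RecKind;

typedef struct
{
    RecKind     kind;
    uint32_t    oid;    /* NEXTOID: new counter value;
                         * checkpoint: nextOid captured at step 2) */
} WalRecord;

/*
 * Replay records in WAL order, starting from next_oid, and return the
 * resulting OID counter.  Both record kinds assign unconditionally.
 */
uint32_t
replay(const WalRecord *wal, size_t n, uint32_t next_oid)
{
    assert(wal != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
    {
        switch (wal[i].kind)
        {
            case REC_NEXTOID:
                next_oid = wal[i].oid;
                break;
            case REC_CHECKPOINT_ONLINE:
                /* overwrites even if the counter already moved past it */
                next_oid = wal[i].oid;
                break;
        }
    }
    return next_oid;
}

/* The two records from the dump, in LSN order after redo 2/12000120. */
uint32_t
replay_example(void)
{
    const WalRecord wal[] = {
        {REC_NEXTOID, 4302693},             /* lsn 2/12004018 */
        {REC_CHECKPOINT_ONLINE, 4294501},   /* lsn 2/12023C38 */
    };

    /* The starting value is an assumption; it does not affect the result. */
    return replay(wal, sizeof(wal) / sizeof(wal[0]), 4294501);
}
```

In this model replay_example() ends at the checkpoint's value rather than at the later NEXTOID value, i.e. the counter moves backwards over a range that NEXTOID had already advanced past.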

Makes sense?

Regards,

Andres


