Re: [HACKERS] [TRAP: FailedAssertion] causing server to crash

From: Robert Haas
Subject: Re: [HACKERS] [TRAP: FailedAssertion] causing server to crash
Msg-id: CA+TgmoYSg+1PtN_wpqPD8f8Xfepvkf_5PPewc9DRuYGipBWiLg@mail.gmail.com
In reply to: Re: [HACKERS] [TRAP: FailedAssertion] causing server to crash  (Thomas Munro <thomas.munro@enterprisedb.com>)
Responses: Re: [HACKERS] [TRAP: FailedAssertion] causing server to crash  (Thomas Munro <thomas.munro@enterprisedb.com>)
List: pgsql-hackers

On Fri, Jul 21, 2017 at 1:31 AM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> Thanks Neha.  It'd be best to post the backtrace and if possible
> print oldestXact and ShmemVariableCache->oldestXid from the stack
> frame for TruncateCLOG.
>
> The failing assertion in TruncateCLOG() has a comment that says
> "vac_truncate_clog already advanced oldestXid", but vac_truncate_clog
> calls SetTransactionIdLimit() to write ShmemVariableCache->oldestXid
> *after* it calls TruncateCLOG().  What am I missing here?

This problem was introduced by commit
ea42cc18c35381f639d45628d792e790ff39e271, so this should be added to
the PostgreSQL 10 open items list. That commit intended to introduce a
distinction between (1) the oldest XID that can be safely examined and
(2) the oldest XID that can't yet be safely reused.  These are the
same except when we're in the middle of truncating CLOG: (1) advances
before the truncation, and (2) advances afterwards. That's why
AdvanceOldestClogXid() happens before truncation proper and
SetTransactionIdLimit() happens afterwards, and changing the order
would, I think, be quite wrong.
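
To make the ordering concrete, here is a toy sketch in C. It is not the
PostgreSQL source: every name prefixed with toy_ or Toy is hypothetical,
XIDs are modeled as plain integers with no wraparound, and there is no
locking. It only shows that limit (1) moves before the truncation and
limit (2) moves after it, so a check made inside the truncation step that
expects (2) to have advanced already cannot hold.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t ToyXid;        /* toy stand-in; ignores XID wraparound */

/* Hypothetical model of the two limits described above. */
typedef struct ToyXidState
{
    ToyXid oldest_examinable;   /* (1) oldest XID that can be safely examined */
    ToyXid oldest_unreusable;   /* (2) oldest XID that can't yet be reused */
    ToyXid clog_start;          /* first XID still backed by CLOG data */
} ToyXidState;

typedef struct ToyTruncateTrace
{
    bool limits_differed_mid_truncation;
    bool reuse_limit_already_advanced;  /* what the stale comment assumed */
} ToyTruncateTrace;

/* Stand-in for the step that advances limit (1); never moves backwards. */
static void
toy_advance_oldest_clog_xid(ToyXidState *s, ToyXid x)
{
    if (x > s->oldest_examinable)
        s->oldest_examinable = x;
}

/*
 * Stand-in for the truncation proper. Readers must already be fenced off
 * by limit (1). Returns whether limit (2) had already reached x, which is
 * the condition the stale comment takes for granted.
 */
static bool
toy_truncate_clog(ToyXidState *s, ToyXid x)
{
    assert(s->oldest_examinable >= x);
    bool reuse_limit_already_advanced = (s->oldest_unreusable >= x);
    s->clog_start = x;
    return reuse_limit_already_advanced;
}

/* Stand-in for the step that advances limit (2). */
static void
toy_set_transaction_id_limit(ToyXidState *s, ToyXid x)
{
    s->oldest_unreusable = x;
}

/* The ordering described above: (1), then truncate, then (2). */
static ToyTruncateTrace
toy_vac_truncate_clog(ToyXidState *s, ToyXid frozen_xid)
{
    ToyTruncateTrace t;

    toy_advance_oldest_clog_xid(s, frozen_xid);
    t.reuse_limit_already_advanced = toy_truncate_clog(s, frozen_xid);
    t.limits_differed_mid_truncation =
        (s->oldest_examinable != s->oldest_unreusable);
    toy_set_transaction_id_limit(s, frozen_xid);
    return t;
}
```

Starting from a state where all three values are equal and truncating to a
newer XID, the trace reports that the two limits differed in the middle and
that limit (2) had not yet advanced when the truncation ran; afterwards the
two limits are equal again.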

AFAICS, that assertion is simply a holdover from an earlier version of
the patch that escaped review.  There's just no reason to suppose that
it's true.

> What actually prevents ShmemVariableCache->oldestXid from going
> backwards anyway?  Suppose there are two or more autovacuum processes
> that reach vac_truncate_clog() concurrently.  They do a scan of
> pg_database whose tuples they access without locking through a
> pointer-to-volatile because they expect concurrent in-place writers,
> come up with a value for frozenXID, and then arrive at
> SetTransactionIdLimit() in whatever order and clobber
> ShmemVariableCache->oldestXid.  What am I missing here?

Hmm, there could be a bug there, but I don't think it's *this* bug.
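
For reference, the interleaving in the quoted question can be pictured with
a toy sketch like the one below. It is only an illustration under stated
assumptions, not PostgreSQL code: the toy_ names are hypothetical, XIDs are
plain integers with no wraparound, and the "concurrent" workers are simply
replayed in arrival order. It assumes the final store is unconditional;
whether the real code behaves that way is exactly the open question.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t ToyXid;        /* toy stand-in; ignores XID wraparound */

/* Hypothetical unconditional store: the last writer wins. */
static void
toy_set_limit_unconditionally(ToyXid *shared_oldest_xid, ToyXid x)
{
    *shared_oldest_xid = x;
}

/*
 * Replay workers arriving in the given order, each carrying the frozen
 * XID it computed earlier from its own scan. Returns the final value of
 * the shared limit.
 */
static ToyXid
toy_replay_arrivals(ToyXid initial, const ToyXid *arrivals, int n)
{
    ToyXid shared_oldest_xid = initial;

    assert(n >= 0);
    for (int i = 0; i < n; i++)
        toy_set_limit_unconditionally(&shared_oldest_xid, arrivals[i]);
    return shared_oldest_xid;
}
```

If one worker computed 300 and another computed 200 from an older scan, and
the one holding 200 arrives last, the shared value ends at 200 after having
briefly been 300, which is the backwards movement the question describes.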

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


