Re: HeapTupleSatisfiesToast() busted? (was atomic pin/unpin causing errors)

From Robert Haas
Subject Re: HeapTupleSatisfiesToast() busted? (was atomic pin/unpin causing errors)
Date
Msg-id CA+TgmoZ0PzoGMRBU-NOEx8YLEf5LeBF8TXiht5r0zeEVNpeT7g@mail.gmail.com
In reply to Re: HeapTupleSatisfiesToast() busted? (was atomic pin/unpin causing errors)  (Andres Freund <andres@anarazel.de>)
Responses Re: HeapTupleSatisfiesToast() busted? (was atomic pin/unpin causing errors)  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Tue, May 10, 2016 at 3:05 AM, Andres Freund <andres@anarazel.de> wrote:
> The easy way to trigger this problem would be to have an oid wraparound
> - but the WAL shows that that's not the case here.  I've not figured
> that one out entirely (and won't tonight). But I do see WAL records
> like:
> rmgr: XLOG        len (rec/tot):      4/    30, tx:          0, lsn: 2/12004018, prev 2/12003288, desc: NEXTOID 4302693
> rmgr: XLOG        len (rec/tot):      4/    30, tx:          0, lsn: 2/1327EA08, prev 2/1327DC60, desc: NEXTOID 4302693
> i.e. two NEXTOID records allocating the same range, which obviously
> doesn't seem right.  There's also every now and then close by ranges:
> rmgr: XLOG        len (rec/tot):      4/    30, tx:          0, lsn: 1/9A404DB8, prev 1/9A404270, desc: NEXTOID 3311455
> rmgr: XLOG        len (rec/tot):      4/    30, tx:    7814505, lsn: 1/9A4EC888, prev 1/9A4EB9D0, desc: NEXTOID 3311461
>
>
> As far as I can see something like the above, or an oid wraparound, are
> pretty much deadly for toast.
>
> Is anybody ready with a good defense for SatisfiesToast not doing any
> actual liveliness checks?

I assume that this was installed as a performance optimization, and I
don't really see why it shouldn't be safe, or couldn't be made safe.  I
assume that the wraparound case was deemed safe because at that time
the idea of 4 billion OIDs getting used with old transactions still
active seemed inconceivable.  It seems to me that the real question
here is how you're getting two calls to XLogPutNextOid() with the same
value of ShmemVariableCache->nextOid, and the answer, as it seems to
me, must be that LWLocks are broken.  Either two processes are
managing to hold OidGenLock in exclusive mode at the same time, or
they're acquiring it in quick succession but without the second
process seeing all of the updates performed by the first process.
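
A toy model can make this argument concrete. The sketch below is not
PostgreSQL code. It assumes that the allocator writes a NEXTOID record
only when its prefetched range is used up, that the record carries
nextOid plus a prefetch constant, and that the constant is 8192. The
names OidState, get_new_oid, run_with_working_lock and
run_with_stale_view are hypothetical, and OID wraparound is ignored.

```python
from dataclasses import dataclass, replace

VAR_OID_PREFETCH = 8192  # assumed prefetch size, for illustration only


@dataclass
class OidState:
    """Stand-in for the shared nextOid / oidCount pair."""
    next_oid: int
    oid_count: int = 0  # OIDs still covered by the last NEXTOID record


def get_new_oid(state, wal):
    """Allocate one OID; the caller is assumed to hold the lock."""
    if state.oid_count == 0:
        # Log the end of the newly reserved range before handing out OIDs.
        wal.append(state.next_oid + VAR_OID_PREFETCH)
        state.oid_count = VAR_OID_PREFETCH
    oid = state.next_oid
    state.next_oid += 1
    state.oid_count -= 1
    return oid


def run_with_working_lock(start, n):
    """Every allocation sees the previous one's updates."""
    state, wal = OidState(start), []
    for _ in range(n):
        get_new_oid(state, wal)
    return wal


def run_with_stale_view(start):
    """Second process works from a view taken before the first one's updates."""
    shared, wal = OidState(start), []
    stale = replace(shared)   # copy of the state as process B sees it
    get_new_oid(shared, wal)  # process A logs a NEXTOID record
    get_new_oid(stale, wal)   # process B logs the same value again
    return wal
```

With a working lock, successive NEXTOID values in this model are always
VAR_OID_PREFETCH apart, so they can never repeat. Starting
run_with_stale_view at 4294501 (a value picked only so the result lines
up with the dump quoted above) yields two records carrying 4302693,
which is the pattern one would expect if the second process did not see
the first one's updates.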

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


