Re: BUG #16125: Crash of PostgreSQL's wal sender during logicalreplication

Поиск
Список
Период
Сортировка
От Alvaro Herrera
Тема Re: BUG #16125: Crash of PostgreSQL's wal sender during logicalreplication
Дата
Msg-id 20191127142204.GA13686@alvherre.pgsql
обсуждение исходный текст
Ответ на Re: BUG #16125: Crash of PostgreSQL's wal sender during logicalreplication  (Andres Freund <andres@anarazel.de>)
Список pgsql-bugs
On 2019-Nov-19, Andres Freund wrote:

> > > 2525:rmgr: Heap        len (rec/tot):   1671/  1671, tx: 1667601530, lsn: 25EE/D66F5508, prev 25EE/D66F54C8,
desc:INSERT off 2, blkref #0: rel 1663/13018/88964792 blk 18475
 
> > > 2532:rmgr: Heap        len (rec/tot):     54/    54, tx: 1667601530, lsn: 25EE/D66F7EF0, prev 25EE/D66F7EA0,
desc:DELETE off 2 , blkref #0: rel 1663/13018/88964792 blk 18475
 

> Hm. I don't think it necessarily means that. compute_new_xmax_infomask(), when
> called from heap_delete(), will e.g. set HEAP_KEYS_UPDATED when
> 
>     if (old_infomask & HEAP_XMAX_INVALID)
> 
> is true. Which is the most common case. I actually find it *more*
> curious to *not* see that for a delete. I've not traced fully through
> it, yet.
> 
> 
> Alvaro, it's not clear to me whether the the multixact codepaths in
> compute_new_xmax_infomask() actually work as expected for deletes.

Hmm ... I think you're complaining about failing to set the
HEAP_KEYS_UPDATED bit when the delete transaction is the same
as the insert transaction.

Grepping the new waldump file the OP sent for relfilenode 88964792
I saw this:

rmgr: Heap        len (rec/tot):   1671/  1671, tx: 1667601529, lsn: 25EE/D66F1A78, prev 25EE/D66F1A38, desc: INSERT
off5, blkref #0: rel 1663/13018/88964792 blk 18460
 
rmgr: Heap        len (rec/tot):   1671/  1671, tx: 1667601530, lsn: 25EE/D66F5508, prev 25EE/D66F54C8, desc: INSERT
off2, blkref #0: rel 1663/13018/88964792 blk 18475
 
rmgr: Heap        len (rec/tot):     48/    48, tx: 1667601529, lsn: 25EE/D66F7E10, prev 25EE/D66F6778, desc:
HEAP_CONFIRMoff 5, blkref #0: rel 1663/13018/88964792 blk 18460
 
rmgr: Heap        len (rec/tot):     54/    54, tx: 1667601530, lsn: 25EE/D66F7EF0, prev 25EE/D66F7EA0, desc: DELETE
off2 , blkref #0: rel 1663/13018/88964792 blk 18475
 
rmgr: Heap        len (rec/tot):     54/    54, tx: 1667601536, lsn: 25EE/D6BA3BC0, prev 25EE/D6BA3B70, desc: LOCK off
5:xid 1667601536: flags 0 LOCK_ONLY EXCL_LOCK , blkref #0: rel 1663/13018/88964792 blk 20806
 
rmgr: Heap        len (rec/tot):   3955/  3955, tx: 1667601536, lsn: 25EE/D6BA3D00, prev 25EE/D6BA3CB0, desc: UPDATE
off5 xmax 1667601536 ; new off 2 xmax 0, blkref #0: rel 1663/13018/88964792 blk 18491, blkref #1: rel
1663/13018/88964792blk 20806
 
rmgr: Heap        len (rec/tot):     54/    54, tx: 1667601538, lsn: 25EE/D6DE53D0, prev 25EE/D6DE33B0, desc: LOCK off
5:xid 1667601538: flags 0 LOCK_ONLY EXCL_LOCK , blkref #0: rel 1663/13018/88964792 blk 18460
 
rmgr: Heap        len (rec/tot):   6563/  6563, tx: 1667601538, lsn: 25EE/D6DE5408, prev 25EE/D6DE53D0, desc: UPDATE
off5 xmax 1667601538 ; new off 3 xmax 0, blkref #0: rel 1663/13018/88964792 blk 18499, blkref #1: rel
1663/13018/88964792blk 18460
 

I wonder about the HEAP_CONFIRM there ... it's in a completely different
transaction of course, so it's not related, but note that the
corresponding insert (25EE/D66F1A78) looks like a regular INSERT, it
doesn't indicate that it's a speculative insert, which it must have
been or there wouldn't be a confirm.  So I wonder if our insert just
below is *also* a speculative insert ... which perhaps could cause the
compute_new_xmax_infomask code to misbehave.

I think pg_waldump should print out the speculativeness of an insertion.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



В списке pgsql-bugs по дате отправления:

Предыдущее
От: Etsuro Fujita
Дата:
Сообщение: Re: BUG #16139: Assertion fails on INSERT into a postgres_fdw' tablewith two AFTER INSERT triggers
Следующее
От: Alvaro Herrera
Дата:
Сообщение: Re: BUG #16125: Crash of PostgreSQL's wal sender during logicalreplication