Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

From: Andres Freund
Subject: Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations
Msg-id: 20220220030128.sgytb3wccteb3opj@alap3.anarazel.de
In reply to: Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations  (Peter Geoghegan <pg@bowt.ie>)
Replies: Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations  (Peter Geoghegan <pg@bowt.ie>)
Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations  (Peter Geoghegan <pg@bowt.ie>)
List: pgsql-hackers
Hi,

On 2022-02-19 18:16:54 -0800, Peter Geoghegan wrote:
> On Sat, Feb 19, 2022 at 5:54 PM Andres Freund <andres@anarazel.de> wrote:
> > How does that cause the endless loop?
> 
> Attached is the page image itself, dumped via gdb (and gzip'd). This
> was on recent HEAD (commit 8f388f6f, actually), plus
> 0001-Add-adversarial-ConditionalLockBuff[...]. No other changes. No
> defragmenting in pg_surgery, nothing like that.

> > It doesn't do so on HEAD + 0001-Add-adversarial-ConditionalLockBuff[...] for
> > me. So something needs to have changed with your patch?
> 
> It doesn't always happen -- only about half the time on my machine.
> Maybe it's timing sensitive?

Ah, I'd only run the tests three times or so, without it happening. Trying a
few more times repro'd it.


It's kind of surprising that it takes
0001-Add-adversarial-ConditionalLockBuff to break this. I suspect it's a question
of hint bits changing due to lazy_scan_noprune(), which then makes
HeapTupleHeaderIsHotUpdated() return a different value, preventing the
"If the tuple is DEAD and doesn't chain to anything else"
path from being taken.
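
To illustrate how a hint bit alone can flip that result, here is a minimal
standalone sketch. It is not PostgreSQL code: the Toy* names and the bit values
are made up for illustration, and the shape of the predicate (HOT-updated flag
set, xmax not hinted invalid, xmin not hinted invalid) follows my recollection
of the macro in htup_details.h, so treat it as an assumption. Which hint bit is
actually involved in the failure above is not established here either.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative flag bits only; the real definitions are in htup_details.h. */
#define TOY_XMIN_COMMITTED 0x0100
#define TOY_XMIN_INVALID   0x0200
#define TOY_XMAX_INVALID   0x0800
#define TOY_HOT_UPDATED    0x4000

/* Hypothetical stand-in for the two infomask fields of a heap tuple header. */
typedef struct ToyTupleHeader
{
    uint16_t infomask;   /* visibility hint bits */
    uint16_t infomask2;  /* HOT-related flags */
} ToyTupleHeader;

/* True only if the hints say "xmin invalid" and not "xmin committed". */
static bool
toy_xmin_invalid(const ToyTupleHeader *tup)
{
    assert(tup != NULL);
    return (tup->infomask & (TOY_XMIN_COMMITTED | TOY_XMIN_INVALID))
        == TOY_XMIN_INVALID;
}

/*
 * Assumed shape of the "is HOT updated" test: the flag must be set, and no
 * hint bit may already say that the updater or the inserter is invalid.
 */
static bool
toy_is_hot_updated(const ToyTupleHeader *tup)
{
    assert(tup != NULL);
    return (tup->infomask2 & TOY_HOT_UPDATED) != 0 &&
        (tup->infomask & TOY_XMAX_INVALID) == 0 &&
        !toy_xmin_invalid(tup);
}
```

With only TOY_HOT_UPDATED set, toy_is_hot_updated() returns true. Once some
visibility check additionally sets TOY_XMIN_INVALID (or TOY_XMAX_INVALID) in
infomask, the same tuple reports false, without any change to the tuple's
actual contents.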


> We hit the "goto retry" on offnum 2, which is the first tuple with
> storage (you can see "the ghost" of the tuple from the LP_DEAD item at
> offnum 1, since the page isn't defragmented in pg_surgery). I think
> that this happens because the heap-only tuple at offnum 2 is fully
> DEAD to lazy_scan_prune, but hasn't been recognized as such by
> heap_page_prune. There is no way that they'll ever "agree" on the
> tuple being DEAD right now, because pruning still doesn't assume that
> an orphaned heap-only tuple is fully DEAD.

> We can either do that, or we can throw an error concerning corruption
> when heap_page_prune notices orphaned tuples. Neither seems
> particularly appealing. But it definitely makes no sense to allow
> lazy_scan_prune to spin in a futile attempt to reach agreement with
> heap_page_prune about a DEAD tuple really being DEAD.

Yea, this sucks. I think we should go for the rewrite of the
heap_prune_chain() logic. The current approach is just never going to be
robust.
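
The disagreement described in the quoted text can be modelled with a toy, for
what it's worth. This is a deliberately simplified sketch, not the vacuumlazy.c
or pruneheap.c logic: all Toy*/toy_* names are hypothetical, the
treat_orphans_as_dead switch stands in for the option of having pruning assume
that orphaned heap-only tuples are DEAD, and the max_retries bound exists only
so the sketch terminates (the behaviour discussed above is an unbounded
"goto retry").

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { TOY_LIVE, TOY_DEAD } ToyStatus;

/* Hypothetical, heavily simplified view of one item on a heap page. */
typedef struct ToyItem
{
    bool      has_storage; /* false once pruning has removed the tuple */
    bool      heap_only;   /* tuple is only reachable through a HOT chain */
    bool      reachable;   /* some root item's chain still leads here */
    ToyStatus status;      /* what the scan's visibility check reports */
} ToyItem;

/*
 * Toy pruning pass: removes DEAD tuples it visits. A heap-only tuple that no
 * chain leads to is skipped unless treat_orphans_as_dead is set.
 */
static void
toy_prune(ToyItem *items, size_t n, bool treat_orphans_as_dead)
{
    assert(items != NULL);
    for (size_t i = 0; i < n; i++)
    {
        bool visited = !items[i].heap_only || items[i].reachable;

        if (items[i].has_storage && items[i].status == TOY_DEAD &&
            (visited || treat_orphans_as_dead))
            items[i].has_storage = false;
    }
}

/*
 * Toy scan: prune, then re-check every remaining tuple. Finding a DEAD tuple
 * after pruning triggers a retry. Returns the number of prune passes needed
 * to reach agreement, or -1 if max_retries passes were not enough.
 */
static int
toy_scan_prune(ToyItem *items, size_t n, bool treat_orphans_as_dead,
               int max_retries)
{
    for (int attempt = 1; attempt <= max_retries; attempt++)
    {
        bool disagree = false;

        toy_prune(items, n, treat_orphans_as_dead);
        for (size_t i = 0; i < n; i++)
        {
            if (items[i].has_storage && items[i].status == TOY_DEAD)
            {
                disagree = true; /* the real code would "goto retry" here */
                break;
            }
        }
        if (!disagree)
            return attempt;
    }
    return -1; /* no agreement: pruning can never remove the tuple */
}
```

For a single orphaned, DEAD heap-only item, toy_scan_prune() with
treat_orphans_as_dead = false runs out of retries and returns -1, while the
same page with treat_orphans_as_dead = true converges on the first pass. In
this toy at least, retrying cannot help unless the pruning side changes what
it is willing to remove.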

Greetings,

Andres Freund


