Re: [BUGS] Old row version in hot chain become visible after a freeze

From: Alvaro Herrera
Subject: Re: [BUGS] Old row version in hot chain become visible after a freeze
Msg-id: 20170906221217.fnmoipi5zi5w3yrl@alvherre.pgsql
In response to: Re: [BUGS] Old row version in hot chain become visible after a freeze  (Michael Paquier <michael.paquier@gmail.com>)
Responses: Re: [BUGS] Old row version in hot chain become visible after a freeze  ("Wong, Yi Wen" <yiwong@amazon.com>)
List: pgsql-bugs

Michael Paquier wrote:

> frame #4: 0x00000001098fba6b postgres`FreezeMultiXactId(multi=34,
> t_infomask=4930, cutoff_xid=897, cutoff_multi=30,
> flags=0x00007fff56372fae) + 1179 at heapam.c:6532
>    6529                 * Since the tuple wasn't marked HEAPTUPLE_DEAD by vacuum, the
>    6530                 * update Xid cannot possibly be older than the xid cutoff.
>    6531                 */
> -> 6532                Assert(!TransactionIdIsValid(update_xid) ||
>    6533                       !TransactionIdPrecedes(update_xid, cutoff_xid));
>    6534
>    6535                /*
> (lldb) p update_xid
> (TransactionId) $0 = 896
> (lldb) p cutoff_xid
> (TransactionId) $1 = 897
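
For reference, here is a minimal standalone sketch of why the assertion in
the quoted backtrace fires for these values.  It mirrors the usual
circular (modulo-2^32) Xid comparison; the helper names xid_is_valid,
xid_is_normal, xid_precedes and freeze_assert_holds are made up for this
example and are not the PostgreSQL functions themselves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

#define InvalidTransactionId      ((TransactionId) 0)
#define FirstNormalTransactionId  ((TransactionId) 3)

/* Hypothetical stand-ins for the Xid test macros. */
static bool xid_is_valid(TransactionId xid)  { return xid != InvalidTransactionId; }
static bool xid_is_normal(TransactionId xid) { return xid >= FirstNormalTransactionId; }

/* Circular comparison: "a precedes b" if a is logically older than b. */
static bool
xid_precedes(TransactionId a, TransactionId b)
{
    /* Special (non-normal) Xids compare as plain unsigned integers. */
    if (!xid_is_normal(a) || !xid_is_normal(b))
        return a < b;
    return (int32_t) (a - b) < 0;
}

/* The condition tested by the Assert shown in the backtrace. */
static bool
freeze_assert_holds(TransactionId update_xid, TransactionId cutoff_xid)
{
    return !xid_is_valid(update_xid) || !xid_precedes(update_xid, cutoff_xid);
}

/* Values printed in the lldb session: update_xid = 896, cutoff_xid = 897. */
static void
check_quoted_values(void)
{
    assert(!freeze_assert_holds(896, 897));
}
```

With update_xid = 896 and cutoff_xid = 897 the update Xid is valid and
precedes the cutoff, so the asserted condition is false.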

So, looking at this closely, I think there is a bigger problem here: if
we use any of the proposed patches or approaches, we risk leaving an old
Xid in a tuple (because we skip heap_prepare_freeze_tuple on a tuple
that remains in the heap with live Xids).  A later truncation of
pg_multixact / pg_clog could then remove a segment that is critical to
resolving that tuple's status.

I think doing the tuple freezing dance for any tuple we don't remove
from the heap is critical, not optional.  Maybe a later HOT pruning
would save you from actual disaster, but I think it'd be a bad idea to
rely on that.

So ISTM we need a different solution than what's been proposed so far;
and I think that solution is different for each of the possible problem
cases, which are two: HEAPTUPLE_DEAD and HEAPTUPLE_RECENTLY_DEAD.

I think we can cover the HEAPTUPLE_DEAD case by just redoing the
heap_page_prune (just add a "goto" back to it if we detect the case).
It's a bit wasteful because we'd re-process all the prior tuples in the
loop below, but since it's supposed to be an infrequent condition, I
think it should be okay.
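
To make the "goto" idea concrete, here is a toy model of the control flow
only.  Nothing in it is PostgreSQL code: ToyPage, toy_page_prune and
toy_scan_page are invented for illustration, and the "turns_dead" field
merely simulates a tuple whose status changes to DEAD between the prune
and the per-tuple check.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_TUPLES 8

typedef enum { TOY_LIVE, TOY_RECENTLY_DEAD, TOY_DEAD, TOY_REMOVED } ToyStatus;

typedef struct
{
    ToyStatus status[TOY_MAX_TUPLES];
    size_t    ntuples;
    int       turns_dead;   /* index that becomes DEAD after the first prune, or -1 */
} ToyPage;

/* Stand-in for the prune step: drop every tuple currently known DEAD. */
static void
toy_page_prune(ToyPage *page)
{
    for (size_t i = 0; i < page->ntuples; i++)
        if (page->status[i] == TOY_DEAD)
            page->status[i] = TOY_REMOVED;
}

/* Returns the number of prune passes; *nfrozen counts tuples sent to freezing. */
static int
toy_scan_page(ToyPage *page, int *nfrozen)
{
    int passes = 0;

    assert(page->ntuples <= TOY_MAX_TUPLES);

retry:
    toy_page_prune(page);
    passes++;
    *nfrozen = 0;               /* work done on earlier tuples is redone */

    for (size_t i = 0; i < page->ntuples; i++)
    {
        if (page->status[i] == TOY_REMOVED)
            continue;

        /* Simulate the race: the status re-check now reports DEAD. */
        if ((int) i == page->turns_dead)
        {
            page->status[i] = TOY_DEAD;
            page->turns_dead = -1;      /* happens only once */
        }

        if (page->status[i] == TOY_DEAD)
            goto retry;         /* prune again instead of skipping the freeze */

        (*nfrozen)++;           /* every tuple left on the page gets frozen */
    }
    return passes;
}
```

In this model no tuple that stays on the page ever skips the freeze step;
the cost is that the tuples before the offending one are processed twice.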

The RECENTLY_DEAD case is interesting.  We know that the updater is
committed, and since the update XID is older than the cutoff XID, we
know nobody else can see the tuple.  So we can simply remove it ...
and we already have a mechanism for that: return FRM_MARK_COMMITTED in
FreezeMultiXactId.  But the code already does that!  The only thing we
need in order for this to be handled correctly is to remove the assert.
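
A rough model of that reasoning, under loud assumptions: this is not the
heapam.c code, the TOY_FRM_* flag names and values are illustrative only,
and the model covers just a multixact whose lockers are all gone, with an
aborted or absent updater represented as update_xid = 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t ToyXid;

/* Illustrative flags; names and values are assumptions, not heapam.c's. */
#define TOY_FRM_INVALIDATE_XMAX  0x01u
#define TOY_FRM_RETURN_IS_XID    0x02u
#define TOY_FRM_MARK_COMMITTED   0x04u

typedef struct
{
    ToyXid update_xid;          /* 0: no updater survives in the multi */
    bool   update_committed;
} ToyMulti;

/* Simplified circular comparison (special Xids are ignored here). */
static bool
toy_precedes(ToyXid a, ToyXid b)
{
    return (int32_t) (a - b) < 0;
}

/* The condition the existing assertion checks. */
static bool
toy_assert_would_hold(const ToyMulti *m, ToyXid cutoff_xid)
{
    return m->update_xid == 0 || !toy_precedes(m->update_xid, cutoff_xid);
}

/* Decision made once the assertion is out of the way. */
static ToyXid
toy_freeze_multi(const ToyMulti *m, unsigned *flags)
{
    assert(m != NULL && flags != NULL);

    if (m->update_xid == 0)
    {
        *flags = TOY_FRM_INVALIDATE_XMAX;   /* nothing worth keeping in xmax */
        return 0;
    }

    /* Replace the multi with the plain update Xid ... */
    *flags = TOY_FRM_RETURN_IS_XID;
    /* ... and record that the updater is known committed. */
    if (m->update_committed)
        *flags |= TOY_FRM_MARK_COMMITTED;
    return m->update_xid;
}
```

For the values in the backtrace (committed updater 896, cutoff 897) the
assertion condition is false, yet the decision function still yields the
update Xid together with the mark-committed flag, which is the outcome
described above.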

A case I didn't think about yet is RECENTLY_DEAD if the xmax is a plain
Xid (not a multi).  My vague feeling is that there is no bug here.

I haven't actually tested this.  Planning to look into it tomorrow.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

