Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

Поиск
Список
Период
Сортировка
От Michael Paquier
Тема Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
Дата
Msg-id CAB7nPqQT6hSvL0Tn64QBNXpABcOF1bsGxW4WN+TRZu0GH5PZmA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple  (Peter Geoghegan <pg@bowt.ie>)
Ответы Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updatedtuple  ("Wood, Dan" <hexpert@amazon.com>)
Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple  (Peter Geoghegan <pg@bowt.ie>)
Список pgsql-hackers
On Wed, Oct 4, 2017 at 8:10 AM, Peter Geoghegan <pg@bowt.ie> wrote:
> I now think that it actually is a VACUUM problem, specifically a
> problem with VACUUM pruning. You see the HOT xmin-to-xmax check
> pattern that you mentioned within heap_prune_chain(), which looks like
> where the incorrect tuple prune (or possibly, at times, redirect?)
> takes place. (I refer to the prune/kill that you mentioned today, that
> frustrated your first attempt at a fix -- "I modified the multixact
> freeze code...".)

My lookup of the problem converges to the same conclusion. Something
is wrong with the vacuum's pruning. I have spent some time trying to
design a patch, all the solutions I tried have proved to make the
problem harder to show up, but it still showed up, sometimes after
dozens of attempts.

> The attached patch "fixes" the problem -- I cannot get amcheck to
> complain about corruption with this applied. And, "make check-world"
> passes. Hopefully it goes without saying that this isn't actually my
> proposed fix. It tells us something that this at least *masks* the
> problem, though; it's a start.

Yep.

> FYI, the repro case page contents looks like this with the patch applied:
> postgres=# select lp, lp_flags, t_xmin, t_xmax, t_ctid,
> to_hex(t_infomask) as infomask,
> to_hex(t_infomask2) as infomask2
> from heap_page_items(get_raw_page('t', 0));
>  lp | lp_flags | t_xmin  | t_xmax | t_ctid | infomask | infomask2
> ----+----------+---------+--------+--------+----------+-----------
>   1 |        1 | 1845995 |      0 | (0,1)  | b02      | 3
>   2 |        2 |         |        |        |          |
>   3 |        0 |         |        |        |          |
>   4 |        0 |         |        |        |          |
>   5 |        0 |         |        |        |          |
>   6 |        0 |         |        |        |          |
>   7 |        1 | 1846001 |      0 | (0,7)  | 2b02     | 8003
> (7 rows)

Is lp_off for tid (0,2) pointing to (0,7)? A hot chain preserved is
what would look correct to me.

-         * Check the tuple XMIN against prior XMAX, if any
-         */
-        if (TransactionIdIsValid(priorXmax) &&
-            !TransactionIdEquals(HeapTupleHeaderGetXmin(htup), priorXmax))
-            break;
If you remove this check, you could also remove completely priorXmax.

Actually, I may be missing something, but why is priorXmax updated
even for dead tuples? For example just doing that is also taking care
of the problem:
@@ -558,7 +558,8 @@ heap_prune_chain(Relation relation, Buffer buffer,
OffsetNumber rootoffnum,       Assert(ItemPointerGetBlockNumber(&htup->t_ctid) ==
BufferGetBlockNumber(buffer));      offnum = ItemPointerGetOffsetNumber(&htup->t_ctid);
 
-       priorXmax = HeapTupleHeaderGetUpdateXid(htup);
+       if (!tupdead)
+           priorXmax = HeapTupleHeaderGetUpdateXid(htup);   }
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Masahiko Sawada
Дата:
Сообщение: Re: [HACKERS] Issues with logical replication
Следующее
От: "Bossart, Nathan"
Дата:
Сообщение: Re: [HACKERS] [Proposal] Allow users to specify multiple tables inVACUUM commands