Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updatedtuple

Поиск
Список
Период
Сортировка
От Wood, Dan
Тема Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updatedtuple
Дата
Msg-id 10CE2AE7-B1AC-4E81-BF00-E9EF4BA855EE@amazon.com
обсуждение исходный текст
Ответ на Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple  (Michael Paquier <michael.paquier@gmail.com>)
Ответы Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple  (Peter Geoghegan <pg@bowt.ie>)
Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updatedtuple  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Список pgsql-hackers
I’ve just started looking at this again after a few weeks break.

There is a tangled web of issues here.  With the community fix we get a corrupted page(invalid redirect ptr from
indexeditem).  The cause of that is:
 
pruneheap.c:
                 /*                  * Check the tuple XMIN against prior XMAX, if any                  */
  if (TransactionIdIsValid(priorXmax) &&                         !TransactionIdEquals(HeapTupleHeaderGetXmin(htup),
priorXmax))                        break;
 
chainitems[nchain++] = offnum;

The priorXmax is a multixact key share lock, thus not equal to xmin.  This terminates the scan.  Unlike the scan
terminationwith “if (!recent_dead) break;” below the switch (...SatisiesVacuum…), this “break” does not put the offnum
intothe chain even though it is in the chain.  If the first not-deleted item isn’t put in the chain then we’ll not call
heap_prune_record_redirect().

I do not know what the above ‘if’ test is protecting.  Normally the xmin is equal to the priorXmax.  Other than
protectingagainst corruption a key share lock can cause this.  So, I tried a fix which does the “if” check after adding
itto chainitems.  This will break whatever real situation this IF was protecting.  Tom Lane put this in.
 

OK:  With that hack of a fix the redirect now works correctly.  However, we still get the reindex problem with not
findingthe parent.  That problem is caused by:
 
Pruneheap.c:heap_get_root_tuples()
    if (TransactionIdIsValid(priorXmax) &&        !TransactionIdEquals(priorXmax, HeapTupleHeaderGetXmin(htup)))
break;

In this case, instead of these not being equal because of a multixact key share lock, it is because XMIN is frozen and
FrozenTransactionIddoesn’t equal the priorXmax.  Thus, we don’t build the entire chain from root to most current row
versionand this causes the reindex failure.
 

If we disable this ‘if’ as a hack then we no longer get a problem on the reindex.  However, YiWen reported that at the
endof an install check out index checking reported corruption in the system catalogues.  So we are still looking.
 

We need to understand why these TXID equal checks exist.  Can we differentiate the cases they are protecting against
withthe two exceptions I’ve found?
 

FYI, someone should look at the same ”if”  test in heapam.c: heap_lock_updated_tuple_rec().  Also, I hope there are no
strangeissues with concurrent index builds.
 

Finally, the idea behind the original fix was to simply NOT to do an unsupported freeze on a dead tuple.  It had two
drawbacks:
1) CLOG truncation.  This could have been handled by keeping track of the old unpruned item found and using that to
updatethe table’s/DB’s freeze xid.
 
2) Not making freeze progress.   The only reason the prune would fail should be because of an open TXN.  Unless that
TXNwas so old such that it’s XID was as old as the ?min freeze threshold? then we would make progress.  If we were
doingTXN’s that old then we’d be having problems anyway.
 


On 10/3/17, 5:15 PM, "Michael Paquier" <michael.paquier@gmail.com> wrote:
   On Wed, Oct 4, 2017 at 8:10 AM, Peter Geoghegan <pg@bowt.ie> wrote:   > I now think that it actually is a VACUUM
problem,specifically a   > problem with VACUUM pruning. You see the HOT xmin-to-xmax check   > pattern that you
mentionedwithin heap_prune_chain(), which looks like   > where the incorrect tuple prune (or possibly, at times,
redirect?)  > takes place. (I refer to the prune/kill that you mentioned today, that   > frustrated your first attempt
ata fix -- "I modified the multixact   > freeze code...".)      My lookup of the problem converges to the same
conclusion.Something   is wrong with the vacuum's pruning. I have spent some time trying to   design a patch, all the
solutionsI tried have proved to make the   problem harder to show up, but it still showed up, sometimes after   dozens
ofattempts.      > The attached patch "fixes" the problem -- I cannot get amcheck to   > complain about corruption with
thisapplied. And, "make check-world"   > passes. Hopefully it goes without saying that this isn't actually my   >
proposedfix. It tells us something that this at least *masks* the   > problem, though; it's a start.      Yep.      >
FYI,the repro case page contents looks like this with the patch applied:   > postgres=# select lp, lp_flags, t_xmin,
t_xmax,t_ctid,   > to_hex(t_infomask) as infomask,   > to_hex(t_infomask2) as infomask2   > from
heap_page_items(get_raw_page('t',0));   >  lp | lp_flags | t_xmin  | t_xmax | t_ctid | infomask | infomask2   >
----+----------+---------+--------+--------+----------+-----------  >   1 |        1 | 1845995 |      0 | (0,1)  | b02
   | 3   >   2 |        2 |         |        |        |          |   >   3 |        0 |         |        |        |
    |   >   4 |        0 |         |        |        |          |   >   5 |        0 |         |        |        |
   |   >   6 |        0 |         |        |        |          |   >   7 |        1 | 1846001 |      0 | (0,7)  | 2b02
  | 8003   > (7 rows)      Is lp_off for tid (0,2) pointing to (0,7)? A hot chain preserved is   what would look
correctto me.      -         * Check the tuple XMIN against prior XMAX, if any   -         */   -        if
(TransactionIdIsValid(priorXmax)&&   -            !TransactionIdEquals(HeapTupleHeaderGetXmin(htup), priorXmax))   -
       break;   If you remove this check, you could also remove completely priorXmax.      Actually, I may be missing
something,but why is priorXmax updated   even for dead tuples? For example just doing that is also taking care   of the
problem:  @@ -558,7 +558,8 @@ heap_prune_chain(Relation relation, Buffer buffer,   OffsetNumber rootoffnum,
Assert(ItemPointerGetBlockNumber(&htup->t_ctid)==                  BufferGetBlockNumber(buffer));           offnum =
ItemPointerGetOffsetNumber(&htup->t_ctid);  -       priorXmax = HeapTupleHeaderGetUpdateXid(htup);   +       if
(!tupdead)  +           priorXmax = HeapTupleHeaderGetUpdateXid(htup);       }   --    Michael      
 


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: [HACKERS] [Proposal] Allow users to specify multiple tables inVACUUM commands
Следующее
От: Peter Geoghegan
Дата:
Сообщение: Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple