Re: [BUGS] Old row version in hot chain become visible after a freeze

Поиск
Список
Период
Сортировка
От Michael Paquier
Тема Re: [BUGS] Old row version in hot chain become visible after a freeze
Дата
Msg-id CAB7nPqTf70_QF7JnJZ=trZLamrYrm3soHbMUFLFQduDo39VBZQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [BUGS] Old row version in hot chain become visible after a freeze  (Michael Paquier <michael.paquier@gmail.com>)
Ответы Re: [BUGS] Old row version in hot chain become visible after a freeze  (Michael Paquier <michael.paquier@gmail.com>)
Re: [BUGS] Old row version in hot chain become visible after a freeze  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Список pgsql-bugs
On Fri, Sep 1, 2017 at 12:46 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Fri, Sep 1, 2017 at 12:25 PM, Peter Geoghegan <pg@bowt.ie> wrote:
>> On Thu, Aug 31, 2017 at 3:36 PM, Wood, Dan <hexpert@amazon.com> wrote:
>>> I’ve found a bug in Postgres which causes old row versions to appear in a
>>> table.  DEAD rows in a hot chain are getting frozen and becoming visible.
>>> I’ve repro’d this in both 9.6.1 and 11-devel.
>>
>> I can confirm that this goes all the way back to 9.3, where "for key
>> share"/foreign key locks were introduced.
>
> Hm. That looks pretty bad to me... It is bad luck that this report
> happens just after the last round of minor releases has been out, and
> we would have needed at least a couple of days and a couple of pairs
> of eyes to come up with a correct patch. (I haven't looked at the
> proposed solution and the attached patch yet, so no opinions yet).

Using a build where assertions are enabled, the provided test case
blows up before even returning results:   frame #3: 0x0000000109e8df90
postgres`ExceptionalCondition(conditionName="!(!((update_xid) !=
((TransactionId) 0)) || !TransactionIdPrecedes(update_xid,
cutoff_xid))", errorType="FailedAssertion", fileName="heapam.c",
lineNumber=6533) + 128 at assert.c:54   frame #4: 0x00000001098fba6b postgres`FreezeMultiXactId(multi=34,
t_infomask=4930, cutoff_xid=897, cutoff_multi=30,
flags=0x00007fff56372fae) + 1179 at heapam.c:6532   frame #5: 0x00000001098fb2cd
postgres`heap_prepare_freeze_tuple(tuple=0x000000010b04d0b0,
cutoff_xid=897, cutoff_multi=30, frz=0x00007fae0d800040,
totally_frozen_p="\x01\x02") + 285 at heapam.c:6673   frame #6: 0x0000000109af356d
postgres`lazy_scan_heap(onerel=0x000000010a6fe630, options=9,
vacrelstats=0x00007fae0a0580a8, Irel=0x00007fae0a058688, nindexes=1,
aggressive='\x01') + 4413 at vacuumlazy.c:1081   frame #7: 0x0000000109af1ef2
postgres`lazy_vacuum_rel(onerel=0x000000010a6fe630, options=9,
params=0x00007fff56373608, bstrategy=0x00007fae0a047c40) + 626 at
vacuumlazy.c:253
(lldb) up 1
frame #4: 0x00000001098fba6b postgres`FreezeMultiXactId(multi=34,
t_infomask=4930, cutoff_xid=897, cutoff_multi=30,
flags=0x00007fff56372fae) + 1179 at heapam.c:6532  6529                 * Since the tuple wasn't marked HEAPTUPLE_DEAD
by vacuum, the  6530                 * update Xid cannot possibly be older than the
xid cutoff.  6531                 */
-> 6532                Assert(!TransactionIdIsValid(update_xid) ||  6533
!TransactionIdPrecedes(update_xid,cutoff_xid));  6534  6535                /* 
(lldb) p update_xid
(TransactionId) $0 = 896
(lldb) p cutoff_xid
(TransactionId) $1 = 897

As the portion doing vacuum freeze is the one blowing up, I think that
it is possible to extract an isolation test and include it in the
patch with repro.sql as the initialization phase.
                /*
-                 * Each non-removable tuple must be checked to see if it needs
+                 * Each non-removable tuple that we do not keep must
be checked to see if it needs                 * freezing.  Note we already have exclusive buffer lock.
*/
-                if (heap_prepare_freeze_tuple(tuple.t_data, FreezeLimit,
+                if (!tupkeep)
+                {
+                    if (heap_prepare_freeze_tuple(tuple.t_data, FreezeLimit,
  MultiXactCutoff, 
&frozen[nfrozen],                                              &tuple_totally_frozen))
-                    frozen[nfrozen++].offset = offnum;
+                        frozen[nfrozen++].offset = offnum;

Wouldn't it be better to check if freeze is needed for the tuple
scanned with heap_tuple_needs_freeze() instead of inventing a new
custom condition? Please note as well that heap_tuple_needs_freeze()
does not need its last "buf" argument.
--
Michael


--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Thom Brown
Дата:
Сообщение: Re: [BUGS] Can't read oprcode from remote pg_operator
Следующее
От: hubert depesz lubaczewski
Дата:
Сообщение: Re: [BUGS] BUG #14797: It's not safe to use MD5