heap_delete, heap_mark4update must reset t_ctid

From
Tom Lane
Date:
I have been looking at an example of the "no one parent tuple found"
VACUUM error provided by Mario Weilguni.  It appears to me that VACUUM
is getting confused by a tuple that looks like so in pg_filedump:

 Item   4 -- Length:  249  Offset: 31616 (0x7b80)  Flags: USED
  OID: 0  CID: min(240) max(18)  XID: min(5691267) max(6484551)
  Block Id: 1  linp Index: 1   Attributes: 38   Size: 40
  infomask: 0x3503 (HASNULL|HASVARLENA|XMIN_COMMITTED|XMAX_COMMITTED|MARKED_FOR_UPDATE|UPDATED)

Notice that the t_ctid field is not pointing to this tuple, but to a
different item on the same page (which in fact is an unused item).
This causes VACUUM to believe that the tuple is part of an update chain.
But in point of fact it is not part of a chain (indeed there are *no*
chains in the test relation, thus leading to the observed failure).
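
As a cross-check on the dump, the infomask value can be decoded by hand. The following is a minimal sketch, not pg_filedump's actual code: the bit values are assumed from the 7.2-era htup.h, and `decode_infomask` and `flag_names` are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Bit values assumed from the 7.2-era htup.h; verify against your tree. */
#define HEAP_HASNULL            0x0001
#define HEAP_HASVARLENA         0x0002
#define HEAP_XMIN_COMMITTED     0x0100
#define HEAP_XMIN_INVALID       0x0200
#define HEAP_XMAX_COMMITTED     0x0400
#define HEAP_XMAX_INVALID       0x0800
#define HEAP_MARKED_FOR_UPDATE  0x1000
#define HEAP_UPDATED            0x2000

/* Hypothetical lookup table, ordered from the lowest bit to the highest. */
struct flag_name
{
    uint16_t    bit;
    const char *name;
};

const struct flag_name flag_names[] = {
    {HEAP_HASNULL, "HASNULL"},
    {HEAP_HASVARLENA, "HASVARLENA"},
    {HEAP_XMIN_COMMITTED, "XMIN_COMMITTED"},
    {HEAP_XMIN_INVALID, "XMIN_INVALID"},
    {HEAP_XMAX_COMMITTED, "XMAX_COMMITTED"},
    {HEAP_XMAX_INVALID, "XMAX_INVALID"},
    {HEAP_MARKED_FOR_UPDATE, "MARKED_FOR_UPDATE"},
    {HEAP_UPDATED, "UPDATED"},
};

/*
 * Write the names of all set bits into buf, separated by '|'.
 * Output is silently truncated if buf is too small.  Returns buf.
 */
char *
decode_infomask(uint16_t infomask, char *buf, size_t buflen)
{
    size_t      i;

    assert(buflen > 0);
    buf[0] = '\0';
    for (i = 0; i < sizeof(flag_names) / sizeof(flag_names[0]); i++)
    {
        if ((infomask & flag_names[i].bit) == 0)
            continue;
        if (buf[0] != '\0')
            strncat(buf, "|", buflen - strlen(buf) - 1);
        strncat(buf, flag_names[i].name, buflen - strlen(buf) - 1);
    }
    return buf;
}
```

Under those assumed bit values, 0x3503 decodes to exactly the six flags shown in the dump. The point that matters here is that XMAX_INVALID (0x0800) is clear while XMAX_COMMITTED (0x0400) is set.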

As near as I can tell, the sequence of events was:

1. this row was updated by a transaction that stored the updated version
in lineindex 1, but later aborted.  t_ctid is left pointing to linp 1.

2. Some other transaction came along, marked the row FOR UPDATE, and
committed (with no actual update).

So we now have XMAX_COMMITTED and t_ctid != t_self, which looks way too
much like a tuple that's been updated, when in fact it is the latest
good version of its row.
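
The two steps above can be replayed on a toy model. This is only a sketch under stated assumptions: `ToyTuple`, `ToyItemPointer`, and the `toy_*` functions are hypothetical stand-ins rather than the backend's structures, the hint-bit values are assumed from the 7.2-era htup.h, and `toy_looks_updated` is a deliberate simplification of whatever test VACUUM really applies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hint-bit values assumed from the 7.2-era htup.h. */
#define HEAP_XMAX_COMMITTED     0x0400
#define HEAP_XMAX_INVALID       0x0800
#define HEAP_MARKED_FOR_UPDATE  0x1000

/* Hypothetical stand-ins for the item pointer and tuple header. */
typedef struct
{
    uint32_t    block;
    uint16_t    offset;
} ToyItemPointer;

typedef struct
{
    ToyItemPointer t_self;      /* where this tuple lives */
    ToyItemPointer t_ctid;      /* link to a newer version, or self */
    uint16_t       t_infomask;
} ToyTuple;

bool
toy_ptr_equal(ToyItemPointer a, ToyItemPointer b)
{
    return a.block == b.block && a.offset == b.offset;
}

/* Simplified "part of an update chain" test: committed xmax, ctid != self. */
bool
toy_looks_updated(const ToyTuple *tup)
{
    assert(tup != NULL);
    return (tup->t_infomask & HEAP_XMAX_COMMITTED) != 0 &&
        !toy_ptr_equal(tup->t_ctid, tup->t_self);
}

/* Step 1: an update stores a new version at new_loc, then aborts. */
void
toy_aborted_update(ToyTuple *tup, ToyItemPointer new_loc)
{
    assert(tup != NULL);
    tup->t_ctid = new_loc;      /* never reset after the abort */
    tup->t_infomask &= (uint16_t) ~(HEAP_XMAX_COMMITTED | HEAP_MARKED_FOR_UPDATE);
    tup->t_infomask |= HEAP_XMAX_INVALID;   /* xmax known to be aborted */
}

/* Step 2: a committed FOR UPDATE that leaves t_ctid untouched. */
void
toy_mark4update_old(ToyTuple *tup)
{
    assert(tup != NULL);
    tup->t_infomask &= (uint16_t) ~(HEAP_XMAX_COMMITTED | HEAP_XMAX_INVALID);
    tup->t_infomask |= HEAP_MARKED_FOR_UPDATE | HEAP_XMAX_COMMITTED;
}
```

Starting from a tuple at (block 1, item 4) whose t_ctid equals t_self, calling `toy_aborted_update` with item 1 and then `toy_mark4update_old` leaves XMAX_COMMITTED set and t_ctid pointing at item 1, so `toy_looks_updated` returns true even though no committed update ever happened.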

I think an appropriate fix would be to reset t_ctid to equal t_self
whenever we clear XMAX_INVALID, which in practice means heap_delete and
heap_mark4update need to do this.  (heap_update also clears
XMAX_INVALID, but of course it's setting t_ctid to point to the updated
tuple.)
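
A rough sketch of the proposed rule, again on a toy model rather than the real heapam code: every path that clears XMAX_INVALID also decides what t_ctid should be. The `Toy*` types and `toy_*` functions are hypothetical, and the hint-bit values are assumed from the 7.2-era htup.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hint-bit values assumed from the 7.2-era htup.h. */
#define HEAP_XMAX_COMMITTED     0x0400
#define HEAP_XMAX_INVALID       0x0800
#define HEAP_MARKED_FOR_UPDATE  0x1000

#define TOY_XMAX_BITS \
    (HEAP_XMAX_COMMITTED | HEAP_XMAX_INVALID | HEAP_MARKED_FOR_UPDATE)

/* Hypothetical stand-ins for the item pointer and tuple header. */
typedef struct
{
    uint32_t    block;
    uint16_t    offset;
} ToyItemPointer;

typedef struct
{
    ToyItemPointer t_self;
    ToyItemPointer t_ctid;
    uint16_t       t_infomask;
} ToyTuple;

bool
toy_ptr_equal(ToyItemPointer a, ToyItemPointer b)
{
    return a.block == b.block && a.offset == b.offset;
}

/* Simplified "part of an update chain" test: committed xmax, ctid != self. */
bool
toy_looks_updated(const ToyTuple *tup)
{
    assert(tup != NULL);
    return (tup->t_infomask & HEAP_XMAX_COMMITTED) != 0 &&
        !toy_ptr_equal(tup->t_ctid, tup->t_self);
}

/* FOR UPDATE: clears XMAX_INVALID, so reset t_ctid to t_self. */
void
toy_mark4update_fixed(ToyTuple *tup)
{
    assert(tup != NULL);
    tup->t_infomask &= (uint16_t) ~TOY_XMAX_BITS;
    tup->t_infomask |= HEAP_MARKED_FOR_UPDATE;
    tup->t_ctid = tup->t_self;
}

/* DELETE: clears XMAX_INVALID, so reset t_ctid to t_self. */
void
toy_delete_fixed(ToyTuple *tup)
{
    assert(tup != NULL);
    tup->t_infomask &= (uint16_t) ~TOY_XMAX_BITS;
    tup->t_ctid = tup->t_self;
}

/* UPDATE: clears XMAX_INVALID too, but t_ctid points at the new version. */
void
toy_update(ToyTuple *tup, ToyItemPointer new_loc)
{
    assert(tup != NULL);
    tup->t_infomask &= (uint16_t) ~TOY_XMAX_BITS;
    tup->t_ctid = new_loc;
}

/* Models the hint bit being set once the xmax transaction has committed. */
void
toy_xmax_committed(ToyTuple *tup)
{
    assert(tup != NULL);
    tup->t_infomask |= HEAP_XMAX_COMMITTED;
}
```

With this rule a stale t_ctid left behind by an aborted update is overwritten by the next delete or FOR UPDATE, so in the toy model only a real update can leave a tuple with XMAX_COMMITTED set and t_ctid != t_self.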

Comments?
        regards, tom lane


Re: heap_delete, heap_mark4update must reset t_ctid

From
Barry Lind
Date:
Tom,

When/if you have a patch for this, I would like to test it.  I still
have a copy of a database showing the same problem that I would like to
test this on when it is ready.

thanks,
--Barry

Tom Lane wrote:
> I have been looking at an example of the "no one parent tuple found"
> VACUUM error provided by Mario Weilguni.  It appears to me that VACUUM
> is getting confused by a tuple that looks like so in pg_filedump:
> 
>  Item   4 -- Length:  249  Offset: 31616 (0x7b80)  Flags: USED
>   OID: 0  CID: min(240) max(18)  XID: min(5691267) max(6484551)
>   Block Id: 1  linp Index: 1   Attributes: 38   Size: 40
>   infomask: 0x3503 (HASNULL|HASVARLENA|XMIN_COMMITTED|XMAX_COMMITTED|MARKED_FOR_UPDATE|UPDATED)
> 
> Notice that the t_ctid field is not pointing to this tuple, but to a
> different item on the same page (which in fact is an unused item).
> This causes VACUUM to believe that the tuple is part of an update chain.
> But in point of fact it is not part of a chain (indeed there are *no*
> chains in the test relation, thus leading to the observed failure).
> 
> As near as I can tell, the sequence of events was:
> 
> 1. this row was updated by a transaction that stored the updated version
> in lineindex 1, but later aborted.  t_ctid is left pointing to linp 1.
> 
> 2. Some other transaction came along, marked the row FOR UPDATE, and
> committed (with no actual update).
> 
> So we now have XMAX_COMMITTED and t_ctid != t_self, which looks way too
> much like a tuple that's been updated, when in fact it is the latest
> good version of its row.
> 
> I think an appropriate fix would be to reset t_ctid to equal t_self
> whenever we clear XMAX_INVALID, which in practice means heap_delete and
> heap_mark4update need to do this.  (heap_update also clears
> XMAX_INVALID, but of course it's setting t_ctid to point to the updated
> tuple.)
> 
> Comments?
> 
>             regards, tom lane
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
> 
> http://archives.postgresql.org
> 

Re: heap_delete, heap_mark4update must reset t_ctid

From
Bruce Momjian
Date:
Has this been fixed?  I think we did.

---------------------------------------------------------------------------

Tom Lane wrote:
> I have been looking at an example of the "no one parent tuple found"
> VACUUM error provided by Mario Weilguni.  It appears to me that VACUUM
> is getting confused by a tuple that looks like so in pg_filedump:
> 
>  Item   4 -- Length:  249  Offset: 31616 (0x7b80)  Flags: USED
>   OID: 0  CID: min(240) max(18)  XID: min(5691267) max(6484551)
>   Block Id: 1  linp Index: 1   Attributes: 38   Size: 40
>   infomask: 0x3503 (HASNULL|HASVARLENA|XMIN_COMMITTED|XMAX_COMMITTED|MARKED_FOR_UPDATE|UPDATED)
> 
> Notice that the t_ctid field is not pointing to this tuple, but to a
> different item on the same page (which in fact is an unused item).
> This causes VACUUM to believe that the tuple is part of an update chain.
> But in point of fact it is not part of a chain (indeed there are *no*
> chains in the test relation, thus leading to the observed failure).
> 
> As near as I can tell, the sequence of events was:
> 
> 1. this row was updated by a transaction that stored the updated version
> in lineindex 1, but later aborted.  t_ctid is left pointing to linp 1.
> 
> 2. Some other transaction came along, marked the row FOR UPDATE, and
> committed (with no actual update).
> 
> So we now have XMAX_COMMITTED and t_ctid != t_self, which looks way too
> much like a tuple that's been updated, when in fact it is the latest
> good version of its row.
> 
> I think an appropriate fix would be to reset t_ctid to equal t_self
> whenever we clear XMAX_INVALID, which in practice means heap_delete and
> heap_mark4update need to do this.  (heap_update also clears
> XMAX_INVALID, but of course it's setting t_ctid to point to the updated
> tuple.)
> 
> Comments?
> 
>             regards, tom lane

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
 


Re: heap_delete, heap_mark4update must reset t_ctid

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Has this been fixed?  I think we did.

Yes.
        regards, tom lane


Re: heap_delete, heap_mark4update must reset t_ctid

From
Bruce Momjian
Date:
As you can see, there is a lot of cruft left in my mailbox, but there
are some items that we left behind that may be fixable before 7.3.

---------------------------------------------------------------------------

Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Has this been fixed?  I think we did.
> 
> Yes.
> 
>             regards, tom lane

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073