Re: recovering from "found xmin ... from before relfrozenxid ..."

Поиск
Список
Период
Сортировка
От Ashutosh Sharma
Тема Re: recovering from "found xmin ... from before relfrozenxid ..."
Дата
Msg-id CAE9k0P=FaHvXZnsOeGDK_30=2DQtUD4vCg52TX3hmw+gbHUbkw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: recovering from "found xmin ... from before relfrozenxid ..."  (Masahiko Sawada <masahiko.sawada@2ndquadrant.com>)
Ответы Re: recovering from "found xmin ... from before relfrozenxid ..."  (Masahiko Sawada <masahiko.sawada@2ndquadrant.com>)
Список pgsql-hackers
On Wed, Aug 19, 2020 at 3:55 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> On Wed, 19 Aug 2020 at 15:09, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> >
> > On Wed, Aug 19, 2020 at 9:27 AM Masahiko Sawada
> > <masahiko.sawada@2ndquadrant.com> wrote:
> > >
> > > On Mon, 17 Aug 2020 at 15:05, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> > > >
> > > > > pg_force_freeze() can revival a tuple that is already deleted but not
> > > > > vacuumed yet. Therefore, the user might need to reindex indexes after
> > > > > using that function. For instance, with the following script, the last
> > > > > two queries: index scan and seq scan, will return different results.
> > > > >
> > > > > set enable_seqscan to off;
> > > > > set enable_bitmapscan to off;
> > > > > set enable_indexonlyscan to off;
> > > > > create table tbl (a int primary key);
> > > > > insert into tbl values (1);
> > > > >
> > > > > update tbl set a = a + 100 where a = 1;
> > > > >
> > > > > explain analyze select * from tbl where a < 200;
> > > > >
> > > > > -- revive deleted tuple on heap
> > > > > select heap_force_freeze('tbl', array['(0,1)'::tid]);
> > > > >
> > > > > -- index scan returns 2 tuples
> > > > > explain analyze select * from tbl where a < 200;
> > > > >
> > > > > -- seq scan returns 1 tuple
> > > > > set enable_seqscan to on;
> > > > > explain analyze select * from tbl;
> > > > >
> > > >
> > > > I am not sure if this is the right use-case of pg_force_freeze
> > > > function. I think we should only be running pg_force_freeze function
> > > > on a tuple for which VACUUM reports "found xmin ABC from before
> > > > relfrozenxid PQR" sort of error otherwise it might worsen the things
> > > > instead of making it better.
> > >
> > > Should this also be documented? I think that it's hard to force the
> > > user to always use this module in the right situation but we need to
> > > show at least when to use.
> > >
> >
> > I've already added some examples in the documentation explaining the
> > use-case of force_freeze function. If required, I will also add a note
> > about it.
> >
> > > > > Also, if a tuple updated and moved to another partition is revived by
> > > > > heap_force_freeze(), its ctid still has special values:
> > > > > MovedPartitionsOffsetNumber and MovedPartitionsBlockNumber. I don't
> > > > > see a problem yet caused by a visible tuple having the special ctid
> > > > > value, but it might be worth considering either to reset ctid value as
> > > > > well or to not freezing already-deleted tuple.
> > > > >
> > > >
> > > > For this as well, the answer remains the same as above.
> > >
> > > Perhaps the same is true when a tuple header is corrupted including
> > > xmin and ctid for some reason and the user wants to fix it? I'm
> > > concerned that a live tuple having the wrong ctid will cause SEGV or
> > > PANIC error in the future.
> > >
> >
> > If a tuple header itself is corrupted, then I think we must kill that
> > tuple. If only xmin and t_ctid fields are corrupted, then probably we
> > can think of resetting the ctid value of that tuple. However, it won't
> > be always possible to detect the corrupted ctid value. It's quite
> > possible that the corrupted ctid field has valid values for block
> > number and offset number in it, but it's actually corrupted and it
> > would be difficult to consider such ctid as corrupted. Hence, we can't
> > do anything about such types of corruption. Probably in such cases we
> > need to run VACUUM FULL on such tables so that new ctid gets generated
> > for each tuple in the table.
>
> Understood.
>
> Perhaps such corruption will be able to be detected by new heapam
> check functions discussed on another thread. My point was that it
> might be better to attempt making the tuple header sane state as much
> as possible when fixing a live tuple in order to prevent further
> problems such as databases crash by malicious attackers.
>

Agreed. So, to handle the ctid related concern that you raised, I'm
planning to do the following changes to ensure that the tuple being
marked as frozen contains the correct item pointer value. Please let
me know if you are okay with these changes.

                               HeapTupleHeader htup;
+                              ItemPointerData ctid;

                               Assert(heap_force_opt == HEAP_FORCE_FREEZE);

+                              ItemPointerSet(&ctid, blkno, offnos[i]);
+
                               htup = (HeapTupleHeader)
PageGetItem(page, itemid);

+                               /*
+                                * Make sure that this tuple holds the
correct item pointer
+                                * value.
+                                */
+                               if
(!HeapTupleHeaderIndicatesMovedPartitions(htup) &&
+                                   !ItemPointerEquals(&ctid, &htup->t_ctid))
+                                   ItemPointerSet(&htup->t_ctid,
blkno, offnos[i]);
+
                                HeapTupleHeaderSetXmin(htup,
FrozenTransactionId);
                                HeapTupleHeaderSetXmax(htup,
InvalidTransactionId);

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Thomas Munro
Дата:
Сообщение: Re: Optimising compactify_tuples()
Следующее
От: Rahila Syed
Дата:
Сообщение: Re: More tests with USING INDEX replident and dropped indexes