Discussion: Re: [COMMITTERS] pgsql: Introduce WAL records to log reuse of btree pages, allowing

Re: [COMMITTERS] pgsql: Introduce WAL records to log reuse of btree pages, allowing

From: Heikki Linnakangas
Date:
Simon Riggs wrote:
> Introduce WAL records to log reuse of btree pages, allowing conflict
> resolution during Hot Standby. Page reuse interlock requested by Tom.
> Analysis and patch by me.

There's still a theoretical possibility for this to happen:

1. A page is marked as deleted by VACUUM, setting xact field in the opaque
2. Master crashes. WAL replay replays the XLOG_BTREE_DELETE_PAGE record.
It resets the xact field to FrozenTransactionId
3. The page is recycled. This writes a XLOG_BTREE_REUSE_PAGE record with
FrozenTransactionId as latestRemovedXid

When the standby replays that, it will call
ResolveRecoveryConflictWithSnapshot with FrozenTransactionId, not the
original xid that was used in the master when the page was deleted.

A straightforward way to fix that is to WAL-log the real xid in the
XLOG_BTREE_DELETE_PAGE records, instead of resetting it to
FrozenTransactionId.
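
The scenario and the proposed fix can be sketched as a small self-contained
C model. This is only an illustration, not the PostgreSQL source: the
SketchPage and SketchDeleteRecord types, the has_xact flag and all sketch_*
functions are hypothetical names invented here. Only the TransactionId type
and the FrozenTransactionId value (2) mirror PostgreSQL.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;
#define FrozenTransactionId ((TransactionId) 2)

/* Hypothetical stand-in for the xact field kept in a btree page's opaque area. */
typedef struct SketchPage {
    bool          deleted;
    TransactionId xact;
} SketchPage;

/* Hypothetical stand-in for an XLOG_BTREE_DELETE_PAGE record. */
typedef struct SketchDeleteRecord {
    bool          has_xact;   /* true only when the real xid is WAL-logged */
    TransactionId xact;
} SketchDeleteRecord;

/* Step 1: VACUUM on the master marks the page deleted and stamps an xid. */
SketchDeleteRecord sketch_log_delete(SketchPage *page, TransactionId xid,
                                     bool log_real_xid)
{
    SketchDeleteRecord rec;

    page->deleted = true;
    page->xact = xid;
    rec.has_xact = log_real_xid;
    rec.xact = log_real_xid ? xid : 0;
    return rec;
}

/* Step 2: crash recovery rebuilds the page from the WAL record alone. */
void sketch_replay_delete(SketchPage *page, const SketchDeleteRecord *rec)
{
    page->deleted = true;
    /* Without the logged xid, replay can only fall back to the frozen value. */
    page->xact = rec->has_xact ? rec->xact : FrozenTransactionId;
}

/* Step 3: the xid a later reuse record would carry as latestRemovedXid. */
TransactionId sketch_latest_removed_after_crash(TransactionId xid,
                                                bool log_real_xid)
{
    SketchPage page = { false, 0 };
    SketchDeleteRecord rec = sketch_log_delete(&page, xid, log_real_xid);

    page = (SketchPage) { false, 0 };   /* crash: in-memory page state is lost */
    sketch_replay_delete(&page, &rec);
    return page.xact;
}
```

In this model, sketch_latest_removed_after_crash(1000, false) yields
FrozenTransactionId, which tells the standby nothing about which snapshots
conflict, while sketch_latest_removed_after_crash(1000, true) yields the
original xid 1000.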

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com


Re: [COMMITTERS] pgsql: Introduce WAL records to log reuse of btree pages, allowing

From: Simon Riggs
Date:
On Thu, 2010-02-18 at 14:23 +0200, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > Introduce WAL records to log reuse of btree pages, allowing conflict
> > resolution during Hot Standby. Page reuse interlock requested by Tom.
> > Analysis and patch by me.
> 
> There's still a theoretical possibility for this to happen:
> 
> 1. A page is marked as deleted by VACUUM, setting xact field in the opaque
> 2. Master crashes. WAL replay replays the XLOG_BTREE_DELETE_PAGE record.
> It resets the xact field to FrozenTransactionId
> 3. The page is recycled. This writes a XLOG_BTREE_REUSE_PAGE record with
> FrozenTransactionId as latestRemovedXid
> 
> When the standby replays that, it will call
> ResolveRecoveryConflictWithSnapshot with FrozenTransactionId, not the
> original xid that was used in the master when the page was deleted.

> A straightforward way to fix that is to WAL-log the real xid in the
> XLOG_BTREE_DELETE_PAGE records, instead of resetting it to
> FrozenTransactionId.

An even simpler way would be to reset the value to latestCompletedXid
during btree_xlog_delete_page(). That touches less code. I doubt it will
make much difference to conflict recovery, since if pages are being
deleted then btree delete records are likely to be frequent and will
have already killed long running queries.

-- Simon Riggs           www.2ndQuadrant.com



Re: Re: [COMMITTERS] pgsql: Introduce WAL records to log reuse of btree pages, allowing

From: Tom Lane
Date:
Simon Riggs <simon@2ndQuadrant.com> writes:
> On Thu, 2010-02-18 at 14:23 +0200, Heikki Linnakangas wrote:
>> A straightforward way to fix that is to WAL-log the real xid in the
>> XLOG_BTREE_DELETE_PAGE records, instead of resetting it to
>> FrozenTransactionId.

> An even simpler way would be to reset the value to latestCompletedXid
> during btree_xlog_delete_page(). That touches less code. I doubt it will
> make much difference to conflict recovery, since if pages are being
> deleted then btree delete records are likely to be frequent and will
> have already killed long running queries.

I'm a bit concerned about XID wraparound if the value doesn't get reset
to FrozenTransactionId.  There's no guarantee the page will get reused
promptly ...
        regards, tom lane
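
The wraparound concern can be illustrated with a minimal sketch of
modulo-2^32 xid comparison, written in the style of PostgreSQL's
TransactionIdPrecedes(). The function names sketch_xid_precedes and
sketch_page_looks_recyclable are hypothetical, and treating "the page's xid
precedes the oldest xid still of interest" as the recyclability test is a
simplifying assumption made for this example.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;
#define FrozenTransactionId      ((TransactionId) 2)
#define FirstNormalTransactionId ((TransactionId) 3)

/*
 * Circular comparison for normal xids: a precedes b when the signed 32-bit
 * difference is negative. Special xids (below FirstNormalTransactionId)
 * compare as plain unsigned values, so they precede every normal xid.
 */
bool sketch_xid_precedes(TransactionId a, TransactionId b)
{
    if (a < FirstNormalTransactionId || b < FirstNormalTransactionId)
        return a < b;
    return (int32_t) (a - b) < 0;
}

/* Assumed recyclability test: the stamped xid is older than every xid of interest. */
bool sketch_page_looks_recyclable(TransactionId page_xact,
                                  TransactionId oldest_xid_of_interest)
{
    return sketch_xid_precedes(page_xact, oldest_xid_of_interest);
}
```

With a page stamped at xid 1000, the page looks recyclable once the oldest
xid of interest reaches 1100. If the page is left untouched until more than
2^31 further xids have been consumed, the same stamp compares as being in the
future and the page no longer looks recyclable. A page stamped with
FrozenTransactionId looks recyclable at any normal xid.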


Re: Re: [COMMITTERS] pgsql: Introduce WAL records to log reuse of btree pages, allowing

From: Simon Riggs
Date:
On Thu, 2010-02-18 at 14:17 -0500, Tom Lane wrote:
> Simon Riggs <simon@2ndQuadrant.com> writes:
> > On Thu, 2010-02-18 at 14:23 +0200, Heikki Linnakangas wrote:
> >> A straightforward way to fix that is to WAL-log the real xid in the
> >> XLOG_BTREE_DELETE_PAGE records, instead of resetting it to
> >> FrozenTransactionId.
> 
> > An even simpler way would be to reset the value to latestCompletedXid
> > during btree_xlog_delete_page(). That touches less code. I doubt it will
> > make much difference to conflict recovery, since if pages are being
> > deleted then btree delete records are likely to be frequent and will
> > have already killed long running queries.
> 
> I'm a bit concerned about XID wraparound if the value doesn't get reset
> to FrozenTransactionId.  There's no guarantee the page will get reused
> promptly ...

I'd be very interested for you to have a look at Hot Standby from a
transaction wraparound perspective. There was some code in there to
handle anti-wraparound in RecordKnownAssignedTransactionId() but it was
removed, though I'm a little hazy on that myself. You've got the best
nose for corner cases and risks.

In this case, I don't see any problem. The xid after recovery will be the
same as, or higher than, the value it would have had if the crash had never
taken place, so I can't see any risk that isn't already addressed.

Since we now have to handle cases where blocks have been touched in
pre-9.0 code and are in a state they could never get into in 9.0, we do
still have to handle a value of btpo.xact == FrozenTransactionId. I will
add a special case to the handling of XLOG_BTREE_REUSE_PAGE records also
to allow for that.
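
One possible shape for such a special case is sketched below. It is an
assumption made for illustration, not the committed code: the function name
sketch_reuse_conflicts is hypothetical, and the rule that a snapshot
conflicts when its xmin is at or before latestRemovedXid is a simplified
model of conflict resolution on the standby.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;
#define FrozenTransactionId      ((TransactionId) 2)
#define FirstNormalTransactionId ((TransactionId) 3)

/*
 * Decide whether replaying a page-reuse record should cancel a standby
 * query whose snapshot has the given xmin.
 */
bool sketch_reuse_conflicts(TransactionId latestRemovedXid,
                            TransactionId snapshot_xmin)
{
    /*
     * Special case: a frozen (or otherwise non-normal) xid, as may be found
     * on pages written by pre-9.0 code, carries no usable ordering
     * information, so no conflict is raised for it in this sketch.
     */
    if (latestRemovedXid < FirstNormalTransactionId)
        return false;

    /* Modulo-2^32 test for snapshot_xmin <= latestRemovedXid. */
    return (int32_t) (snapshot_xmin - latestRemovedXid) <= 0;
}
```

Under these assumptions, a reuse record carrying FrozenTransactionId cancels
nothing, a record carrying xid 1000 conflicts with a snapshot whose xmin is
500, and it does not conflict with a snapshot whose xmin is 1001.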

Any similar theoretical issues would be most welcome if reported.

-- Simon Riggs           www.2ndQuadrant.com



Re: [COMMITTERS] pgsql: Introduce WAL records to log reuse of btree pages, allowing

From: Simon Riggs
Date:
On Thu, 2010-02-18 at 14:23 +0200, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > Introduce WAL records to log reuse of btree pages, allowing conflict
> > resolution during Hot Standby. Page reuse interlock requested by Tom.
> > Analysis and patch by me.
> 
> There's still a theoretical possibility for this to happen:
> 
> 1. A page is marked as deleted by VACUUM, setting xact field in the opaque
> 2. Master crashes. WAL replay replays the XLOG_BTREE_DELETE_PAGE record.
> It resets the xact field to FrozenTransactionId
> 3. The page is recycled. This writes a XLOG_BTREE_REUSE_PAGE record with
> FrozenTransactionId as latestRemovedXid
> 
> When the standby replays that, it will call
> ResolveRecoveryConflictWithSnapshot with FrozenTransactionId, not the
> original xid that was used in the master when the page was deleted.
> 
> A straightforward way to fix that is to WAL-log the real xid in the
> XLOG_BTREE_DELETE_PAGE records, instead of resetting it to
> FrozenTransactionId.

Bug accepted, proposal implemented and committed.

-- Simon Riggs           www.2ndQuadrant.com