Обсуждение: Re: [COMMITTERS] pgsql: Change how first WAL segment on new timeline after promotion is

Поиск
Список
Период
Сортировка

Re: [COMMITTERS] pgsql: Change how first WAL segment on new timeline after promotion is

От
Heikki Linnakangas
Дата:
On 01/03/2015 08:59 PM, Andres Freund wrote:
> Hi Heikki,
>
> While writing a test script for
> http://archives.postgresql.org/message-id/20141205002854.GE21964%40awork2.anarazel.de
> I noticed that this commit broke starting a pg_basebackup -X * without a
> recovery.conf present. Which might not be the best idea, but imo is a
> perfectly valid thing to do.
>
> To me the changes to StartupXLOG() in that commit look a bit bogus. The
> new startLogSegNo is initialized to XLByteToSeg(EndOfLog)? Which points
> to the end of the record +1? Which thus isn't guaranteed to exist as a
> segment (e.g. never if the last record was a XLOG_SWITCH).

Ah, good point.

> Did you perhaps intend to use XLogFileInit(use_existing = true)
> instead of XLogFileOpen()? That works for me.

Hmm, that doesn't sound right either. XLogFileInit is used when you
switch to a new segment, not to open an old segment for writing. It
happens to work, because with use_existing = true it will in fact always
open the old segment, instead of creating a new one, but I don't think
that's in the spirit of how that function's intended to be used.

A very simple fix is to not try opening the segment at all. There isn't
actually any requirement to have the segment open at that point,
XLogWrite() will open it the first time the WAL is flushed. The WAL is
flushed after writing the initial checkpoint or end-of-recovery record,
which happens pretty soon anyway. Any objections to the attached?

> I've attached my preliminary testscript (note it's really not that
> interesting at this point) that reliably reproduces the problem for me.

Thanks!

- Heikki

Вложения

Re: [COMMITTERS] pgsql: Change how first WAL segment on new timeline after promotion is

От
Andres Freund
Дата:
On 2015-01-05 18:45:22 +0200, Heikki Linnakangas wrote:
> On 01/03/2015 08:59 PM, Andres Freund wrote:
> >Hi Heikki,
> >
> >While writing a test script for
> >http://archives.postgresql.org/message-id/20141205002854.GE21964%40awork2.anarazel.de
> >I noticed that this commit broke starting a pg_basebackup -X * without a
> >recovery.conf present. Which might not be the best idea, but imo is a
> >perfectly valid thing to do.
> >
> >To me the changes to StartupXLOG() in that commit look a bit bogus. The
> >new startLogSegNo is initialized to XLByteToSeg(EndOfLog)? Which points
> >to the end of the record +1? Which thus isn't guaranteed to exist as a
> >segment (e.g. never if the last record was a XLOG_SWITCH).
> 
> Ah, good point.
> 
> >Did you perhaps intend to use XLogFileInit(use_existing = true)
> >instead of XLogFileOpen()? That works for me.
> 
> Hmm, that doesn't sound right either. XLogFileInit is used when you switch
> to a new segment, not to open an old segment for writing. It happens to
> work, because with use_existing = true it will in fact always open the old
> segment, instead of creating a new one, but I don't think that's in the
> spirit of how that function's intended to be used.

Well, its docs say "Create a new XLOG file segment, or open a
pre-existing one.", so it's not that mismatched. We really don't know
whether the EndOfLog's segment already exist in this scenario.  Also,
doesn't XLogWrite() essentially use it in the same way?

> A very simple fix is to not try opening the segment at all. There isn't
> actually any requirement to have the segment open at that point, XLogWrite()
> will open it the first time the WAL is flushed. The WAL is flushed after
> writing the initial checkpoint or end-of-recovery record, which happens
> pretty soon anyway. Any objections to the attached?

Sounds like a good plan.

> >I've attached my preliminary testscript (note it's really not that
> >interesting at this point) that reliably reproduces the problem for me.
> 
> Thanks!

I've attached attached, for posterities sake, the last version of that
script. I think we should have at least some of these tests in the tap
tests. I'd not used that framework because I wanted to test back to 9.1,
but ...

> diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
> index 5cc7e47..e623463 100644
> --- a/src/backend/access/transam/xlog.c
> +++ b/src/backend/access/transam/xlog.c
> @@ -5646,7 +5646,6 @@ StartupXLOG(void)
>      XLogRecPtr    RecPtr,
>                  checkPointLoc,
>                  EndOfLog;
> -    XLogSegNo    startLogSegNo;
>      TimeLineID    PrevTimeLineID;
>      XLogRecord *record;
>      TransactionId oldestActiveXID;
> @@ -6633,7 +6632,6 @@ StartupXLOG(void)
>       */
>      record = ReadRecord(xlogreader, LastRec, PANIC, false);
>      EndOfLog = EndRecPtr;
> -    XLByteToSeg(EndOfLog, startLogSegNo);
>  
>      /*
>       * Complain if we did not roll forward far enough to render the backup
> @@ -6741,9 +6739,6 @@ StartupXLOG(void)
>       * buffer cache using the block containing the last record from the
>       * previous incarnation.
>       */
> -    openLogSegNo = startLogSegNo;
> -    openLogFile = XLogFileOpen(openLogSegNo);
> -    openLogOff = 0;
>      Insert = &XLogCtl->Insert;
>      Insert->PrevBytePos = XLogRecPtrToBytePos(LastRec);
>      Insert->CurrBytePos = XLogRecPtrToBytePos(EndOfLog);

The comment above could use some minor word smithing - the second part
of it doesn't really seem to belong there.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: Re: [COMMITTERS] pgsql: Change how first WAL segment on new timeline after promotion is

От
Robert Haas
Дата:
On Tue, Jan 6, 2015 at 8:22 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> I've attached attached, for posterities sake, the last version of that
> script. I think we should have at least some of these tests in the tap
> tests. I'd not used that framework because I wanted to test back to 9.1,
> but ...

I don't see an attachment.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Re: [COMMITTERS] pgsql: Change how first WAL segment on new timeline after promotion is

От
Andres Freund
Дата:
On 2015-01-06 13:46:57 -0500, Robert Haas wrote:
> On Tue, Jan 6, 2015 at 8:22 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> > I've attached attached, for posterities sake, the last version of that
> > script. I think we should have at least some of these tests in the tap
> > tests. I'd not used that framework because I wanted to test back to 9.1,
> > but ...
>
> I don't see an attachment.

Oops.

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Вложения

Re: [COMMITTERS] pgsql: Change how first WAL segment on new timeline after promotion is

От
Heikki Linnakangas
Дата:
On 01/06/2015 03:22 PM, Andres Freund wrote:
> On 2015-01-05 18:45:22 +0200, Heikki Linnakangas wrote:
>> On 01/03/2015 08:59 PM, Andres Freund wrote:
>>> Did you perhaps intend to use XLogFileInit(use_existing = true)
>>> instead of XLogFileOpen()? That works for me.
>>
>> Hmm, that doesn't sound right either. XLogFileInit is used when you switch
>> to a new segment, not to open an old segment for writing. It happens to
>> work, because with use_existing = true it will in fact always open the old
>> segment, instead of creating a new one, but I don't think that's in the
>> spirit of how that function's intended to be used.
>
> Well, its docs say "Create a new XLOG file segment, or open a
> pre-existing one.", so it's not that mismatched. We really don't know
> whether the EndOfLog's segment already exist in this scenario.  Also,
> doesn't XLogWrite() essentially use it in the same way?

XLogWrite() is smarter. It uses XLogFileInit() when switching to a new 
segment, and XLogFileOpen when writing to the middle of a segment.

Committed the fix to not open the segment at all.

- Heikki