Re: Improve WALRead() to suck data directly from WAL buffers when possible

Поиск
Список
Период
Сортировка
От Bharath Rupireddy
Тема Re: Improve WALRead() to suck data directly from WAL buffers when possible
Дата
Msg-id CALj2ACV+pmifCOktKrr+nYuZSo1R-Ue2TA-iuzTGwpHGw=EQJg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Improve WALRead() to suck data directly from WAL buffers when possible  (Jeff Davis <pgsql@j-davis.com>)
Список pgsql-hackers
On Sun, Nov 5, 2023 at 2:57 AM Jeff Davis <pgsql@j-davis.com> wrote:
>
> On Sat, 2023-11-04 at 20:55 +0530, Bharath Rupireddy wrote:
> > +             XLogRecPtr      EndPtr =
> > pg_atomic_read_u64(&XLogCtl->xlblocks[curridx]);
> > +
> > +             /*
> > +              * xlblocks value can be InvalidXLogRecPtr before
> > the new WAL buffer
> > +              * page gets initialized in AdvanceXLInsertBuffer.
> > In such a case
> > +              * re-read the xlblocks value under the lock to
> > ensure the correct
> > +              * value is read.
> > +              */
> > +             if (unlikely(XLogRecPtrIsInvalid(EndPtr)))
> > +             {
> > +                     LWLockAcquire(WALBufMappingLock,
> > LW_EXCLUSIVE);
> > +                     EndPtr = pg_atomic_read_u64(&XLogCtl-
> > >xlblocks[curridx]);
> > +                     LWLockRelease(WALBufMappingLock);
> > +             }
> > +
> > +             Assert(!XLogRecPtrIsInvalid(EndPtr));
>
> Can that really happen? If the EndPtr is invalid, that means the page
> is in the process of being cleared, so the contents of the page are
> undefined at that time, right?

My initial thoughts were this way - xlblocks is being read without
holding WALBufMappingLock in XLogWrite() and since we write
InvalidXLogRecPtr to xlblocks array elements temporarily before
MemSet-ting the page in AdvanceXLInsertBuffer(), the PANIC "xlog write
request %X/%X is past end of log %X/%X" might get hit if EndPtr read
from xlblocks is InvalidXLogRecPtr. FWIW, an Assert(false); within the
if (unlikely(XLogRecPtrIsInvalid(EndPtr))) block didn't hit in make
check-world.

It looks like my above understanding isn't correct because it can
never happen that the page that's being written to the WAL file gets
initialized in AdvanceXLInsertBuffer(). I'll remove this piece of code
in next version of the patch unless there are any other thoughts.

[1]
    /*
     * Within the loop, curridx is the cache block index of the page to
     * consider writing.  Begin at the buffer containing the next unwritten
     * page, or last partially written page.
     */
    curridx = XLogRecPtrToBufIdx(LogwrtResult.Write);

    while (LogwrtResult.Write < WriteRqst.Write)
    {
        /*
         * Make sure we're not ahead of the insert process.  This could happen
         * if we're passed a bogus WriteRqst.Write that is past the end of the
         * last page that's been initialized by AdvanceXLInsertBuffer.
         */
        XLogRecPtr  EndPtr = pg_atomic_read_u64(&XLogCtl->xlblocks[curridx]);

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Bharath Rupireddy
Дата:
Сообщение: Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: pg_upgrade and logical replication