Обсуждение: [PATCH] XLogReadRecord returns pointer to currently read page

Поиск
Список
Период
Сортировка

[PATCH] XLogReadRecord returns pointer to currently read page

От
Andrey Lepikhov
Дата:
Hi, hackers!

I propose the patch for fix one small code defect.
The XLogReadRecord() function reads the pages of a WAL segment that 
contain a WAL-record. Then it creates a readRecordBuf buffer in private 
memory of a backend and copy record from the pages to the readRecordBuf 
buffer. Pointer 'record' is set to the beginning of the readRecordBuf 
buffer.

But if the WAL-record is fully placed on one page, the XLogReadRecord() 
function forgets to bind the "record" pointer with the beginning of the 
readRecordBuf buffer. In this case, XLogReadRecord() returns a pointer 
to an internal xlog page. This patch fixes the defect.

Previously, in all cases of using WAL this was not a problem. However if 
you plan to perform some decoding operations before returning the WAL 
record to the caller (this is my case), this can lead to bugs that are 
difficult to catch.


-- 
Andrey Lepikhov
Postgres Professional
https://postgrespro.com
The Russian Postgres Company

Вложения

Re: [PATCH] XLogReadRecord returns pointer to currently read page

От
Michael Paquier
Дата:
On Fri, Aug 17, 2018 at 08:47:15AM +0500, Andrey Lepikhov wrote:
> Previously, in all cases of using WAL this was not a problem. However if you
> plan to perform some decoding operations before returning the WAL record to
> the caller (this is my case), this can lead to bugs that are difficult to
> catch.

What kind of cases with logical decoding are you referring to?  If
any of them can be reproduced with upstream, could you send a
reproducer, or a way to see it?
--
Michael

Вложения

Re: [PATCH] XLogReadRecord returns pointer to currently read page

От
Andrey Lepikhov
Дата:

17.08.2018 08:55, Michael Paquier пишет:
> On Fri, Aug 17, 2018 at 08:47:15AM +0500, Andrey Lepikhov wrote:
>> Previously, in all cases of using WAL this was not a problem. However if you
>> plan to perform some decoding operations before returning the WAL record to
>> the caller (this is my case), this can lead to bugs that are difficult to
>> catch.
> 
> What kind of cases with logical decoding are you referring to?  If
> any of them can be reproduced with upstream, could you send a
> reproducer, or a way to see it?
> --
> Michael
> 

I'm working on the problem of a WAL-record size reducing. One of the 
PostgreSQL experimental extensions uses a 64-bit xid, and the XLogRecord 
size is increased to 32 bytes.
Representing a 64-bit xid in WAL Segment as a pair (16-bit base, 8-bit 
offset) reduces the size of the WAL record header and its body.
So, I encode a WAL-record in XLogInsertRecord() (decreasing the size) 
just before writing to a shared memory pages and performs the decoding 
just before processing DecodeXLogRecord() operation.
It looks like a layer in the xlog subsystem for additional WAL 
compression. However, this is a long story ...

-- 
Andrey Lepikhov
Postgres Professional
https://postgrespro.com
The Russian Postgres Company


Re: [PATCH] XLogReadRecord returns pointer to currently read page

От
Heikki Linnakangas
Дата:
On 17/08/2018 06:47, Andrey Lepikhov wrote:
> I propose the patch for fix one small code defect.
> The XLogReadRecord() function reads the pages of a WAL segment that
> contain a WAL-record. Then it creates a readRecordBuf buffer in private
> memory of a backend and copy record from the pages to the readRecordBuf
> buffer. Pointer 'record' is set to the beginning of the readRecordBuf
> buffer.
> 
> But if the WAL-record is fully placed on one page, the XLogReadRecord()
> function forgets to bind the "record" pointer with the beginning of the
> readRecordBuf buffer. In this case, XLogReadRecord() returns a pointer
> to an internal xlog page. This patch fixes the defect.

Hmm. I agree it looks weird the way it is. But I wonder, why do we even 
copy the record to readRecordBuf, if it fits on the WAL page? Returning 
a pointer to the xlog page buffer seems OK in that case. What if we 
remove the memcpy(), instead?

- Heikki


Re: [PATCH] XLogReadRecord returns pointer to currently read page

От
Andrey Lepikhov
Дата:

On 22.10.2018 2:06, Heikki Linnakangas wrote:
> On 17/08/2018 06:47, Andrey Lepikhov wrote:
>> I propose the patch for fix one small code defect.
>> The XLogReadRecord() function reads the pages of a WAL segment that
>> contain a WAL-record. Then it creates a readRecordBuf buffer in private
>> memory of a backend and copy record from the pages to the readRecordBuf
>> buffer. Pointer 'record' is set to the beginning of the readRecordBuf
>> buffer.
>>
>> But if the WAL-record is fully placed on one page, the XLogReadRecord()
>> function forgets to bind the "record" pointer with the beginning of the
>> readRecordBuf buffer. In this case, XLogReadRecord() returns a pointer
>> to an internal xlog page. This patch fixes the defect.
> 
> Hmm. I agree it looks weird the way it is. But I wonder, why do we even 
> copy the record to readRecordBuf, if it fits on the WAL page? Returning 
> a pointer to the xlog page buffer seems OK in that case. What if we 
> remove the memcpy(), instead?

It depends on the PostgreSQL roadmap. If we want to compress main data 
and encode something to reduce a WAL-record size, than the 
representation of the WAL-record in shared buffers (packed) and in local 
memory (unpacked) will be different and the patch is needed.
If something like this should not be in the PostgreSQL core, then it is 
more efficient to remove memcpy(), of course.
I vote for the patch.

-- 
Andrey Lepikhov
Postgres Professional
https://postgrespro.com
The Russian Postgres Company


Re: [PATCH] XLogReadRecord returns pointer to currently read page

От
Heikki Linnakangas
Дата:
On 22/10/2018 20:54, Andrey Lepikhov wrote:
> 
> 
> On 22.10.2018 2:06, Heikki Linnakangas wrote:
>> On 17/08/2018 06:47, Andrey Lepikhov wrote:
>>> I propose the patch for fix one small code defect.
>>> The XLogReadRecord() function reads the pages of a WAL segment that
>>> contain a WAL-record. Then it creates a readRecordBuf buffer in private
>>> memory of a backend and copy record from the pages to the readRecordBuf
>>> buffer. Pointer 'record' is set to the beginning of the readRecordBuf
>>> buffer.
>>>
>>> But if the WAL-record is fully placed on one page, the XLogReadRecord()
>>> function forgets to bind the "record" pointer with the beginning of the
>>> readRecordBuf buffer. In this case, XLogReadRecord() returns a pointer
>>> to an internal xlog page. This patch fixes the defect.
>>
>> Hmm. I agree it looks weird the way it is. But I wonder, why do we even
>> copy the record to readRecordBuf, if it fits on the WAL page? Returning
>> a pointer to the xlog page buffer seems OK in that case. What if we
>> remove the memcpy(), instead?
> 
> It depends on the PostgreSQL roadmap. If we want to compress main data
> and encode something to reduce a WAL-record size, than the
> representation of the WAL-record in shared buffers (packed) and in local
> memory (unpacked) will be different and the patch is needed.

I'd expect the decompression to read from the on-disk buffer, and unpack 
to readRecordBuf, I still don't see a need to copy the packed record to 
readRecordBuf. If there is a need for that, though, the patch that 
implements the packing or compression can add the memcpy() where it 
needs it.

- Heikki


Re: [PATCH] XLogReadRecord returns pointer to currently read page

От
Andrey Lepikhov
Дата:
On 23.10.2018 0:53, Heikki Linnakangas wrote:
> I'd expect the decompression to read from the on-disk buffer, and unpack 
> to readRecordBuf, I still don't see a need to copy the packed record to 
> readRecordBuf. If there is a need for that, though, the patch that 
> implements the packing or compression can add the memcpy() where it 
> needs it.

I agree with it. Eventually, placement of the WAL-record can be defined 
by comparison the record, readBuf and readRecordBuf pointers.
In attachment new version of the patch.

-- 
Andrey Lepikhov
Postgres Professional
https://postgrespro.com
The Russian Postgres Company

Вложения

Re: [PATCH] XLogReadRecord returns pointer to currently read page

От
Kyotaro HORIGUCHI
Дата:
Hello.

At Tue, 23 Oct 2018 10:25:27 +0500, Andrey Lepikhov <a.lepikhov@postgrespro.ru> wrote in
<2553f2b0-0e39-eb0e-d382-6c0ed08ca8ae@postgrespro.ru>
> 
> On 23.10.2018 0:53, Heikki Linnakangas wrote:
> > I'd expect the decompression to read from the on-disk buffer, and
> > unpack to readRecordBuf, I still don't see a need to copy the packed
> > record to readRecordBuf. If there is a need for that, though, the
> > patch that implements the packing or compression can add the memcpy()
> > where it needs it.
> 
> I agree with it. Eventually, placement of the WAL-record can be
> defined by comparison the record, readBuf and readRecordBuf pointers.
> In attachment new version of the patch.

This looks quite clear and what it does is reasonable to me.
Applies cleanly on top of current master and no regression seen.


Just one comment. This patch leaves the following code.

 >        /* Record does not cross a page boundary */
 >        if (!ValidXLogRecord(state, record, RecPtr))
 >            goto err;
 >
 >        state->EndRecPtr = RecPtr + MAXALIGN(total_len);
!>
 >        state->ReadRecPtr = RecPtr;
 >    }

The empty line (marked by '!') looks a bit too much standing out
after this patch. Could you remove the line? Then could you
transpose the two assignments if you don't mind?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: [PATCH] XLogReadRecord returns pointer to currently read page

От
Andrey Lepikhov
Дата:

On 26.10.2018 10:33, Kyotaro HORIGUCHI wrote:
> Hello.
> 
> At Tue, 23 Oct 2018 10:25:27 +0500, Andrey Lepikhov <a.lepikhov@postgrespro.ru> wrote in
<2553f2b0-0e39-eb0e-d382-6c0ed08ca8ae@postgrespro.ru>
>>
>> On 23.10.2018 0:53, Heikki Linnakangas wrote:
>>> I'd expect the decompression to read from the on-disk buffer, and
>>> unpack to readRecordBuf, I still don't see a need to copy the packed
>>> record to readRecordBuf. If there is a need for that, though, the
>>> patch that implements the packing or compression can add the memcpy()
>>> where it needs it.
>>
>> I agree with it. Eventually, placement of the WAL-record can be
>> defined by comparison the record, readBuf and readRecordBuf pointers.
>> In attachment new version of the patch.
> 
> This looks quite clear and what it does is reasonable to me.
> Applies cleanly on top of current master and no regression seen.
> 
> 
> Just one comment. This patch leaves the following code.
> 
>   >        /* Record does not cross a page boundary */
>   >        if (!ValidXLogRecord(state, record, RecPtr))
>   >            goto err;
>   >
>   >        state->EndRecPtr = RecPtr + MAXALIGN(total_len);
> !>
>   >        state->ReadRecPtr = RecPtr;
>   >    }
> 
> The empty line (marked by '!') looks a bit too much standing out
> after this patch. Could you remove the line? Then could you
> transpose the two assignments if you don't mind?

Thanks, see attachment.

-- 
Andrey Lepikhov
Postgres Professional
https://postgrespro.com
The Russian Postgres Company

Вложения

Re: [PATCH] XLogReadRecord returns pointer to currently read page

От
Kyotaro HORIGUCHI
Дата:
At Fri, 26 Oct 2018 11:43:14 +0500, Andrey Lepikhov <a.lepikhov@postgrespro.ru> wrote in
<ea3dca6a-b898-e7a8-dcfa-b3ad227a6349@postgrespro.ru>
> 
> 
> On 26.10.2018 10:33, Kyotaro HORIGUCHI wrote:
> > Hello.
> > At Tue, 23 Oct 2018 10:25:27 +0500, Andrey Lepikhov
> > <a.lepikhov@postgrespro.ru> wrote in
> > <2553f2b0-0e39-eb0e-d382-6c0ed08ca8ae@postgrespro.ru>
> >>
> >> On 23.10.2018 0:53, Heikki Linnakangas wrote:
> >>> I'd expect the decompression to read from the on-disk buffer, and
> >>> unpack to readRecordBuf, I still don't see a need to copy the packed
> >>> record to readRecordBuf. If there is a need for that, though, the
> >>> patch that implements the packing or compression can add the memcpy()
> >>> where it needs it.
> >>
> >> I agree with it. Eventually, placement of the WAL-record can be
> >> defined by comparison the record, readBuf and readRecordBuf pointers.
> >> In attachment new version of the patch.
> > This looks quite clear and what it does is reasonable to me.
> > Applies cleanly on top of current master and no regression seen.
> > Just one comment. This patch leaves the following code.
> >   >        /* Record does not cross a page boundary */
> >   >        if (!ValidXLogRecord(state, record, RecPtr))
> >   >            goto err;
> >   >
> >   >        state->EndRecPtr = RecPtr + MAXALIGN(total_len);
> > !>
> >   >        state->ReadRecPtr = RecPtr;
> >   >    }
> > The empty line (marked by '!') looks a bit too much standing out
> > after this patch. Could you remove the line? Then could you
> > transpose the two assignments if you don't mind?
> 
> Thanks, see attachment.

Thanks.

This patch eliminates unnecessary copying that was done for
non-continued records. Now the return value of XLogReadRecord
directly points into page buffer holded in XLogReaderStats. It is
safe because no caller site uses the returned pointer beyond the
replacement of buffer content at the next call to the same
function.


The code looks fine and correct.
This applied on HEAD@master cleanly.
Passed all regression tests.
Passed all tests under src/test/recovery

I marked this "Ready for Committer".

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: [PATCH] XLogReadRecord returns pointer to currently read page

От
Michael Paquier
Дата:
On Thu, Nov 15, 2018 at 06:12:38PM +0900, Kyotaro HORIGUCHI wrote:
> This patch eliminates unnecessary copying that was done for
> non-continued records. Now the return value of XLogReadRecord
> directly points into page buffer holded in XLogReaderStats. It is
> safe because no caller site uses the returned pointer beyond the
> replacement of buffer content at the next call to the same
> function.

I was looking at this patch, and shouldn't we worry about compatibility
with plugins or utilities which look directly at what's in readRecordBuf
for the record contents?  Let's not forget that the contents of
XLogReaderState are public.
--
Michael

Вложения

Re: [PATCH] XLogReadRecord returns pointer to currently read page

От
Michael Paquier
Дата:
On Fri, Nov 16, 2018 at 03:23:55PM +0900, Michael Paquier wrote:
> I was looking at this patch, and shouldn't we worry about compatibility
> with plugins or utilities which look directly at what's in readRecordBuf
> for the record contents?  Let's not forget that the contents of
> XLogReaderState are public.

And a couple of days later, committed.  I did not notice first that the
comments of xlogreader.h mention that a couple of areas are private, and
readRecordBuf is part of that, so objection withdrawn.

There is a comment in xlog.c (on top of ReadRecord) referring to
readRecordBuf which has not been removed as part of 7fcbf6a4 when WAL
reading has been moved to its own facility.  The original comment was
from 0ffe11a.  So I have removed the comment at the same time.
--
Michael

Вложения

Re: [PATCH] XLogReadRecord returns pointer to currently read page

От
Kyotaro HORIGUCHI
Дата:
Thank you for taking this.

At Mon, 19 Nov 2018 10:27:29 +0900, Michael Paquier <michael@paquier.xyz> wrote in <20181119012728.GA1578@paquier.xyz>
> On Fri, Nov 16, 2018 at 03:23:55PM +0900, Michael Paquier wrote:
> > I was looking at this patch, and shouldn't we worry about compatibility
> > with plugins or utilities which look directly at what's in readRecordBuf
> > for the record contents?  Let's not forget that the contents of
> > XLogReaderState are public.
> 
> And a couple of days later, committed.  I did not notice first that the
> comments of xlogreader.h mention that a couple of areas are private, and
> readRecordBuf is part of that, so objection withdrawn.

It wasn't changed the way the function replaces the content of
the buffer. (Regardless of whether it is private or not, I
believe no one tries to write directly there outside the
function.)  Also the memory pointed by the retuned pointer from
the previous code was regarded as read-only. The only point to
notice was the lifetime of the returned pointer, I suppose.

> There is a comment in xlog.c (on top of ReadRecord) referring to
> readRecordBuf which has not been removed as part of 7fcbf6a4 when WAL
> reading has been moved to its own facility.  The original comment was
> from 0ffe11a.  So I have removed the comment at the same time.

- * The record is copied into readRecordBuf, so that on successful return,
- * the returned record pointer always points there.

Yeah, it is incorrect in some sense, but the comment was
suggesting the lifetime of the pointed record. So I'd like to
have a comment like this instead.

+ * The record is copied into xlogreader, so that on successful return,
+ * the returned record pointer always points there.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: [PATCH] XLogReadRecord returns pointer to currently read page

От
Michael Paquier
Дата:
On Mon, Nov 19, 2018 at 12:28:06PM +0900, Kyotaro HORIGUCHI wrote:
> Yeah, it is incorrect in some sense, but the comment was
> suggesting the lifetime of the pointed record. So I'd like to
> have a comment like this instead.

I think that we can still live without as it is not the business of this
routine to be careful how the lifetime of a record read is handled, but
that's part of the internals of XLogReader.
--
Michael

Вложения

Re: [PATCH] XLogReadRecord returns pointer to currently read page

От
Andrey Lepikhov
Дата:

On 16.11.2018 11:23, Michael Paquier wrote:
> On Thu, Nov 15, 2018 at 06:12:38PM +0900, Kyotaro HORIGUCHI wrote:
>> This patch eliminates unnecessary copying that was done for
>> non-continued records. Now the return value of XLogReadRecord
>> directly points into page buffer holded in XLogReaderStats. It is
>> safe because no caller site uses the returned pointer beyond the
>> replacement of buffer content at the next call to the same
>> function.
> 
> I was looking at this patch, and shouldn't we worry about compatibility
> with plugins or utilities which look directly at what's in readRecordBuf
> for the record contents?  Let's not forget that the contents of
> XLogReaderState are public.

According to my experience, I clarify some comments to avoid this 
mistakes in the future (see attachment).

-- 
Andrey Lepikhov
Postgres Professional
https://postgrespro.com
The Russian Postgres Company

Вложения

Re: [PATCH] XLogReadRecord returns pointer to currently read page

От
Michael Paquier
Дата:
On Mon, Nov 19, 2018 at 10:48:06AM +0500, Andrey Lepikhov wrote:
> According to my experience, I clarify some comments to avoid this mistakes
> in the future (see attachment).

No objections from here.

> - * The returned pointer (or *errormsg) points to an internal buffer that's
> - * valid until the next call to XLogReadRecord.
> + * The returned pointer (or *errormsg) points to an internal read-only buffer
> + * that's valid until the next call to XLogReadRecord.

Not sure that this bit adds much.

> -    /* Buffer for current ReadRecord result (expandable) */
> +    /*
> +     * Buffer for current ReadRecord result (expandable).
> +     * Used in the case, than current ReadRecord result don't fit on the
> +     * currently read WAL page.
> +     */

Yeah, this one is not entirely true now.  What about the following
instead:
-   /* Buffer for current ReadRecord result (expandable) */
+   /*
+    * Buffer for current ReadRecord result (expandable), used when a record
+    * crosses a page boundary.
+    */
--
Michael

Вложения

Re: [PATCH] XLogReadRecord returns pointer to currently read page

От
Andrey Lepikhov
Дата:

On 20.11.2018 6:30, Michael Paquier wrote:
> On Mon, Nov 19, 2018 at 10:48:06AM +0500, Andrey Lepikhov wrote:
>> According to my experience, I clarify some comments to avoid this mistakes
>> in the future (see attachment).
> 
> No objections from here.
> 
>> - * The returned pointer (or *errormsg) points to an internal buffer that's
>> - * valid until the next call to XLogReadRecord.
>> + * The returned pointer (or *errormsg) points to an internal read-only buffer
>> + * that's valid until the next call to XLogReadRecord.
> 
> Not sure that this bit adds much.
Ok
> 
>> -    /* Buffer for current ReadRecord result (expandable) */
>> +    /*
>> +     * Buffer for current ReadRecord result (expandable).
>> +     * Used in the case, than current ReadRecord result don't fit on the
>> +     * currently read WAL page.
>> +     */
> 
> Yeah, this one is not entirely true now.  What about the following
> instead:
> -   /* Buffer for current ReadRecord result (expandable) */
> +   /*
> +    * Buffer for current ReadRecord result (expandable), used when a record
> +    * crosses a page boundary.
> +    */

I agree


-- 
Andrey Lepikhov
Postgres Professional
https://postgrespro.com
The Russian Postgres Company


Re: [PATCH] XLogReadRecord returns pointer to currently read page

От
Michael Paquier
Дата:
On Tue, Nov 20, 2018 at 06:37:36AM +0500, Andrey Lepikhov wrote:
> On 20.11.2018 6:30, Michael Paquier wrote:
>> Yeah, this one is not entirely true now.  What about the following
>> instead:
>> -   /* Buffer for current ReadRecord result (expandable) */
>> +   /*
>> +    * Buffer for current ReadRecord result (expandable), used when a record
>> +    * crosses a page boundary.
>> +    */
>
> I agree

Okay, I have committed this version, after checking that there were no
other spots.
--
Michael

Вложения