Обсуждение: [PATCH] xlogreader: do not read a file block twice

Поиск
Список
Период
Сортировка

[PATCH] xlogreader: do not read a file block twice

От
Arthur Zakirov
Дата:
Hello hackers,

Grigory noticed that one of our utilities has very slow performance when 
xlogreader reads zlib archives. We found out that xlogreader sometimes 
reads a WAL file block twice.

zlib has slow performance when you read an archive not in sequential 
order. I think reading a block twice in same position isn't sequential, 
because gzread() moves current position forward and next call gzseek() 
to the same position moves it back.

It seems that the attached patch solves the issue. I think when reqLen 
== state->readLen the requested block already is in the xlogreader's buffer.

What do you think?

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

Вложения

Re: [PATCH] xlogreader: do not read a file block twice

От
Grigory Smolkin
Дата:
Hm, looks like it could speed up PostgreSQL recovery, but is it safe?


On 02/11/2019 07:25 PM, Arthur Zakirov wrote:
> Hello hackers,
>
> Grigory noticed that one of our utilities has very slow performance 
> when xlogreader reads zlib archives. We found out that xlogreader 
> sometimes reads a WAL file block twice.
>
> zlib has slow performance when you read an archive not in sequential 
> order. I think reading a block twice in same position isn't 
> sequential, because gzread() moves current position forward and next 
> call gzseek() to the same position moves it back.
>
> It seems that the attached patch solves the issue. I think when reqLen 
> == state->readLen the requested block already is in the xlogreader's 
> buffer.
>
> What do you think?
>

-- 
Grigory Smolkin
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [PATCH] xlogreader: do not read a file block twice

От
Michael Paquier
Дата:
On Mon, Feb 11, 2019 at 07:32:59PM +0300, Grigory Smolkin wrote:
> Hm, looks like it could speed up PostgreSQL recovery, but is it
> safe?

(Please avoid top-posting.)

> On 02/11/2019 07:25 PM, Arthur Zakirov wrote:
>> Grigory noticed that one of our utilities has very slow performance when
>> xlogreader reads zlib archives. We found out that xlogreader sometimes
>> reads a WAL file block twice.
>>
>> What do you think?

I think that such things, even if they look simple, need a careful
lookup, and I have not looked at the proposal yet.  Could you add it
to the next commit fest so as we don't lose track of it?
https://commitfest.postgresql.org/22/
--
Michael

Вложения

Re: [PATCH] xlogreader: do not read a file block twice

От
Arthur Zakirov
Дата:
On 12.02.2019 07:23, Michael Paquier wrote:
>> On 02/11/2019 07:25 PM, Arthur Zakirov wrote:
>>> Grigory noticed that one of our utilities has very slow performance when
>>> xlogreader reads zlib archives. We found out that xlogreader sometimes
>>> reads a WAL file block twice.
>>>
>>> What do you think?
> 
> I think that such things, even if they look simple, need a careful
> lookup, and I have not looked at the proposal yet.  Could you add it
> to the next commit fest so as we don't lose track of it?
> https://commitfest.postgresql.org/22/

Of course. Agree, it may be a non trivial case. Added as a bug fix:
https://commitfest.postgresql.org/22/1994/

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


Re: [PATCH] xlogreader: do not read a file block twice

От
Andrey Lepikhov
Дата:

On 11.02.2019 21:25, Arthur Zakirov wrote:
> Hello hackers,
> 
> Grigory noticed that one of our utilities has very slow performance when 
> xlogreader reads zlib archives. We found out that xlogreader sometimes 
> reads a WAL file block twice.
> 
> zlib has slow performance when you read an archive not in sequential 
> order. I think reading a block twice in same position isn't sequential, 
> because gzread() moves current position forward and next call gzseek() 
> to the same position moves it back.
> 
> It seems that the attached patch solves the issue. I think when reqLen 
> == state->readLen the requested block already is in the xlogreader's 
> buffer.
> 
> What do you think?
I looked at the history of the code changes:

---------------------------------------------------------------
7fcbf6a405f (Alvaro Herrera 2013-01-16 16:12:53 -0300 539) 
reqLen < state->readLen)

1bb2558046c (Heikki Linnakangas 2010-01-27 15:27:51 +0000 9349) 
        targetPageOff == readOff && targetRecOff < readLen)

eaef111396e (Tom Lane 2006-04-03 23:35:05 +0000 3842)
len = XLOG_BLCKSZ - RecPtr->xrecoff % XLOG_BLCKSZ;
4d14fe0048c (Tom Lane 2001-03-13 01:17:06 +0000 3843)
if (total_len > len) 
---------------------------------------------------------------

In the original code of Tom Lane, condition (total_len > len) caused a 
page reread from disk. As I understand it, this is equivalent to your 
proposal.
Th code line in commit 1bb2558046c seems tantamount to the corresponding 
line in commit 7fcbf6a405f but have another semantics: the targetPageOff 
value can't be more or equal XLOG_BLCKSZ, but the reqLen value can be. 
It may be a reason of appearance of possible mistake, introduced by 
commit 7fcbf6a405f.

-- 
Andrey Lepikhov
Postgres Professional
https://postgrespro.com
The Russian Postgres Company


Re: [PATCH] xlogreader: do not read a file block twice

От
Arthur Zakirov
Дата:
On 12.02.2019 20:47, Andrey Lepikhov wrote:
> I looked at the history of the code changes:
> 
> ---------------------------------------------------------------
> 7fcbf6a405f (Alvaro Herrera 2013-01-16 16:12:53 -0300 539) reqLen < 
> state->readLen)
> 
> 1bb2558046c (Heikki Linnakangas 2010-01-27 15:27:51 +0000 9349)        
> targetPageOff == readOff && targetRecOff < readLen)
> 
> eaef111396e (Tom Lane 2006-04-03 23:35:05 +0000 3842)
> len = XLOG_BLCKSZ - RecPtr->xrecoff % XLOG_BLCKSZ;
> 4d14fe0048c (Tom Lane 2001-03-13 01:17:06 +0000 3843)
> if (total_len > len) 
> ---------------------------------------------------------------
> 
> In the original code of Tom Lane, condition (total_len > len) caused a 
> page reread from disk. As I understand it, this is equivalent to your 
> proposal.
> Th code line in commit 1bb2558046c seems tantamount to the corresponding 
> line in commit 7fcbf6a405f but have another semantics: the targetPageOff 
> value can't be more or equal XLOG_BLCKSZ, but the reqLen value can be. 
> It may be a reason of appearance of possible mistake, introduced by 
> commit 7fcbf6a405f.

Thank you for your research. Indeed, it makes sense now.

In my case after reading a page both `reqLen` and `state->readLen` equal 
to XLOG_BLCKSZ. This leads to a page reread, since `pageptr` is the same 
as the previous read. But `targetRecOff` is different in the second case 
because we want to read next record, which probably doesn't fit into the 
page wholly (that's why `reqLen` is equal to XLOG_BLCKSZ).

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


Re: [PATCH] xlogreader: do not read a file block twice

От
Michael Paquier
Дата:
On Tue, Feb 12, 2019 at 11:44:14AM +0300, Arthur Zakirov wrote:
> Of course. Agree, it may be a non trivial case. Added as a bug fix:
> https://commitfest.postgresql.org/22/1994/

I have been looking at the patch, and I agree that the current coding
is a bit crazy.  If the wanted data has already been read, it makes
little sense to require reading it again if the size requested by the
caller of ReadPageInternal() exactly equals what has been read
already, and that's what the code is doing.

Now I don't actually agree that this qualifies as a bug fix.  As
things stand, a page may finish by being more than once if what has
been read previously equals what is requested, however this does not
prevent the code to work correctly.  The performance gain is also
heavily dependent on the callback reading a page and the way the WAL
reader is used.  How do you actually read WAL pages in your own
plugin with compressed data?  It begins by reading a full page once,
then it moves on to a per-record read after making sure that the page
has been read?
--
Michael

Вложения

Re: [PATCH] xlogreader: do not read a file block twice

От
Arthur Zakirov
Дата:
On 14.02.2019 09:51, Michael Paquier wrote:
> Now I don't actually agree that this qualifies as a bug fix.  As
> things stand, a page may finish by being more than once if what has
> been read previously equals what is requested, however this does not
> prevent the code to work correctly.  The performance gain is also
> heavily dependent on the callback reading a page and the way the WAL
> reader is used.  How do you actually read WAL pages in your own
> plugin with compressed data?  It begins by reading a full page once,
> then it moves on to a per-record read after making sure that the page
> has been read?

Yes, an application reads WAL pages wholly at a time. It is done within 
SimpleXLogPageRead() (it is a read_page callback passed to 
XLogReaderAllocate()). It returns XLOG_BLCKSZ.

Here is the part of the code, not sure that it will be useful though:

SimpleXLogPageRead(...)
{
     ...
     targetPageOff = targetPagePtr % private_data->xlog_seg_size;
     ...
     if (gzseek(private_data->gz_xlogfile, (z_off_t) targetPageOff,
                SEEK_SET) == -1)
         ...
     if (gzread(private_data->gz_xlogfile, readBuf, XLOG_BLCKSZ) !=
                XLOG_BLCKSZ)
     ...
     return XLOG_BLCKSZ;
}

So we read whole page with size XLOG_BLCKSZ. The full code:
https://github.com/postgrespro/pg_probackup/blob/c052651b8c8864733bcabbc2660c387b792229d8/src/parsexlog.c#L1074

Here is the little optimization I made. Mainly I just add a buffer to 
store previous read page:
https://github.com/postgrespro/pg_probackup/blob/c052651b8c8864733bcabbc2660c387b792229d8/src/parsexlog.c#L1046

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


Re: [PATCH] xlogreader: do not read a file block twice

От
Michael Paquier
Дата:
On Thu, Feb 14, 2019 at 11:20:56AM +0300, Arthur Zakirov wrote:
> So we read whole page with size XLOG_BLCKSZ. The full code:
> https://github.com/postgrespro/pg_probackup/blob/c052651b8c8864733bcabbc2660c387b792229d8/src/parsexlog.c#L1074
>
> Here is the little optimization I made. Mainly I just add a buffer to store
> previous read page:
> https://github.com/postgrespro/pg_probackup/blob/c052651b8c8864733bcabbc2660c387b792229d8/src/parsexlog.c#L1046

Thanks, I see what you have done.  I cannot comment if your shortcut
is actually fully correct based on my knowledge of this code, but
things cannot be in the best conditions without having the WAL reader
handle properly the limits.  So I am planning to commit what you
propose after an extra pass on it in the next couple of days or so.
--
Michael

Вложения

Re: [PATCH] xlogreader: do not read a file block twice

От
Michael Paquier
Дата:
On Fri, Feb 15, 2019 at 08:06:16AM +0900, Michael Paquier wrote:
> Thanks, I see what you have done.  I cannot comment if your shortcut
> is actually fully correct based on my knowledge of this code, but
> things cannot be in the best conditions without having the WAL reader
> handle properly the limits.  So I am planning to commit what you
> propose after an extra pass on it in the next couple of days or so.

And done, after doing and extra pass, doing more testing using by own
plugins, pg_waldump and more fancy stuff with a primary/standby and
pgbench.
--
Michael

Вложения

Re: [PATCH] xlogreader: do not read a file block twice

От
Arthur Zakirov
Дата:
On 18.02.2019 04:09, Michael Paquier wrote:
> And done, after doing and extra pass, doing more testing using by own
> plugins, pg_waldump and more fancy stuff with a primary/standby and
> pgbench.

Thank you Michael.

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company