Обсуждение: Reduce amount of WAL generated by CREATE INDEX for gist, gin andsp-gist
Reduce amount of WAL generated by CREATE INDEX for gist, gin andsp-gist
От
Anastasia Lubennikova
Дата:
Hi, I want to propose a bunch of patches which allow to reduce WAL traffic generated by CREATE INDEX for GiST, GIN and SP-GiST. Similarly to b-tree and RUM, we can now log index pages of other access methods only once in the end of indexbuild process. Implementation is based on generic_xlog. Not only it decreases the amount of WAL generated, but also completely eliminates WAL overhead in case of error during index build. I also attached sql scripts which I used to measure xlog size. They show that pg_wal_lsn_diff for patched version is from 3 to 5 times smaller. Not sure if regression tests are needed, since it is just an optimization. But I do not mind to add them if someone feels that it is necessary. -- Anastasia Lubennikova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Вложения
Hi Anastasia, On 2/28/18 11:03 AM, Anastasia Lubennikova wrote: > I want to propose a bunch of patches which allow to reduce WAL traffic > generated by CREATE INDEX for GiST, GIN and SP-GiST. Similarly to b-tree > and RUM, we can now log index pages of other access methods only once > in the end of indexbuild process. Implementation is based on generic_xlog. > > Not only it decreases the amount of WAL generated, but also completely > eliminates WAL overhead in case of error during index build. This seems to be a worthwhile patch, but it was submitted to the 2018-03 CF at the last moment with no prior discussion or review as far as I can tell. It appears to be non-trivial and therefore not a good fit for the last CF for PG11. I have moved it to the 2018-09 CF. Regards, -- -David david@pgmasters.net
Re: Reduce amount of WAL generated by CREATE INDEX for gist, gin andsp-gist
От
Heikki Linnakangas
Дата:
On 28/02/18 18:03, Anastasia Lubennikova wrote: > Hi, > > I want to propose a bunch of patches which allow to reduce WAL traffic > generated by CREATE INDEX for GiST, GIN and SP-GiST. Similarly to b-tree > and RUM, we can now log index pages of other access methods only once > in the end of indexbuild process. Makes sense. Is there any scenario where the current method is better? In theory, doing another pass through the whole index, to create the WAL records, requires more I/O. But in practice, it seems that the reduction in WAL is almost certainly a win. > Implementation is based on generic_xlog. Why? I think we should just add a log_relation() function in xloginsert.c directly, alongside log_newpage_buffer(). This makes the assumption that all the pages in these indexes used the standard page layout. I think that's a valid assumption, but needs at least a comment on that. And perhaps an Assert, to check that pd_lower/upper look sane. As a further optimization, would it be a win to WAL-log multiple pages in each record? This leaves the XLOG_*_CREATE_INDEX WAL record types unused, BTW. - Heikki
Re: Reduce amount of WAL generated by CREATE INDEX for gist, gin and sp-gist
От
Alexander Korotkov
Дата:
On Tue, Jul 10, 2018 at 10:03 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote: > > On 28/02/18 18:03, Anastasia Lubennikova wrote: > > I want to propose a bunch of patches which allow to reduce WAL traffic > > generated by CREATE INDEX for GiST, GIN and SP-GiST. Similarly to b-tree > > and RUM, we can now log index pages of other access methods only once > > in the end of indexbuild process. > > Makes sense. Is there any scenario where the current method is better? > In theory, doing another pass through the whole index, to create the WAL > records, requires more I/O. But in practice, it seems that the reduction > in WAL is almost certainly a win. It's frequently advised to move WAL to the separate storage device. Then, it's not hard to imagine how current method can be faster, when storage device containing index is much more loaded than storage device containing WAL. But despite being faster it would still consume way more overall I/O resources. I'm not sure if we should provide some option to switch between WAL methods (or heuristics)... ------ Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: Reduce amount of WAL generated by CREATE INDEX for gist, gin andsp-gist
От
Andrey Lepikhov
Дата:
With the consent of Anastasia I will improving this patch further. Attachment contains next version of the patch set. 11.07.2018 00:03, Heikki Linnakangas пишет: > On 28/02/18 18:03, Anastasia Lubennikova wrote: >> Implementation is based on generic_xlog. > > Why? I think we should just add a log_relation() function in > xloginsert.c directly, alongside log_newpage_buffer(). > I have some arguments to stay this functionality at generic_xlog module: 1. xloginsert.c functions work on low level of abstraction, use buffers and pages. 2. Code size using generic_xlog service functions looks more compact and safe. > This makes the assumption that all the pages in these indexes used the > standard page layout. I think that's a valid assumption, but needs at > least a comment on that. And perhaps an Assert, to check that > pd_lower/upper look sane. Done > > As a further optimization, would it be a win to WAL-log multiple pages > in each record? In this version of the patch we use simple optimization: pack XLR_NORMAL_MAX_BLOCK_ID blocks pieces into each WAL-record. > > This leaves the XLOG_*_CREATE_INDEX WAL record types unused, BTW. > Done > - Heikki > Benchmarks: ----------- Test: pgbench -f gin-WAL-test.sql -t 5: --------------------------------------- master: Latency average: 27696.299 ms WAL size: 2.66 GB patched Latency average: 22812.103 ms WAL size: 1.23 GB Test: pgbench -f gist-WAL-test.sql -t 5: ---------------------------------------- master: Latency average: 19928.284 ms WAL size: 1.25 GB patched Latency average: 18175.064 ms WAL size: 0.63 GB Test: pgbench -f spgist-WAL-test.sql -t 5: ------------------------------------------ master: Latency average: 11529.384 ms WAL size: 1.07 GB patched Latency average: 9330.828 ms WAL size: 0.6 GB -- Andrey Lepikhov Postgres Professional https://postgrespro.com The Russian Postgres Company
Вложения
> On Tue, Jul 31, 2018 at 6:36 AM Andrey Lepikhov <a.lepikhov@postgrespro.ru> wrote: > > With the consent of Anastasia I will improving this patch further. > Attachment contains next version of the patch set. Thank you. I played a bit with this patch, and can confirm visible WAL size reduction (it's rather obvious, but still). Although, probably due to lack of my knowledge here, I wonder how does it work when an index is build concurrently? > Benchmarks: > ----------- > > Test: pgbench -f gin-WAL-test.sql -t 5: > --------------------------------------- > master: > Latency average: 27696.299 ms > WAL size: 2.66 GB Does it makes sense to measure latency of the entire script, since it contains also some preparation work? Of course it still shows the difference between patched version and master, but probably in a more noisy way. I'm moving this patch to the next CF.
Re: Reduce amount of WAL generated by CREATE INDEX for gist, gin andsp-gist
От
Andrey Lepikhov
Дата:
On 30.11.2018 15:10, Dmitry Dolgov wrote: > Thank you. I played a bit with this patch, and can confirm visible WAL size > reduction (it's rather obvious, but still). Although, probably due to lack of > my knowledge here, I wonder how does it work when an index is build > concurrently? Routine generate_xlog_for_rel() works only at phase 2 of concurrent index building. At the phase 3, during validate_index() execution we use aminsert() -> PlaceToPage() mechanism as before the patch. In the concurrent mode, I do not expect any problems, caused by the patch. > >> Benchmarks: >> ----------- >> >> Test: pgbench -f gin-WAL-test.sql -t 5: >> --------------------------------------- >> master: >> Latency average: 27696.299 ms >> WAL size: 2.66 GB > > Does it makes sense to measure latency of the entire script, since it contains > also some preparation work? Of course it still shows the difference between > patched version and master, but probably in a more noisy way. Ok. It is used only for demonstration. -- Andrey Lepikhov Postgres Professional https://postgrespro.com The Russian Postgres Company
Re: Reduce amount of WAL generated by CREATE INDEX for gist, gin andsp-gist
От
Michael Paquier
Дата:
On Tue, Dec 18, 2018 at 10:41:48AM +0500, Andrey Lepikhov wrote: > Ok. It is used only for demonstration. The latest patch set needs a rebase, so moved to next CF, waiting on author as this got no reviews. -- Michael
Вложения
Re: Reduce amount of WAL generated by CREATE INDEX for gist, gin andsp-gist
От
Andrey Lepikhov
Дата:
On 04.02.2019 10:04, Michael Paquier wrote: > On Tue, Dec 18, 2018 at 10:41:48AM +0500, Andrey Lepikhov wrote: >> Ok. It is used only for demonstration. > > The latest patch set needs a rebase, so moved to next CF, waiting on > author as this got no reviews. The new version in attachment. > -- > Michael > -- Andrey Lepikhov Postgres Professional https://postgrespro.com The Russian Postgres Company
Вложения
Re: Reduce amount of WAL generated by CREATE INDEX for gist, gin andsp-gist
От
Andrey Lepikhov
Дата:
The patchset had a problem with all-zero pages, has appeared at index build stage: the generic_log_relation() routine sends all pages into the WAL. So lsn field at all-zero page was initialized and the PageIsVerified() routine detects it as a bad page. The solution may be: 1. To improve index build algorithms and eliminate the possibility of not used pages appearing. 2. To mark each page as 'dirty' right after initialization. In this case we will got 'empty' page instead of the all-zeroed. 3. Do not write into the WAL all-zero pages. In the patchset (see attachment) I used approach No.3. On 04.02.2019 10:04, Michael Paquier wrote: > On Tue, Dec 18, 2018 at 10:41:48AM +0500, Andrey Lepikhov wrote: >> Ok. It is used only for demonstration. > > The latest patch set needs a rebase, so moved to next CF, waiting on > author as this got no reviews. > -- > Michael > -- Andrey Lepikhov Postgres Professional https://postgrespro.com The Russian Postgres Company
Вложения
Re: Re: Reduce amount of WAL generated by CREATE INDEX for gist, ginand sp-gist
От
David Steele
Дата:
On 2/6/19 2:08 PM, Andrey Lepikhov wrote: > The patchset had a problem with all-zero pages, has appeared at index > build stage: the generic_log_relation() routine sends all pages into the > WAL. So lsn field at all-zero page was initialized and the > PageIsVerified() routine detects it as a bad page. > The solution may be: > 1. To improve index build algorithms and eliminate the possibility of > not used pages appearing. > 2. To mark each page as 'dirty' right after initialization. In this case > we will got 'empty' page instead of the all-zeroed. > 3. Do not write into the WAL all-zero pages. > > In the patchset (see attachment) I used approach No.3. > > On 04.02.2019 10:04, Michael Paquier wrote: >> On Tue, Dec 18, 2018 at 10:41:48AM +0500, Andrey Lepikhov wrote: >>> Ok. It is used only for demonstration. >> >> The latest patch set needs a rebase, so moved to next CF, waiting on >> author as this got no reviews. The patch no longer applies so marked Waiting on Author. Alexander, Heikki, are either of you planning to review the patch in this CF? Regards, -- -David david@pgmasters.net
Re: Reduce amount of WAL generated by CREATE INDEX for gist, gin andsp-gist
От
Heikki Linnakangas
Дата:
On 25/03/2019 09:57, David Steele wrote: > On 2/6/19 2:08 PM, Andrey Lepikhov wrote: >> The patchset had a problem with all-zero pages, has appeared at index >> build stage: the generic_log_relation() routine sends all pages into the >> WAL. So lsn field at all-zero page was initialized and the >> PageIsVerified() routine detects it as a bad page. >> The solution may be: >> 1. To improve index build algorithms and eliminate the possibility of >> not used pages appearing. >> 2. To mark each page as 'dirty' right after initialization. In this case >> we will got 'empty' page instead of the all-zeroed. >> 3. Do not write into the WAL all-zero pages. Hmm. When do we create all-zero pages during index build? That seems pretty surprising. >> On 04.02.2019 10:04, Michael Paquier wrote: >>> On Tue, Dec 18, 2018 at 10:41:48AM +0500, Andrey Lepikhov wrote: >>>> Ok. It is used only for demonstration. >>> >>> The latest patch set needs a rebase, so moved to next CF, waiting on >>> author as this got no reviews. > > The patch no longer applies so marked Waiting on Author. > > Alexander, Heikki, are either of you planning to review the patch in > this CF? I had another quick look. I still think using the "generic xlog AM" for this is a wrong level of abstraction, and we should use the XLOG_FPI records for this directly. We can extend XLOG_FPI so that it can store multiple pages in a single record, if it doesn't already handle it. Another counter-point to using the generic xlog record is that you're currently doing unnecessary two memcpy's of all pages in the index, in GenericXLogRegisterBuffer() and GenericXLogFinish(). That's not free. I guess the generic_log_relation() function can stay where it is, but it should use XLogRegisterBuffer() and XLogInsert() directly. - Heikki
Re: Reduce amount of WAL generated by CREATE INDEX for gist, gin andsp-gist
От
Andrey Lepikhov
Дата:
On 25/03/2019 15:21, Heikki Linnakangas wrote: > On 25/03/2019 09:57, David Steele wrote: >> On 2/6/19 2:08 PM, Andrey Lepikhov wrote: >>> The patchset had a problem with all-zero pages, has appeared at index >>> build stage: the generic_log_relation() routine sends all pages into the >>> WAL. So lsn field at all-zero page was initialized and the >>> PageIsVerified() routine detects it as a bad page. >>> The solution may be: >>> 1. To improve index build algorithms and eliminate the possibility of >>> not used pages appearing. >>> 2. To mark each page as 'dirty' right after initialization. In this case >>> we will got 'empty' page instead of the all-zeroed. >>> 3. Do not write into the WAL all-zero pages. > > Hmm. When do we create all-zero pages during index build? That seems > pretty surprising. GIST uses buffered pages. During GIST build it is possible (very rarely) what no one index tuple was written to the block page before new block was allocated. And the page has become an all-zero page. You can't have problems in the current GIST code, because it writes into the WAL only changed pages. But the idea of the patch is traversing blocks of index relation one-by-one after end of index building process and write this blocks to the WAL. In this case we will see all-zeroed pages and need to check it. > >>> On 04.02.2019 10:04, Michael Paquier wrote: >>>> On Tue, Dec 18, 2018 at 10:41:48AM +0500, Andrey Lepikhov wrote: >>>>> Ok. It is used only for demonstration. >>>> >>>> The latest patch set needs a rebase, so moved to next CF, waiting on >>>> author as this got no reviews. >> >> The patch no longer applies so marked Waiting on Author. >> >> Alexander, Heikki, are either of you planning to review the patch in >> this CF? > > I had another quick look. > > I still think using the "generic xlog AM" for this is a wrong level of > abstraction, and we should use the XLOG_FPI records for this directly. > We can extend XLOG_FPI so that it can store multiple pages in a single > record, if it doesn't already handle it. > > Another counter-point to using the generic xlog record is that you're > currently doing unnecessary two memcpy's of all pages in the index, in > GenericXLogRegisterBuffer() and GenericXLogFinish(). That's not free. > > I guess the generic_log_relation() function can stay where it is, but it > should use XLogRegisterBuffer() and XLogInsert() directly. Ok. This patch waited feedback for a long time. I will check the GIST code changes from previous review and will try to use your advice. > > - Heikki -- Andrey Lepikhov Postgres Professional https://postgrespro.com The Russian Postgres Company
Re: Reduce amount of WAL generated by CREATE INDEX for gist, gin andsp-gist
От
Heikki Linnakangas
Дата:
On 26/03/2019 11:29, Andrey Lepikhov wrote: > On 25/03/2019 15:21, Heikki Linnakangas wrote: >> Hmm. When do we create all-zero pages during index build? That seems >> pretty surprising. > > GIST uses buffered pages. During GIST build it is possible (very rarely) > what no one index tuple was written to the block page before new block > was allocated. And the page has become an all-zero page. > You can't have problems in the current GIST code, because it writes into > the WAL only changed pages. Looking at the code, I don't see how that could happen. The only place where the GiST index file is extended is in gistNewBuffer(), and all callers of that initialize the page immediately after the call. What am I missing? - Heikki
Re: Reduce amount of WAL generated by CREATE INDEX for gist, gin andsp-gist
От
Andrey Lepikhov
Дата:
On 26/03/2019 15:59, Heikki Linnakangas wrote: > On 26/03/2019 11:29, Andrey Lepikhov wrote: >> On 25/03/2019 15:21, Heikki Linnakangas wrote: >>> Hmm. When do we create all-zero pages during index build? That seems >>> pretty surprising. >> >> GIST uses buffered pages. During GIST build it is possible (very rarely) >> what no one index tuple was written to the block page before new block >> was allocated. And the page has become an all-zero page. >> You can't have problems in the current GIST code, because it writes into >> the WAL only changed pages. > > Looking at the code, I don't see how that could happen. The only place > where the GiST index file is extended is in gistNewBuffer(), and all > callers of that initialize the page immediately after the call. What am > I missing? Sorry, This issue was found in SP-GiST AM. You can show it: 1. Apply v2 version of the patch set (see attachment). 2. In the generic_log_relation() routine set logging on PageIsNew(buf) 3. Run script t1.sql (in attachment). This problem can be resolved by calling MarkBufferDirty() after SpGistInitBuffer() in the allocNewBuffer() routine. But in this case we will write to the WAL more pages than necessary. To avoid it in the patch '0001-Relation-into-WAL-function' I do not write new pages to the WAL. Attached patch set is not final version. It is needed for demonstration of 'all-zero pages' issue only. The sentence for the direct use of XLOG_FPI records will be considered in v3. -- Andrey Lepikhov Postgres Professional https://postgrespro.com The Russian Postgres Company
Вложения
Re: Reduce amount of WAL generated by CREATE INDEX for gist, gin andsp-gist
От
Andrey Lepikhov
Дата:
On 25/03/2019 15:21, Heikki Linnakangas wrote: > I had another quick look. > > I still think using the "generic xlog AM" for this is a wrong level of > abstraction, and we should use the XLOG_FPI records for this directly. > We can extend XLOG_FPI so that it can store multiple pages in a single > record, if it doesn't already handle it. > > Another counter-point to using the generic xlog record is that you're > currently doing unnecessary two memcpy's of all pages in the index, in > GenericXLogRegisterBuffer() and GenericXLogFinish(). That's not free. > > I guess the generic_log_relation() function can stay where it is, but it > should use XLogRegisterBuffer() and XLogInsert() directly. Patch set v.3 uses XLOG_FPI records directly. As a benchmark I use the script (test.sql in attachment) which show WAL size increment during index build. In the table below you can see the influence of the patch on WAL growth. Results ======= AM | master | patch | GIN | 347 MB | 66 MB | GiST | 157 MB | 43 MB | SP-GiST | 119 MB | 38 MB | -- Andrey Lepikhov Postgres Professional https://postgrespro.com The Russian Postgres Company
Вложения
Re: Reduce amount of WAL generated by CREATE INDEX for gist, gin andsp-gist
От
Heikki Linnakangas
Дата:
On 02/04/2019 08:58, Andrey Lepikhov wrote: > On 25/03/2019 15:21, Heikki Linnakangas wrote: >> I had another quick look. >> >> I still think using the "generic xlog AM" for this is a wrong level of >> abstraction, and we should use the XLOG_FPI records for this directly. >> We can extend XLOG_FPI so that it can store multiple pages in a single >> record, if it doesn't already handle it. >> >> Another counter-point to using the generic xlog record is that you're >> currently doing unnecessary two memcpy's of all pages in the index, in >> GenericXLogRegisterBuffer() and GenericXLogFinish(). That's not free. >> >> I guess the generic_log_relation() function can stay where it is, but it >> should use XLogRegisterBuffer() and XLogInsert() directly. > > Patch set v.3 uses XLOG_FPI records directly. Thanks! Committed, with some changes: * I moved the log_relation() function to xlog.c, so that it sits beside log_newpage_*() functions. I renamed it to log_newpage_range(), and changed the argument so that the caller provides the beginning and end block ranges. I added a 'page_std' flag, instead of just assuming that all pages use the standard page layout. All of the callers pass page_std=true at the moment, but seems better to be explicit about it. I made those changes because I felt that the function was too narrowly tailored for the current callers. The assumption about standard page layout, and also the fact that it only logged the main fork. It's more flexible now, for any future AMs that might not be exactly like that. It feels like it's at the same level of abstraction now as the other log_newpage_*() functions. Even if we never need the flexibility, I think making the 'page_std' and 'forknum' arguments explicit is good, to draw attention to those details, for anyone calling the function. * I fixed the REDO code. It was trivially broken, it only restored the first page in each FPI WAL record. * Using "fake" unlogged LSNs for GiST index build seemed fishy. I could not convince myself that it was safe in all corner cases. In a recently initdb'd cluster, it's theoretically possible that the fake LSN counter overtakes the real LSN value, and that could lead to strange behavior. For example, how would the buffer manager behave, if there was a dirty page in the buffer cache with an LSN value that's greater than the current WAL flush pointer? I think you'd get "ERROR: xlog flush request %X/%X is not satisfied --- flushed only to %X/%X". I changed that so that we use LSN 1 for all pages during index build. That's smaller than any real or fake LSN. Or actually, the fake LSN counter used to start at 1 - I bumped that up to 1000, so now it's safely smaller. Could've used 0 instead, but there's an assertion in gist scan code that didn't like that. > As a benchmark I use the script (test.sql in attachment) which show WAL > size increment during index build. In the table below you can see the > influence of the patch on WAL growth. > > Results > ======= > AM | master | patch | > GIN | 347 MB | 66 MB | > GiST | 157 MB | 43 MB | > SP-GiST | 119 MB | 38 MB | Nice! - Heikki