Обсуждение: pgsql: Skip full index scan during cleanup of B-tree indexes whenpossi
Skip full index scan during cleanup of B-tree indexes when possible Vacuum of index consists from two stages: multiple (zero of more) ambulkdelete calls and one amvacuumcleanup call. When workload on particular table is append-only, then autovacuum isn't intended to touch this table. However, user may run vacuum manually in order to fill visibility map and get benefits of index-only scans. Then ambulkdelete wouldn't be called for indexes of such table (because no heap tuples were deleted), only amvacuumcleanup would be called In this case, amvacuumcleanup would perform full index scan for two objectives: put recyclable pages into free space map and update index statistics. This patch allows btvacuumclanup to skip full index scan when two conditions are satisfied: no pages are going to be put into free space map and index statistics isn't stalled. In order to check first condition, we store oldest btpo_xact in the meta-page. When it's precedes RecentGlobalXmin, then there are some recyclable pages. In order to check second condition we store number of heap tuples observed during previous full index scan by cleanup. If fraction of newly inserted tuples is less than vacuum_cleanup_index_scale_factor, then statistics isn't considered to be stalled. vacuum_cleanup_index_scale_factor can be defined as both reloption and GUC (default). This patch bumps B-tree meta-page version. Upgrade of meta-page is performed "on the fly": during VACUUM meta-page is rewritten with new version. No special handling in pg_upgrade is required. Author: Masahiko Sawada, Alexander Korotkov Review by: Peter Geoghegan, Kyotaro Horiguchi, Alexander Korotkov, Yura Sokolov Discussion: https://www.postgresql.org/message-id/flat/CAD21AoAX+d2oD_nrd9O2YkpzHaFr=uQeGr9s1rKC3O4ENc568g@mail.gmail.com Branch ------ master Details ------- https://git.postgresql.org/pg/commitdiff/857f9c36cda520030381bd8c2af20adf0ce0e1d4 Modified Files -------------- contrib/amcheck/verify_nbtree.c | 8 +- contrib/pageinspect/Makefile | 3 +- contrib/pageinspect/btreefuncs.c | 4 +- contrib/pageinspect/expected/btree.out | 16 +-- contrib/pageinspect/pageinspect--1.6--1.7.sql | 26 +++++ contrib/pageinspect/pageinspect.control | 2 +- contrib/pgstattuple/expected/pgstattuple.out | 10 +- doc/src/sgml/config.sgml | 25 +++++ doc/src/sgml/pageinspect.sgml | 16 +-- doc/src/sgml/ref/create_index.sgml | 15 +++ src/backend/access/common/reloptions.c | 13 ++- src/backend/access/nbtree/nbtinsert.c | 12 +++ src/backend/access/nbtree/nbtpage.c | 150 ++++++++++++++++++++++++-- src/backend/access/nbtree/nbtree.c | 118 ++++++++++++++++++-- src/backend/access/nbtree/nbtxlog.c | 6 +- src/backend/utils/init/globals.c | 2 + src/backend/utils/misc/guc.c | 10 ++ src/include/access/nbtree.h | 11 +- src/include/access/nbtxlog.h | 4 + src/include/miscadmin.h | 2 + src/include/utils/rel.h | 2 + src/test/regress/expected/btree_index.out | 29 +++++ src/test/regress/sql/btree_index.sql | 19 ++++ 23 files changed, 458 insertions(+), 45 deletions(-)
Re: pgsql: Skip full index scan during cleanup of B-tree indexes when possi
От
 
		    	Alexander Korotkov
		    Дата:
		        Hi!
			
		On Wed, Apr 4, 2018 at 7:29 PM, Teodor Sigaev <teodor@sigaev.ru> wrote:
Skip full index scan during cleanup of B-tree indexes when possible
Thank you for committing this.
It appears that patch contains some redundant variabled.  See warnings produced
by gcc-7.
nbtpage.c: In function '_bt_update_meta_cleanup_info'
nbtpage.c:121:15: warning: variable 'metaopaque' set but not used [-Wunused-but-set-variable]
  BTPageOpaque metaopaque;
               ^~~~~~~~~~
nbtree.c: In function '_bt_vacuum_needs_cleanup':
nbtree.c:790:15: warning: variable 'metaopaque' set but not used [-Wunused-but-set-variable]
  BTPageOpaque metaopaque;
               ^~~~~~~~~~
Attached patch fixes this.
------
Alexander Korotkov
Postgres Professional: http://www.
The Russian Postgres Company
Alexander Korotkov
Postgres Professional: http://www.
The Russian Postgres Company
Вложения
Re: pgsql: Skip full index scan during cleanup of B-tree indexes when possi
От
 
		    	Peter Geoghegan
		    Дата:
		        On Wed, Apr 4, 2018 at 3:32 PM, Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> Hi!
>
> On Wed, Apr 4, 2018 at 7:29 PM, Teodor Sigaev <teodor@sigaev.ru> wrote:
>>
>> Skip full index scan during cleanup of B-tree indexes when possible
>
>
> Thank you for committing this.
>
> It appears that patch contains some redundant variabled.  See warnings
> produced
> by gcc-7.
I also see an assertion failure within _bt_getrootheight():
TRAP: FailedAssertion("!(metad->btm_version == 3)", File:
"/home/pg/postgresql/root/build/../source/src/backend/access/nbtree/nbtpage.c",
Line: 619)
-- 
Peter Geoghegan
			
		Peter Geoghegan <pg@bowt.ie> writes:
> I also see an assertion failure within _bt_getrootheight():
> TRAP: FailedAssertion("!(metad->btm_version == 3)", File:
> "/home/pg/postgresql/root/build/../source/src/backend/access/nbtree/nbtpage.c",
> Line: 619)
Hm, buildfarm's not complaining --- what's the test case?
            regards, tom lane
			
		On Wed, Apr 04, 2018 at 08:58:14PM -0400, Tom Lane wrote:
> Peter Geoghegan <pg@bowt.ie> writes:
> > I also see an assertion failure within _bt_getrootheight():
>
> > TRAP: FailedAssertion("!(metad->btm_version == 3)", File:
> > "/home/pg/postgresql/root/build/../source/src/backend/access/nbtree/nbtpage.c",
> > Line: 619)
>
> Hm, buildfarm's not complaining --- what's the test case?
Hm.  No problems here either with a56e267 and gcc 7.3.  The warnings are
here for sure, and any compiler would complain about those.
--
Michael
			
		Вложения
Re: pgsql: Skip full index scan during cleanup of B-tree indexes when possi
От
 
		    	Peter Geoghegan
		    Дата:
		        On Wed, Apr 4, 2018 at 5:58 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> TRAP: FailedAssertion("!(metad->btm_version == 3)", File:
>> "/home/pg/postgresql/root/build/../source/src/backend/access/nbtree/nbtpage.c",
>> Line: 619)
>
> Hm, buildfarm's not complaining --- what's the test case?
This was discovered while testing/reviewing the latest version of the
INCLUDE covering indexes patch. It now seems to be unrelated.
Sorry for the noise.
-- 
Peter Geoghegan
			
		Peter Geoghegan <pg@bowt.ie> writes:
>>> TRAP: FailedAssertion("!(metad->btm_version == 3)", File:
>>> "/home/pg/postgresql/root/build/../source/src/backend/access/nbtree/nbtpage.c",
>>> Line: 619)
>> Hm, buildfarm's not complaining --- what's the test case?
> This was discovered while testing/reviewing the latest version of the
> INCLUDE covering indexes patch. It now seems to be unrelated.
Oh, wait ... I wonder if you saw that because you were running a new
backend without having re-initdb'd?  Once you had re-initdb'd, then
of course there would be no old-format btree indexes anywhere.  But
if you hadn't, then anyplace that was not prepared to cope with the
old header format would complain about pre-existing indexes.
In short, this sounds like a place that did not get the memo about
how to cope with un-upgraded indexes.
            regards, tom lane
			
		Re: pgsql: Skip full index scan during cleanup of B-tree indexes when possi
От
 
		    	Peter Geoghegan
		    Дата:
		        On Wed, Apr 4, 2018 at 8:28 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> This was discovered while testing/reviewing the latest version of the >> INCLUDE covering indexes patch. It now seems to be unrelated. > > Oh, wait ... I wonder if you saw that because you were running a new > backend without having re-initdb'd? Yes. That's what happened. > Once you had re-initdb'd, then > of course there would be no old-format btree indexes anywhere. But > if you hadn't, then anyplace that was not prepared to cope with the > old header format would complain about pre-existing indexes. > > In short, this sounds like a place that did not get the memo about > how to cope with un-upgraded indexes. Sounds plausible. -- Peter Geoghegan
Re: pgsql: Skip full index scan during cleanup of B-tree indexes when possi
От
 
		    	Alexander Korotkov
		    Дата:
		        On Thu, Apr 5, 2018 at 1:32 AM, Alexander Korotkov <a.korotkov@postgrespro.ru> wrote:
On Wed, Apr 4, 2018 at 7:29 PM, Teodor Sigaev <teodor@sigaev.ru> wrote:Skip full index scan during cleanup of B-tree indexes when possibleThank you for committing this.It appears that patch contains some redundant variabled. See warnings producedby gcc-7.nbtpage.c: In function '_bt_update_meta_cleanup_info': nbtpage.c:121:15: warning: variable 'metaopaque' set but not used [-Wunused-but-set-variable]BTPageOpaque metaopaque;^~~~~~~~~~nbtree.c: In function '_bt_vacuum_needs_cleanup':nbtree.c:790:15: warning: variable 'metaopaque' set but not used [-Wunused-but-set-variable]BTPageOpaque metaopaque;^~~~~~~~~~Attached patch fixes this.
Teodor already committed [1] better patch [2] from Kyotaro Horiguchi.  This question is closed.
2. https://www.postgresql.org/
------
Alexander Korotkov
Postgres Professional: http://www.postg
The Russian Postgres Company
Alexander Korotkov
Postgres Professional: http://www.postg
The Russian Postgres Company
Re: pgsql: Skip full index scan during cleanup of B-tree indexes when possi
От
 
		    	Alexander Korotkov
		    Дата:
		        On Thu, Apr 5, 2018 at 6:28 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Peter Geoghegan <pg@bowt.ie> writes:
>>> TRAP: FailedAssertion("!(metad->btm_version == 3)", File: 
>>> "/home/pg/postgresql/root/build/../source/src/backend/ access/nbtree/nbtpage.c", 
>>> Line: 619)
>> Hm, buildfarm's not complaining --- what's the test case?
> This was discovered while testing/reviewing the latest version of the
> INCLUDE covering indexes patch. It now seems to be unrelated.
Oh, wait ... I wonder if you saw that because you were running a new
backend without having re-initdb'd? Once you had re-initdb'd, then
of course there would be no old-format btree indexes anywhere. But
if you hadn't, then anyplace that was not prepared to cope with the
old header format would complain about pre-existing indexes.
In short, this sounds like a place that did not get the memo about
how to cope with un-upgraded indexes.
That's an issue, because meta-page should be upgraded "on the fly".
That was tested, but perhaps without assertions.  I'll investigate more on
this an propose a fix.
------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: pgsql: Skip full index scan during cleanup of B-tree indexes when possi
От
 
		    	Alexander Korotkov
		    Дата:
		        On Thu, Apr 5, 2018 at 2:26 PM, Alexander Korotkov <a.korotkov@postgrespro.ru> wrote:
On Thu, Apr 5, 2018 at 6:28 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:Peter Geoghegan <pg@bowt.ie> writes:
>>> TRAP: FailedAssertion("!(metad->btm_version == 3)", File: 
>>> "/home/pg/postgresql/root/build/../source/src/backend/access /nbtree/nbtpage.c", 
>>> Line: 619)
>> Hm, buildfarm's not complaining --- what's the test case?
> This was discovered while testing/reviewing the latest version of the
> INCLUDE covering indexes patch. It now seems to be unrelated.
Oh, wait ... I wonder if you saw that because you were running a new
backend without having re-initdb'd? Once you had re-initdb'd, then
of course there would be no old-format btree indexes anywhere. But
if you hadn't, then anyplace that was not prepared to cope with the
old header format would complain about pre-existing indexes.
In short, this sounds like a place that did not get the memo about
how to cope with un-upgraded indexes.That's an issue, because meta-page should be upgraded "on the fly".That was tested, but perhaps without assertions. I'll investigate more onthis an propose a fix.
So, "Assert(metad->btm_version == BTREE_VERSION)" was failed, because
metadata was copied from meta page to rel->rd_amcache "as is" with
metad->btm_version == BTREE_MIN_VERSION.
Without assert everything works fine, because extended metadata fields are
never used from rel->rd_amcache.  So my first intention was to relax this
assert to Assert(metad->btm_version >= BTREE_MIN_VERSION && metad->btm_version <= BTREE_VERSION).
But then I still have concern that we copy metadata beyond pd_lower
when metapage is in old format.
memcpy(rel->rd_amcache, metad, sizeof(BTMetaPageData));
This is why I decided to write separate function to handle caching of metadata,
which takes care about filling unavailable fields of metadata with default values.
I also made same fix for pageinspect.  Patch is attached.
------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Вложения
Re: pgsql: Skip full index scan during cleanup of B-tree indexes when possi
От
 
		    	Alexander Korotkov
		    Дата:
		        On Thu, Apr 5, 2018 at 3:24 PM, Alexander Korotkov <a.korotkov@postgrespro.ru> wrote:
On Thu, Apr 5, 2018 at 2:26 PM, Alexander Korotkov <a.korotkov@postgrespro.ru> wrote:On Thu, Apr 5, 2018 at 6:28 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:Peter Geoghegan <pg@bowt.ie> writes:
>>> TRAP: FailedAssertion("!(metad->btm_version == 3)", File: 
>>> "/home/pg/postgresql/root/build/../source/src/backend/access /nbtree/nbtpage.c", 
>>> Line: 619)
>> Hm, buildfarm's not complaining --- what's the test case?
> This was discovered while testing/reviewing the latest version of the
> INCLUDE covering indexes patch. It now seems to be unrelated.
Oh, wait ... I wonder if you saw that because you were running a new
backend without having re-initdb'd? Once you had re-initdb'd, then
of course there would be no old-format btree indexes anywhere. But
if you hadn't, then anyplace that was not prepared to cope with the
old header format would complain about pre-existing indexes.
In short, this sounds like a place that did not get the memo about
how to cope with un-upgraded indexes.That's an issue, because meta-page should be upgraded "on the fly".That was tested, but perhaps without assertions. I'll investigate more onthis an propose a fix.So, "Assert(metad->btm_version == BTREE_VERSION)" was failed, becausemetadata was copied from meta page to rel->rd_amcache "as is" withmetad->btm_version == BTREE_MIN_VERSION.Without assert everything works fine, because extended metadata fields arenever used from rel->rd_amcache. So my first intention was to relax thisassert to Assert(metad->btm_version >= BTREE_MIN_VERSION && metad->btm_version <= BTREE_VERSION).But then I still have concern that we copy metadata beyond pd_lowerwhen metapage is in old format.memcpy(rel->rd_amcache, metad, sizeof(BTMetaPageData));This is why I decided to write separate function to handle caching of metadata,which takes care about filling unavailable fields of metadata with default values.I also made same fix for pageinspect. Patch is attached.
I forgot to note, that I've tested this patch in following way.  I did initdb and
created some tables and indexes on eac93e20af.  And then used the
same cluster on patched master including index scans, some inserts/updates/deletes,
bt_metap(), vacuum and so on.  Everything worked fine.
Sorry, that I didn't test that enough before.
------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On Thu, Apr 5, 2018 at 1:29 AM, Teodor Sigaev <teodor@sigaev.ru> wrote: > Skip full index scan during cleanup of B-tree indexes when possible This commit added XLOG_BTREE_META_CLEANUP, so btree_desc() and btree_identify() should be updated so that they handle XLOG_BTREE_META_CLEANUP. But ISTM that you forgot doing that. Regards, -- Fujii Masao
Re: pgsql: Skip full index scan during cleanup of B-tree indexes when possi
От
 
		    	Peter Geoghegan
		    Дата:
		        On Wed, Apr 18, 2018 at 11:12 AM, Fujii Masao <masao.fujii@gmail.com> wrote: > On Thu, Apr 5, 2018 at 1:29 AM, Teodor Sigaev <teodor@sigaev.ru> wrote: >> Skip full index scan during cleanup of B-tree indexes when possible > > This commit added XLOG_BTREE_META_CLEANUP, so btree_desc() and > btree_identify() should be updated so that they handle XLOG_BTREE_META_CLEANUP. > But ISTM that you forgot doing that. Another thing that I noticed is that the metapage stores btm_last_cleanup_num_heap_tuples as a float4, even though xl_btree_metadata stores it as a double. I suggest that both places store it as float8, to be consistent. (It should not be double because we always avoid using anything other types with explicit typedef'd widths in WAL records.) -- Peter Geoghegan
Re: pgsql: Skip full index scan during cleanup of B-tree indexes when possi
От
 
		    	Alexander Korotkov
		    Дата:
		        On Wed, Apr 18, 2018 at 9:21 PM, Peter Geoghegan <pg@bowt.ie> wrote:
On Wed, Apr 18, 2018 at 11:12 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Thu, Apr 5, 2018 at 1:29 AM, Teodor Sigaev <teodor@sigaev.ru> wrote:
>> Skip full index scan during cleanup of B-tree indexes when possible
>
> This commit added XLOG_BTREE_META_CLEANUP, so btree_desc() and
> btree_identify() should be updated so that they handle XLOG_BTREE_META_CLEANUP.
> But ISTM that you forgot doing that.
Thank you for notice.  See attached bt-vacuum-cleanup-wal-record-desc-identify.patch fixing that.
Another thing that I noticed is that the metapage stores
btm_last_cleanup_num_heap_tuples as a float4, even though 
xl_btree_metadata stores it as a double. I suggest that both places
store it as float8, to be consistent. (It should not be double because
we always avoid using anything other types with explicit typedef'd
widths in WAL records.)
Good catch, thank you!  I also agree that both these fields should be of float8 type.
Please, find attached bt-vacuum-cleanup-float8-num-heap-tuples.patch fixing that.
------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Вложения
>     Another thing that I noticed is that the metapage stores
>     btm_last_cleanup_num_heap_tuples as a float4, even though
>     xl_btree_metadata stores it as a double. I suggest that both places
>     store it as float8, to be consistent. (It should not be double because
>     we always avoid using anything other types with explicit typedef'd
>     widths in WAL records.)
> 
> 
> Good catch, thank you!  I also agree that both these fields should be of 
> float8 type.
> Please, find attached bt-vacuum-cleanup-float8-num-heap-tuples.patch 
> fixing that.
Hm. patch changes on-disk representation of btree meta page. And there 
is no support to upgrade meta page since 857f9c36 commit (and should 
not, because that versions wasn't relesead), to prevent possible false 
positive bug reports I suggest to bump catversion. Objections?
-- 
Teodor Sigaev                      E-mail: teodor@sigaev.ru
                                       WWW: http://www.sigaev.ru/
			
		Re: pgsql: Skip full index scan during cleanup of B-tree indexes when possi
От
 
		    	Alexander Korotkov
		    Дата:
		        On Thu, Apr 19, 2018 at 9:19 AM, Teodor Sigaev <teodor@sigaev.ru> wrote:
Another thing that I noticed is that the metapage stores
btm_last_cleanup_num_heap_tuples as a float4, even though 
xl_btree_metadata stores it as a double. I suggest that both places
store it as float8, to be consistent. (It should not be double because
we always avoid using anything other types with explicit typedef'd
widths in WAL records.)
Good catch, thank you! I also agree that both these fields should be of float8 type.
Please, find attached bt-vacuum-cleanup-float8-num-heap-tuples.patch fixing that. 
Hm. patch changes on-disk representation of btree meta page. And there is no support to upgrade meta page since 857f9c36 commit (and should not, because that versions wasn't relesead), to prevent possible false positive bug reports I suggest to bump catversion. Objections?
I agree, catversion should be bumped for this patch.
------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Thank you, both patches are pushed. Alexander Korotkov wrote: > On Wed, Apr 18, 2018 at 9:21 PM, Peter Geoghegan <pg@bowt.ie > <mailto:pg@bowt.ie>> wrote: > > On Wed, Apr 18, 2018 at 11:12 AM, Fujii Masao <masao.fujii@gmail.com > <mailto:masao.fujii@gmail.com>> wrote: > > On Thu, Apr 5, 2018 at 1:29 AM, Teodor Sigaev <teodor@sigaev.ru <mailto:teodor@sigaev.ru>> wrote: > >> Skip full index scan during cleanup of B-tree indexes when possible > > > > This commit added XLOG_BTREE_META_CLEANUP, so btree_desc() and > > btree_identify() should be updated so that they handle XLOG_BTREE_META_CLEANUP. > > But ISTM that you forgot doing that. > > Thank you for notice. See attached > bt-vacuum-cleanup-wal-record-desc-identify.patch fixing that. > > Another thing that I noticed is that the metapage stores > btm_last_cleanup_num_heap_tuples as a float4, even though > xl_btree_metadata stores it as a double. I suggest that both places > store it as float8, to be consistent. (It should not be double because > we always avoid using anything other types with explicit typedef'd > widths in WAL records.) > > > Good catch, thank you! I also agree that both these fields should be of float8 > type. > Please, find attached bt-vacuum-cleanup-float8-num-heap-tuples.patch fixing that. > > ------ > Alexander Korotkov > Postgres Professional:http://www.postgrespro.com <http://www.postgrespro.com/> > The Russian Postgres Company > -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/