Thread: Fix gistkillitems & add regression test to microvacuum

Fix gistkillitems & add regression test to microvacuum

From: Kirill Reshke
Hi hackers.

While looking at [0] I noticed that the XLOG_GIST_DELETE and
XLOG_GIST_PAGE_DELETE records are not covered by tests.

This thread addresses XLOG_GIST_DELETE, which belongs to the feature
also known as microvacuum.

test.sql contains a regression test that causes this code to be
exercised in the stream_regress.pl TAP test.

The test works as follows: we create a GiST index on the table, then
insert exactly 407 records, making the root page full (the next insert
will trigger a page split). Then I delete all tuples from the relation
and run an Index Only Scan to do kill-on-select (killtuples). It marks
GiST page 0 (which is both the root and a leaf) as has_garbage. Then
the next insertion triggers an XLOG_GIST_DELETE record.

To verify this I use pageinspect and pg_waldump (locally). Also, this
test depends on the block size being 8192, which is not good.

None of this actually works without v1-0001, because there is a bug
in GiST: gistkillitems is not called for the very first (root) page.
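
A toy model of the failure mode described above may help. This is a sketch under assumptions, not the PostgreSQL source: the struct, the function names (toy_maybe_kill_items, toy_scan_root) and the guard condition are hypothetical. Only the field names curBlkno and numKilled come from this thread. The assumption is that the kill-items pass runs only when the scan has recorded a valid block number, and that the root page is scanned without recording one.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_INVALID_BLOCK 0xFFFFFFFFu /* hypothetical stand-in for "no block" */
#define TOY_ROOT_BLOCK    0u          /* the thread calls the root "gist 0 page" */

/* Hypothetical scan state; only curBlkno/numKilled are named in the thread. */
typedef struct ToyScan {
    unsigned curBlkno;  /* page the remembered dead items belong to */
    int      numKilled; /* dead items remembered for that page */
    int      flushed;   /* number of kill-items passes that actually ran */
} ToyScan;

/* Assumed guard: flush only when the page is known and something was killed. */
void toy_maybe_kill_items(ToyScan *so)
{
    if (so->curBlkno != TOY_INVALID_BLOCK && so->numKilled > 0) {
        so->flushed++;
        so->numKilled = 0;
    }
}

/*
 * Scan only the root page. record_root_blkno == false models the assumed
 * bug (block number never recorded for the root); true models a fix.
 * Returns how many kill-items passes ran.
 */
int toy_scan_root(bool record_root_blkno, int dead_on_root)
{
    ToyScan so = { TOY_INVALID_BLOCK, 0, 0 };

    assert(dead_on_root >= 0);
    if (record_root_blkno)
        so.curBlkno = TOY_ROOT_BLOCK;
    so.numKilled = dead_on_root;
    toy_maybe_kill_items(&so);
    return so.flushed;
}
```

In this model, toy_scan_root(false, 3) returns 0 (the dead items are never flushed), while toy_scan_root(true, 3) returns 1. Whether v1-0001 fixes the real code in exactly this way is not stated in this message.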

There is also test2.sql, which inserts a single tuple instead of 407.
It can be used to verify v1-0001.

[0] coverage.postgresql.org/src/backend/access/gist/gistxlog.c.gcov.html


-- 
Best regards,
Kirill Reshke

Attachments

Re: Fix gistkillitems & add regression test to microvacuum

From: Kirill Reshke
On Thu, 15 Jan 2026 at 12:00, Kirill Reshke <reshkekirill@gmail.com> wrote:
> [...]


From CF feedback it turns out we already have an isolation test for
this, and it does almost exactly the same thing.
Moreover, it fails.
I will try to fix it.


-- 
Best regards,
Kirill Reshke



Re: Fix gistkillitems & add regression test to microvacuum

From: Kirill Reshke
On Thu, 15 Jan 2026 at 12:46, Kirill Reshke <reshkekirill@gmail.com> wrote:
> [...]

It looks like GiST killitems does not work for small indexes, and this
has been explicitly tested since [0].
[0] https://www.postgresql.org/message-id/lxzj26ga6ippdeunz6kuncectr5gfuugmm2ry22qu6hcx6oid6%40lzx3sjsqhmt6


-- 
Best regards,
Kirill Reshke



Re: Fix gistkillitems & add regression test to microvacuum

From: Kirill Reshke
On Thu, 15 Jan 2026 at 13:21, Kirill Reshke <reshkekirill@gmail.com> wrote:
> [...]

I was right; see the commit message of 377b7ab:

"""
For gist some related paths were reached, but gist's implementation
seems to not work if all the dead tuples are on one page (or something
like that). The coverage for other index types was rather incidental.
"""

More precisely, it does not work if all the dead tuples are on one page and that page is the root.

So, should we delete this

...
# Test gist, but with fewer rows - shows that killitems doesn't work anymore!
permutation
  create_table fill_10 create_ext_btree_gist create_gist flush
  disable_seq disable_bitmap

...

from the isolation spec?


-- 
Best regards,
Kirill Reshke



Re: Fix gistkillitems & add regression test to microvacuum

From: Kirill Reshke
On Thu, 15 Jan 2026 at 13:35, Kirill Reshke <reshkekirill@gmail.com> wrote:
> [...]

PFA v2, which leaves the test in place.

It also improves the commit message.

-- 
Best regards,
Kirill Reshke

Attachments

Re: Fix gistkillitems & add regression test to microvacuum

From: Andrey Borodin

> On 15 Jan 2026, at 22:59, Kirill Reshke <reshkekirill@gmail.com> wrote:
>
> PFA v2 which leaves the test in-place.
>
> Also commit message improvements.

Yeah, killtuples for GiST root page is broken. Your patch is fixing it.
I don't think we should backpatch this, the bug is harmless, but for master the patch LGTM.
It would be good to assign so->curBlkno and so->curBlkno together. But gistScanPage() is the only place with access to
the block number.

+# Test gist, but with fewer rows - that killitems used to be buggy.

Probably, in this comment we can explicitly say that killitems was buggy, but now is fixed.

Thanks!


Best regards, Andrey Borodin.


Re: Fix gistkillitems & add regression test to microvacuum

From: Kirill Reshke
On Tue, 20 Jan 2026 at 15:30, Andrey Borodin <x4mmm@yandex-team.ru> wrote:
> Yeah, killtuples for GiST root page is broken. Your patch is fixing it.
> I don't think we should backpatch this, the bug is harmless, but for master the patch LGTM.

Thank you

> It would be good to assign so->curBlkno and so->curBlkno together. But gistScanPage() is the only place with access to
> the block number.

Sorry, I didn't get this point.

> +# Test gist, but with fewer rows - that killitems used to be buggy.
>
> Probably, in this comment we can explicitly say that killitems was buggy, but now is fixed.
>

Hmm, what would be a good wording here?


-- 
Best regards,
Kirill Reshke



Re: Fix gistkillitems & add regression test to microvacuum

From: Andrey Borodin

> On 23 Jan 2026, at 16:03, Kirill Reshke <reshkekirill@gmail.com> wrote:
>
> On Tue, 20 Jan 2026 at 15:30, Andrey Borodin <x4mmm@yandex-team.ru> wrote:
>> It would be good to assign so->curBlkno and so->curBlkno together. But gistScanPage() is the only place with access to
>> the block number.
>
> Sorry, didnt get this take.

Sorry, I meant that so->curBlkno and so->numKilled are semantically correlated. But it's difficult to assign them together,
and this is not worth refactoring.

>
>> +# Test gist, but with fewer rows - that killitems used to be buggy.
>>
>> Probably, in this comment we can explicitly say that killitems was buggy, but now is fixed.
>>
>
> Hmm, what would be a good wording here?

I've opened an English dictionary, and it says "used to" can carry the meaning "the bug was there, but it's not there
anymore".

So the patch is RfC IMO.


Best regards, Andrey Borodin.