Обсуждение: Re: amcheck: support for GiST

Поиск

Список

Период

Сортировка

Re: amcheck: support for GiST

От

Andrey Borodin

Дата:

11 июля 2025 г., 15:38:36

> On 30 Jun 2025, at 16:34, Andrey Borodin <x4mmm@yandex-team.ru> wrote:
>
> Please find attached two new steps for amcheck:
> 1. A function to verify GiST integrity. This patch is in decent shape, simply rebased from previous year.
> 2. Support on pg_amcheck's side for this function. This patch did not receive such review attention before. And,
perhaps,should be extended to support existing GIN functions. 

Here's a version that adds GIN functions to pg_amcheck.
IDK, maybe we should split pg_amcheck part into another thread and add there BRIN too...

Thanks!

Best regards, Andrey Borodin.

Вложения

Re: amcheck: support for GiST

От

Arseniy Mukhin

Дата:

22 июля 2025 г., 13:47:18

Hi, Andrey!

Thank you for working on this! There is a long history of the patch, I
hope it will be committed soon!)

On Fri, Jul 11, 2025 at 3:39 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote:
>
>
>
> > On 30 Jun 2025, at 16:34, Andrey Borodin <x4mmm@yandex-team.ru> wrote:
> >
> > Please find attached two new steps for amcheck:
> > 1. A function to verify GiST integrity. This patch is in decent shape, simply rebased from previous year.
> > 2. Support on pg_amcheck's side for this function. This patch did not receive such review attention before. And,
perhaps,should be extended to support existing GIN functions. 
>
> Here's a version that adds GIN functions to pg_amcheck.
> IDK, maybe we should split pg_amcheck part into another thread and add there BRIN too...
>

Speaking of BRIN pg_amcheck, I probably wouldn't merge it with the
gist/gin pg_amceck patchset because that would create a dependency on
the amcheck BRIN support patch, which is not clear when it will be
ready.


There are some points about the patch:

1) There are several typos in verify_gist.c:

 downlinks -> downlink (header comment)
 discrepencies -> discrepancies
 Correctess -> Correctness
 hande -> handle
 Initaliaze -> Initialize
 numbmer -> number
 replcaed -> replaced
 aquire -> aqcuire

2) Copyright year is 2023 in the patch. Time flies:)

3) There is the same check in btree and while reviewing the patch I
realised it should be added to the BRIN amcheck as well. Probably it
will be needed for GIN someday. What do you think about moving it to
verify_common?

    if (IsolationUsesXactSnapshot() && rel->rd_index->indcheckxmin &&
       !TransactionIdPrecedes(HeapTupleHeaderGetXmin(rel->rd_indextuple->t_data),
                         result->snapshot->xmin))

4) Should we check blknum of the new entry before pushing to the
stack? Probably we can check if it's a valid blknum and it's not
outside of the index. This way we can give a more detailed error
message in case we meet the wrong blknum.

    in the split detection code:

        ptr->blkno = GistPageGetOpaque(page)->rightlink;

    and when we add children of an inner page:

        ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));

5) There is a macros for 0xffff - 'TUPLE_IS_VALID'. Maybe we can use
it to make the code more readable? Also the error message contains one
extra 'has'.

          if (off != 0xffff)
             ereport(ERROR,
                   (errcode(ERRCODE_INDEX_CORRUPTED),
                    errmsg("index \"%s\" has on page %u offset %u has
item id not pointing to 0xffff, but %hu",
                         RelationGetRelationName(check_state->rel),
stack->blkno, i, off)));

6) Several points about 'gistFormNormalizedTuple'. I read the previous
thread [1] and it seems there is an unfinished discussion about
normalizing during heapallindexed check. I think normalization needs
some more work here.

6a) There is a TODO
          /* pfree(DatumGetPointer(old)); // TODO: this fails. Why? */

6b) AFAICS 'compress' is an optional support function. If opclass
doesn't have a 'compress' function, then 'gistCompressValues' leaves
such attributes as it is. Here we get attdata from the heap scan, and
it could be toasted. That means that these checks can result in false
positives:

    gistCompressValues(giststate, r, attdata, isnull, true, compatt);
    ...

    for (int i = 0; i < r->rd_att->natts; i++)
    {
        if (VARATT_IS_EXTERNAL(DatumGetPointer(compatt[i])))
           ereport(ERROR,
                 (errcode(ERRCODE_INDEX_CORRUPTED),
                 ....
        if (VARATT_IS_COMPRESSED(DatumGetPointer(compatt[i])))
        {
           if (i < IndexRelationGetNumberOfKeyAttributes(r))
              ereport(ERROR,

    Also 'VARATT_IS_EXTERNAL' check will always result in a false
positive for toasted include attributes here. Reproducer for it:

    DROP TABLE IF EXISTS tbl;
    CREATE TABLE tbl(a point, t text);
    -- disable compression for 't', but let it to be external
    ALTER TABLE tbl ALTER COLUMN t SET STORAGE external ;
    INSERT INTO tbl  values (point(random(), random()), repeat('a',3000 ));
    CREATE INDEX tbl_idx ON tbl using gist (a) include (t);
    SELECT gist_index_check('tbl_idx', true);

    So I think we need to remove these checks completely.

6c) Current code doesn't apply normalization for existing index tuples
during adding to bloom filter, which can result in false positive,
reproducer:
    Here we use plain storage during index build, then during check we
have extended storage, which results in different binary
representation of the same data and we have false positive here.

    DROP TABLE IF EXISTS tbl;
    CREATE TABLE tbl(a tsvector);
    CREATE INDEX tbl_idx ON tbl using gist (a) ;
    ALTER TABLE tbl ALTER COLUMN a SET STORAGE plain;
    INSERT INTO tbl  values ('a' ::tsvector);
    ALTER TABLE tbl ALTER COLUMN a SET STORAGE extended ;
    SELECT gist_index_check('tbl_idx', true);

6d) In the end of 'gistFormNormalizedTuple' we have

        ItemPointerSetOffsetNumber(&(res->t_tid), 0xffff);

    I guess it follows gistFormTuple function, but here we use
gistFormNormalizedTuple only for leaf tuples and we override
offsetnumber right after 'gistFormNormalizedTuple' function call, so
looks like we can drop it.


    In general I think normalization here can follow the same logic as
for verify_nbtree. We can even reuse 'bt_normalize_tuple' as a
normalization function. It handles all corner cases like short varatt,
differences in compressions etc, that we can have in gist as well. It
contains just a few lines about btree and everything else valid for
gist, so we need to modify it a bit. I think we can move it to
verify_common. Then we need to normalize every existing leaf index
tuple before adding it to the bloom filter. During the probing phase I
think we just can use 'gistFormTuple' to build an index tuple and then
normalize it before probing. What do you think?

Thank you!

[1] https://www.postgresql.org/message-id/CAAhFRxiHCWe_6AmqGWZqYEkgN_uQG3Jgw0WgPw%2B0zO3_D-q4DA%40mail.gmail.com


Best regards,
Arseniy Mukhin

Re: amcheck: support for GiST

От

Kirill Reshke

Дата:

22 октября 2025 г., 21:57:43

Hi! Thank you for your review.
Im posting new version of 0001 patch of series

On Tue, 22 Jul 2025 at 15:47, Arseniy Mukhin
<arseniy.mukhin.dev@gmail.com> wrote:
>
> Hi, Andrey!
>
> Thank you for working on this! There is a long history of the patch, I
> hope it will be committed soon!)
>
> On Fri, Jul 11, 2025 at 3:39 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote:
> >
> >
> >
> > > On 30 Jun 2025, at 16:34, Andrey Borodin <x4mmm@yandex-team.ru> wrote:
> > >
> > > Please find attached two new steps for amcheck:
> > > 1. A function to verify GiST integrity. This patch is in decent shape, simply rebased from previous year.
> > > 2. Support on pg_amcheck's side for this function. This patch did not receive such review attention before. And,
perhaps,should be extended to support existing GIN functions. 
> >
> > Here's a version that adds GIN functions to pg_amcheck.
> > IDK, maybe we should split pg_amcheck part into another thread and add there BRIN too...
> >
>
> Speaking of BRIN pg_amcheck, I probably wouldn't merge it with the
> gist/gin pg_amceck patchset because that would create a dependency on
> the amcheck BRIN support patch, which is not clear when it will be
> ready.
>
>
> There are some points about the patch:
>
> 1) There are several typos in verify_gist.c:
>
>  downlinks -> downlink (header comment)
>  discrepencies -> discrepancies
>  Correctess -> Correctness
>  hande -> handle
>  Initaliaze -> Initialize
>  numbmer -> number
>  replcaed -> replaced
>  aquire -> aqcuire
>
> 2) Copyright year is 2023 in the patch. Time flies:)

These two are (trivially) fixed.

> 3) There is the same check in btree and while reviewing the patch I
> realised it should be added to the BRIN amcheck as well. Probably it
> will be needed for GIN someday. What do you think about moving it to
> verify_common?
>
>     if (IsolationUsesXactSnapshot() && rel->rd_index->indcheckxmin &&
>        !TransactionIdPrecedes(HeapTupleHeaderGetXmin(rel->rd_indextuple->t_data),
>                          result->snapshot->xmin))

I think this is a good idea. I'm not sure if we should bother with
refactoring in this series though...

> 4) Should we check blknum of the new entry before pushing to the
> stack? Probably we can check if it's a valid blknum and it's not
> outside of the index. This way we can give a more detailed error
> message in case we meet the wrong blknum.
>
>     in the split detection code:
>
>         ptr->blkno = GistPageGetOpaque(page)->rightlink;
>
>     and when we add children of an inner page:
>
>         ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));

We indeed need to recheck all bytes in possibly-corrupted indexes,
including downlinks.
But amcheck can be run concurrently with Index insert, which will
change the current index size, so checking is not trivial.
And given it is not checked in already-committed nbtree & GIN amcheck
modules, I suggest not bother with that in this thread.
This can be a separate patch to verify_nbtree.

> 5) There is a macros for 0xffff - 'TUPLE_IS_VALID'. Maybe we can use
> it to make the code more readable? Also the error message contains one
> extra 'has'.
>
>           if (off != 0xffff)
>              ereport(ERROR,
>                    (errcode(ERRCODE_INDEX_CORRUPTED),
>                     errmsg("index \"%s\" has on page %u offset %u has
> item id not pointing to 0xffff, but %hu",
>                          RelationGetRelationName(check_state->rel),
> stack->blkno, i, off)));

Sure, I replaced all usages with  TUPLE_IS_VALID.

> 6) Several points about 'gistFormNormalizedTuple'. I read the previous
> thread [1] and it seems there is an unfinished discussion about
> normalizing during heapallindexed check. I think normalization needs
> some more work here.
>
> 6a) There is a TODO
>           /* pfree(DatumGetPointer(old)); // TODO: this fails. Why? */
>
> 6b) AFAICS 'compress' is an optional support function. If opclass
> doesn't have a 'compress' function, then 'gistCompressValues' leaves
> such attributes as it is. Here we get attdata from the heap scan, and
> it could be toasted. That means that these checks can result in false
> positives:
>
>     gistCompressValues(giststate, r, attdata, isnull, true, compatt);
>     ...
>
>     for (int i = 0; i < r->rd_att->natts; i++)
>     {
>         if (VARATT_IS_EXTERNAL(DatumGetPointer(compatt[i])))
>            ereport(ERROR,
>                  (errcode(ERRCODE_INDEX_CORRUPTED),
>                  ....
>         if (VARATT_IS_COMPRESSED(DatumGetPointer(compatt[i])))
>         {
>            if (i < IndexRelationGetNumberOfKeyAttributes(r))
>               ereport(ERROR,
>
>     Also 'VARATT_IS_EXTERNAL' check will always result in a false
> positive for toasted include attributes here. Reproducer for it:
>
>     DROP TABLE IF EXISTS tbl;
>     CREATE TABLE tbl(a point, t text);
>     -- disable compression for 't', but let it to be external
>     ALTER TABLE tbl ALTER COLUMN t SET STORAGE external ;
>     INSERT INTO tbl  values (point(random(), random()), repeat('a',3000 ));
>     CREATE INDEX tbl_idx ON tbl using gist (a) include (t);
>     SELECT gist_index_check('tbl_idx', true);
>
>     So I think we need to remove these checks completely.
>
> 6c) Current code doesn't apply normalization for existing index tuples
> during adding to bloom filter, which can result in false positive,
> reproducer:
>     Here we use plain storage during index build, then during check we
> have extended storage, which results in different binary
> representation of the same data and we have false positive here.
>
>     DROP TABLE IF EXISTS tbl;
>     CREATE TABLE tbl(a tsvector);
>     CREATE INDEX tbl_idx ON tbl using gist (a) ;
>     ALTER TABLE tbl ALTER COLUMN a SET STORAGE plain;
>     INSERT INTO tbl  values ('a' ::tsvector);
>     ALTER TABLE tbl ALTER COLUMN a SET STORAGE extended ;
>     SELECT gist_index_check('tbl_idx', true);
>
> 6d) In the end of 'gistFormNormalizedTuple' we have
>
>         ItemPointerSetOffsetNumber(&(res->t_tid), 0xffff);
>
>     I guess it follows gistFormTuple function, but here we use
> gistFormNormalizedTuple only for leaf tuples and we override
> offsetnumber right after 'gistFormNormalizedTuple' function call, so
> looks like we can drop it.
>
>
>     In general I think normalization here can follow the same logic as
> for verify_nbtree. We can even reuse 'bt_normalize_tuple' as a
> normalization function. It handles all corner cases like short varatt,
> differences in compressions etc, that we can have in gist as well. It
> contains just a few lines about btree and everything else valid for
> gist, so we need to modify it a bit. I think we can move it to
> verify_common. Then we need to normalize every existing leaf index
> tuple before adding it to the bloom filter. During the probing phase I
> think we just can use 'gistFormTuple' to build an index tuple and then
> normalize it before probing. What do you think?
>
> Thank you!

I did a little refactor in patch one, so we can reuse
bt_normalize_tuple. With these changes, your reproducers do not
complain.
I guess `gistFormNormalizedTuple` is now unneeded and its comment no
longer true. index_form_tuple in gist_tuple_present_callback ensures
all attributes are detoasted, and then we do `amcheck_normalize_tuple`
call.
I will remove gistFormNormalizedTuple function in next iteration if
this approach is OK

>
> [1] https://www.postgresql.org/message-id/CAAhFRxiHCWe_6AmqGWZqYEkgN_uQG3Jgw0WgPw%2B0zO3_D-q4DA%40mail.gmail.com
>
>
> Best regards,
> Arseniy Mukhin
>
>


--
Best regards,
Kirill Reshke

Вложения

Re: amcheck: support for GiST

От

Arseniy Mukhin

Дата:

23 октября 2025 г., 14:39:56

Hi,

On Wed, Oct 22, 2025 at 9:57 PM Kirill Reshke <reshkekirill@gmail.com> wrote:
>
> Hi! Thank you for your review.

Thank you for the new version!

> Im posting new version of 0001 patch of series
>
> On Tue, 22 Jul 2025 at 15:47, Arseniy Mukhin
> <arseniy.mukhin.dev@gmail.com> wrote:
> >
> > Hi, Andrey!
> >
> > Thank you for working on this! There is a long history of the patch, I
> > hope it will be committed soon!)
> >
> > On Fri, Jul 11, 2025 at 3:39 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote:
> > >
> > >
> > >
> > > > On 30 Jun 2025, at 16:34, Andrey Borodin <x4mmm@yandex-team.ru> wrote:
> > > >
> > > > Please find attached two new steps for amcheck:
> > > > 1. A function to verify GiST integrity. This patch is in decent shape, simply rebased from previous year.
> > > > 2. Support on pg_amcheck's side for this function. This patch did not receive such review attention before.
And,perhaps, should be extended to support existing GIN functions. 
> > >
> > > Here's a version that adds GIN functions to pg_amcheck.
> > > IDK, maybe we should split pg_amcheck part into another thread and add there BRIN too...
> > >
> >
> > Speaking of BRIN pg_amcheck, I probably wouldn't merge it with the
> > gist/gin pg_amceck patchset because that would create a dependency on
> > the amcheck BRIN support patch, which is not clear when it will be
> > ready.
> >
> >
> > There are some points about the patch:
> >
> > 1) There are several typos in verify_gist.c:
> >
> >  downlinks -> downlink (header comment)
> >  discrepencies -> discrepancies
> >  Correctess -> Correctness
> >  hande -> handle
> >  Initaliaze -> Initialize
> >  numbmer -> number
> >  replcaed -> replaced
> >  aquire -> aqcuire
> >
> > 2) Copyright year is 2023 in the patch. Time flies:)
>
> These two are (trivially) fixed.
>
> > 3) There is the same check in btree and while reviewing the patch I
> > realised it should be added to the BRIN amcheck as well. Probably it
> > will be needed for GIN someday. What do you think about moving it to
> > verify_common?
> >
> >     if (IsolationUsesXactSnapshot() && rel->rd_index->indcheckxmin &&
> >        !TransactionIdPrecedes(HeapTupleHeaderGetXmin(rel->rd_indextuple->t_data),
> >                          result->snapshot->xmin))
>
> I think this is a good idea. I'm not sure if we should bother with
> refactoring in this series though...

Great, so maybe we can start a separate thread for this small
refactoring. Some work in this direction has been already done in brin
amcheck thread [0].

>
> > 4) Should we check blknum of the new entry before pushing to the
> > stack? Probably we can check if it's a valid blknum and it's not
> > outside of the index. This way we can give a more detailed error
> > message in case we meet the wrong blknum.
> >
> >     in the split detection code:
> >
> >         ptr->blkno = GistPageGetOpaque(page)->rightlink;
> >
> >     and when we add children of an inner page:
> >
> >         ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
>
> We indeed need to recheck all bytes in possibly-corrupted indexes,
> including downlinks.
> But amcheck can be run concurrently with Index insert, which will
> change the current index size, so checking is not trivial.
> And given it is not checked in already-committed nbtree & GIN amcheck
> modules, I suggest not bother with that in this thread.
> This can be a separate patch to verify_nbtree.
>

OK.

>
> > 6) Several points about 'gistFormNormalizedTuple'. I read the previous
> > thread [1] and it seems there is an unfinished discussion about
> > normalizing during heapallindexed check. I think normalization needs
> > some more work here.
> >
> > 6a) There is a TODO
> >           /* pfree(DatumGetPointer(old)); // TODO: this fails. Why? */
> >
> > 6b) AFAICS 'compress' is an optional support function. If opclass
> > doesn't have a 'compress' function, then 'gistCompressValues' leaves
> > such attributes as it is. Here we get attdata from the heap scan, and
> > it could be toasted. That means that these checks can result in false
> > positives:
> >
> >     gistCompressValues(giststate, r, attdata, isnull, true, compatt);
> >     ...
> >
> >     for (int i = 0; i < r->rd_att->natts; i++)
> >     {
> >         if (VARATT_IS_EXTERNAL(DatumGetPointer(compatt[i])))
> >            ereport(ERROR,
> >                  (errcode(ERRCODE_INDEX_CORRUPTED),
> >                  ....
> >         if (VARATT_IS_COMPRESSED(DatumGetPointer(compatt[i])))
> >         {
> >            if (i < IndexRelationGetNumberOfKeyAttributes(r))
> >               ereport(ERROR,
> >
> >     Also 'VARATT_IS_EXTERNAL' check will always result in a false
> > positive for toasted include attributes here. Reproducer for it:
> >
> >     DROP TABLE IF EXISTS tbl;
> >     CREATE TABLE tbl(a point, t text);
> >     -- disable compression for 't', but let it to be external
> >     ALTER TABLE tbl ALTER COLUMN t SET STORAGE external ;
> >     INSERT INTO tbl  values (point(random(), random()), repeat('a',3000 ));
> >     CREATE INDEX tbl_idx ON tbl using gist (a) include (t);
> >     SELECT gist_index_check('tbl_idx', true);
> >
> >     So I think we need to remove these checks completely.
> >
> > 6c) Current code doesn't apply normalization for existing index tuples
> > during adding to bloom filter, which can result in false positive,
> > reproducer:
> >     Here we use plain storage during index build, then during check we
> > have extended storage, which results in different binary
> > representation of the same data and we have false positive here.
> >
> >     DROP TABLE IF EXISTS tbl;
> >     CREATE TABLE tbl(a tsvector);
> >     CREATE INDEX tbl_idx ON tbl using gist (a) ;
> >     ALTER TABLE tbl ALTER COLUMN a SET STORAGE plain;
> >     INSERT INTO tbl  values ('a' ::tsvector);
> >     ALTER TABLE tbl ALTER COLUMN a SET STORAGE extended ;
> >     SELECT gist_index_check('tbl_idx', true);
> >
> > 6d) In the end of 'gistFormNormalizedTuple' we have
> >
> >         ItemPointerSetOffsetNumber(&(res->t_tid), 0xffff);
> >
> >     I guess it follows gistFormTuple function, but here we use
> > gistFormNormalizedTuple only for leaf tuples and we override
> > offsetnumber right after 'gistFormNormalizedTuple' function call, so
> > looks like we can drop it.
> >
> >
> >     In general I think normalization here can follow the same logic as
> > for verify_nbtree. We can even reuse 'bt_normalize_tuple' as a
> > normalization function. It handles all corner cases like short varatt,
> > differences in compressions etc, that we can have in gist as well. It
> > contains just a few lines about btree and everything else valid for
> > gist, so we need to modify it a bit. I think we can move it to
> > verify_common. Then we need to normalize every existing leaf index
> > tuple before adding it to the bloom filter. During the probing phase I
> > think we just can use 'gistFormTuple' to build an index tuple and then
> > normalize it before probing. What do you think?
> >
> > Thank you!
>
> I did a little refactor in patch one, so we can reuse
> bt_normalize_tuple. With these changes, your reproducers do not
> complain.
> I guess `gistFormNormalizedTuple` is now unneeded and its comment no
> longer true. index_form_tuple in gist_tuple_present_callback ensures
> all attributes are detoasted, and then we do `amcheck_normalize_tuple`
> call.
> I will remove gistFormNormalizedTuple function in next iteration if
> this approach is OK
>

LGTM. Only one point here: I think maybe we need to move the
bt_normalize_tuple comment as well, as now it is more about
amcheck_normalize_tuple than bt_normalize_tuple.


[0] https://www.postgresql.org/message-id/CAE7r3MKUOGJ0v5-b5fYaF6sxKZvr0J-YXHTJf8u8GUr1tTcvNg%40mail.gmail.com


Best regards,
Arseniy Mukhin

Re: amcheck: support for GiST

От

Miłosz Bieniek

Дата:

09 декабря 2025 г., 21:54:57

The following review has been posted through the commitfest application:
make installcheck-world:  not tested
Implements feature:       not tested
Spec compliant:           not tested
Documentation:            not tested

Hi,
Together with Sergey we did a review and found a few things that need fixing:

- `contrib/amcheck/amcheck--1.5--1.6.sql:14` - missing space after comma
- `verify_gist.c` should be the second entry in meson.build, not at the end
- Some function arguments like `(GistCheckState * check_state, GistScanItem * stack)` have extra spaces after the `*` -
shouldbe `(GistCheckState *check_state, GistScanItem *stack)`
 
- Missing `#include "access/itup.h"` in `verify_common.h`
- Missing test file `007verify_gist_.pl` (not sure if should be created)
- `contrib/amcheck/sql/check_gist.sql` - missing cleanup statement `DROP TABLE toast_bug;`

Let us know if you need any clarification on these points!

- Miłosz and Sergey

Re: amcheck: support for GiST

От

Andrey Borodin

Дата:

16 декабря 2025 г., 15:02:52

Hi Miłosz and Sergey!

Thanks a lot for reviewing this patch!

> On 9 Dec 2025, at 23:54, Miłosz Bieniek <bieniek.milosz@proton.me> wrote:
>
> Hi,
> Together with Sergey we did a review and found a few things that need fixing:
>
> - `contrib/amcheck/amcheck--1.5--1.6.sql:14` - missing space after comma

Fixed.

> - `verify_gist.c` should be the second entry in meson.build, not at the end

Fixed.

> - Some function arguments like `(GistCheckState * check_state, GistScanItem * stack)` have extra spaces after the `*`
-should be `(GistCheckState *check_state, GistScanItem *stack)` 

Fixed.

> - Missing `#include "access/itup.h"` in `verify_common.h`

I do not understand why. Perhaps, optimizing headers would be a good idea. This file is not included by any of files
thatinclude verify_common.h. 

> - Missing test file `007verify_gist_.pl` (not sure if should be created)

I don't think we ever had it for gist.

> - `contrib/amcheck/sql/check_gist.sql` - missing cleanup statement `DROP TABLE toast_bug;`

Fixed.


Thanks!


Best regards, Andrey Borodin.

Вложения

Re: amcheck: support for GiST

От

Paul A Jungwirth

Дата:

16 декабря 2025 г., 18:24:33

Hello,

On Wed, Oct 22, 2025 at 11:58 AM Kirill Reshke <reshkekirill@gmail.com> wrote:
>
> > 1) There are several typos in verify_gist.c:
> >
> >  downlinks -> downlink (header comment)
> >  discrepencies -> discrepancies
> >  Correctess -> Correctness
> >  hande -> handle
> >  Initaliaze -> Initialize
> >  numbmer -> number
> >  replcaed -> replaced
> >  aquire -> aqcuire
> >
> > 2) Copyright year is 2023 in the patch. Time flies:)
>
> These two are (trivially) fixed.

I found a few more typos. Maybe one is left over from Arseniy's
review. Referencing the latest patch files from Andrey, here is what I
see:

in v20251216-0002-Add-gist_index_check-function-to-verify-Gi.patch:

> This function traverses GiST with a depth-fisrt search and checks

"fisrt" should be "first".

> This traverse takes lock of any page until some discapency found.

"discapency" should be "discrepancy"

> To re-check suspicious pair of parent and child tuples it aqcuires

"aqcuires" should be "acquires"

amcheck.sgml:

+      require tuple adjustement) and page graph respects balanced-tree

"adjustement" should be "adjustment"

Also the Makefile ordering is not quite right:

--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -4,16 +4,17 @@ MODULE_big    = amcheck
 OBJS = \
    $(WIN32RES) \
    verify_common.o \
+   verify_gist.o \
    verify_gin.o \
    verify_heapam.o \
    verify_nbtree.o

We should put verify_gist.o after verify_gin.o.

Yours,

--
Paul              ~{:-)
pj@illuminatedcomputing.com

Re: amcheck: support for GiST

От

Kirill Reshke

Дата:

18 декабря 2025 г., 14:40:14

On Tue, 16 Dec 2025 at 20:24, Paul A Jungwirth
<pj@illuminatedcomputing.com> wrote:
>
> Hello,
>
> On Wed, Oct 22, 2025 at 11:58 AM Kirill Reshke <reshkekirill@gmail.com> wrote:
> >
> > > 1) There are several typos in verify_gist.c:
> > >
> > >  downlinks -> downlink (header comment)
> > >  discrepencies -> discrepancies
> > >  Correctess -> Correctness
> > >  hande -> handle
> > >  Initaliaze -> Initialize
> > >  numbmer -> number
> > >  replcaed -> replaced
> > >  aquire -> aqcuire
> > >
> > > 2) Copyright year is 2023 in the patch. Time flies:)
> >
> > These two are (trivially) fixed.
>
> I found a few more typos. Maybe one is left over from Arseniy's
> review. Referencing the latest patch files from Andrey, here is what I
> see:
>
> in v20251216-0002-Add-gist_index_check-function-to-verify-Gi.patch:
>
> > This function traverses GiST with a depth-fisrt search and checks
>
> "fisrt" should be "first".
>
> > This traverse takes lock of any page until some discapency found.
>
> "discapency" should be "discrepancy"
>
> > To re-check suspicious pair of parent and child tuples it aqcuires
>
> "aqcuires" should be "acquires"
>
> amcheck.sgml:
>
> +      require tuple adjustement) and page graph respects balanced-tree
>
> "adjustement" should be "adjustment"
>
> Also the Makefile ordering is not quite right:
>
> --- a/contrib/amcheck/Makefile
> +++ b/contrib/amcheck/Makefile
> @@ -4,16 +4,17 @@ MODULE_big    = amcheck
>  OBJS = \
>     $(WIN32RES) \
>     verify_common.o \
> +   verify_gist.o \
>     verify_gin.o \
>     verify_heapam.o \
>     verify_nbtree.o
>
> We should put verify_gist.o after verify_gin.o.
>
> Yours,
>
> --
> Paul              ~{:-)
> pj@illuminatedcomputing.com
>
>

Hi!


Thank you for taking a look. Sending new version which is Andreys's
[0] + 0003 applied + your review comments addressed + my changes,
including:


Commit message:

This function traverses GiST with a depth-first search and checks
that all downlink tuples are included into parent tuple keyspace.
This traverse takes lock of any page until some discrepancy found.
To re-check suspicious pair of parent and child tuples it acquires
locks on both parent and child pages in the same order as page
split does.

" discrepancy found" -> " discrepancy is found"
" re-check suspicious " -> " re-check a suspicious "

I also added you, Arseniy and  Miłosz in commit message, in reviewed-by section

+    /* Pointer to a next stack item. */
+    struct GistScanItem *next;
+}            GistScanItem;
+

a next -> the next


+        /*
+         * It's possible that the page was split since we looked at the
+         * parent, so that we didn't missed the downlink of the right sibling
+         * when we scanned the parent.  If so, add the right sibling to the
+         * stack now.
+         */

"didn't miss" not "didn't missed "


+            /*
+             * There was a discrepancy between parent and child tuples. We
+             * need to verify it is not a result of concurrent call of
+             * gistplacetopage(). So, lock parent and try to find downlink for
+             * current page. It may be missing due to concurrent page split,
+             * this is OK.

"find a downlink"


also this:

--- a/contrib/amcheck/verify_gist.c
+++ b/contrib/amcheck/verify_gist.c
@@ -583,7 +583,8 @@ gist_refind_parent(Relation rel,
 {
        Buffer          parentbuf;
        Page            parentpage;
-       OffsetNumber parent_maxoff;
+       OffsetNumber parent_maxoff,
+                                               off;
        IndexTuple      result = NULL;

        parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno,
RBM_NORMAL,
@@ -605,9 +606,9 @@ gist_refind_parent(Relation rel,
        }

        parent_maxoff = PageGetMaxOffsetNumber(parentpage);
-       for (OffsetNumber o = FirstOffsetNumber; o <= parent_maxoff; o
= OffsetNumberNext(o))
+       for (off = FirstOffsetNumber; off <= parent_maxoff; off =
OffsetNumberNext(off))
        {
-               ItemId          p_iid = PageGetItemIdCareful(rel,
parentblkno, parentpage, o);
+               ItemId          p_iid = PageGetItemIdCareful(rel,
parentblkno, parentpage, off);
                IndexTuple      itup = (IndexTuple)
PageGetItem(parentpage, p_iid);

                if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)




--
Best regards,
Kirill Reshke

Вложения

Re: amcheck: support for GiST

От

Kirill Reshke

Дата:

01 января, 15:05:36

CF bot was unhappy about the last version due to obvious bug, PFA new
version with fixes.

The problem was "DROP TABLE toast_bug;" missing in expected regression output.


[0] https://cirrus-ci.com/task/6378051304423424


-- 
Best regards,
Kirill Reshke

Вложения

Re: amcheck: support for GiST

От

Kirill Reshke

Дата:

10 января, 17:56:28

On Thu, 1 Jan 2026 at 17:05, Kirill Reshke <reshkekirill@gmail.com> wrote:
>
> CF bot was unhappy about the last version due to obvious bug, PFA new
> version with fixes.
>
> The problem was "DROP TABLE toast_bug;" missing in expected regression output.
>
>
> [0] https://cirrus-ci.com/task/6378051304423424
>
>
> --
> Best regards,
> Kirill Reshke


Attached new version with commit message polishing, and address CF
feedback, which was unhappy due to headercheck

-- 
Best regards,
Kirill Reshke

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: Re: amcheck: support for GiST

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения