Re: amcheck verification for GiST

Поиск
Список
Период
Сортировка
От Andrey Borodin
Тема Re: amcheck verification for GiST
Дата
Msg-id B4847A19-72B2-4900-A580-E28AD3246C75@yandex-team.ru
обсуждение исходный текст
Ответ на Re: amcheck verification for GiST  (Peter Geoghegan <pg@bowt.ie>)
Ответы Re: amcheck verification for GiST  (Peter Geoghegan <pg@bowt.ie>)
Список pgsql-hackers
Hi, Peter!

Thank you for the review!

> 7 дек. 2018 г., в 3:59, Peter Geoghegan <pg@bowt.ie> написал(а):
>
> On Sun, Sep 23, 2018 at 10:12 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote:
> * You do this:
>
>> +/* Check of an internal page. Hold locks on two pages at a time (parent+child). */
>
> This isn't consistent with what you do within verify_nbtree.c, which
> deliberately avoids ever holding more than a single buffer lock at a
> time, on general principle. That isn't necessarily a reason why you
> have to do the same, but it's not clear why you do things that way.
> Why isn't it enough to have a ShareLock on the relation? Maybe this is
> a sign that it would be a good idea to always operate on a palloc()'d
> copy of the page, by introducing something equivalent to
> palloc_btree_page(). (That would also be an opportunity to do very
> basic checks on every page.)
If we unlock parent page before checking child, some insert can adjust tuple on parent, sneak into child and insert new
tuple.
This can trigger false positive. I'll think about it more.

> * You need to sprinkle a few CHECK_FOR_INTERRUPTS() calls around.
> Certainly, there should be one at the top of the main loop.
I've added check into main loop of the scan. All deeper paths hold buffer locks.

> * Maybe gist_index_check() should be called gist_index_parent_check(),
> since it's rather like the existing verification function
> bt_index_parent_check().
>
> * Alternatively, you could find a way to make your function only need
> an AccessShareLock -- that would make gist_index_check() an
> appropriate name. That would probably require careful thought about
> VACUUM.

I've changed lock level to AccessShareLock. IMV scan is just as safe as regular GiST index scan.
There is my patch with VACUUM on CF, it can add deleted pages. I'll update one of these two patches accordingly, if
otheris committed.  

>
> * Why is it okay to do this?:
>
>> +       if (GistTupleIsInvalid(idxtuple))
>> +           ereport(LOG,
>> +                   (errmsg("index \"%s\" contains an inner tuple marked as invalid",
>> +                           RelationGetRelationName(rel)),
>> +                    errdetail("This is caused by an incomplete page split at crash recovery before upgrading to
PostgreSQL9.1."), 
>> +                    errhint("Please REINDEX it.")));
>
> You should probably mention the gistdoinsert() precedent for this.
This invalid tuple will break inserts, but will not affect select. I do not know, should this be error or warning in
amcheck?

> * Can we check GIST_PAGE_ID somewhere? I try to be as paranoid as
> possible, adding almost any check that I can think of, provided it
> hasn't got very high overhead. Note that gistcheckpage() doesn't do
> this for you.
Done. I think that gistcheckpage() should do this too, but I think we should avoid interventions into GiST mechanics
here.
>
> * Should we be concerned about the memory used by pushStackIfSplited()?
Memory is pfree()`d as usual. Tree scan code is almost equivalent to VACUUM`s gistbulkdelete.
>
> * How about a cross-check between IndexTupleSize() and
> ItemIdGetLength(), like the B-Tree code? It's a bit unfortunate that
> we have this redundancy, which wastes space, but we do, so we might as
> well get some small benefit from it.
Done. I'm checking it MAXALIGNED, this rounding seems correct.


Please find attached v3.


Best regards, Andrey Borodin.


Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Nguyễn Trần Quốc Vinh
Дата:
Сообщение: Re: Implementing Incremental View Maintenance
Следующее
От: Michael Banck
Дата:
Сообщение: Re: Offline enabling/disabling of data checksums