Re: amcheck verification for GiST

Поиск
Список
Период
Сортировка
От Peter Geoghegan
Тема Re: amcheck verification for GiST
Дата
Msg-id CAH2-Wzk-Qq1f-_XNFxL-dOwaZFKdNZApqBU6h6FEgnJh+u-8uA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: amcheck verification for GiST  (Andrey Borodin <x4mmm@yandex-team.ru>)
Ответы Re: amcheck verification for GiST  (Andrey Borodin <x4mmm@yandex-team.ru>)
Список pgsql-hackers
On Sun, Sep 23, 2018 at 10:12 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote:
> (0001-GiST-verification-function-for-amcheck-v2.patch)

Thanks for working on this. Some feedback:

* You do this:

> +/* Check of an internal page. Hold locks on two pages at a time (parent+child). */

This isn't consistent with what you do within verify_nbtree.c, which
deliberately avoids ever holding more than a single buffer lock at a
time, on general principle. That isn't necessarily a reason why you
have to do the same, but it's not clear why you do things that way.
Why isn't it enough to have a ShareLock on the relation? Maybe this is
a sign that it would be a good idea to always operate on a palloc()'d
copy of the page, by introducing something equivalent to
palloc_btree_page(). (That would also be an opportunity to do very
basic checks on every page.)

* You need to sprinkle a few CHECK_FOR_INTERRUPTS() calls around.
Certainly, there should be one at the top of the main loop.

* Maybe gist_index_check() should be called gist_index_parent_check(),
since it's rather like the existing verification function
bt_index_parent_check().

* Alternatively, you could find a way to make your function only need
an AccessShareLock -- that would make gist_index_check() an
appropriate name. That would probably require careful thought about
VACUUM.

* Why is it okay to do this?:

> +       if (GistTupleIsInvalid(idxtuple))
> +           ereport(LOG,
> +                   (errmsg("index \"%s\" contains an inner tuple marked as invalid",
> +                           RelationGetRelationName(rel)),
> +                    errdetail("This is caused by an incomplete page split at crash recovery before upgrading to
PostgreSQL9.1."),
 
> +                    errhint("Please REINDEX it.")));

You should probably mention the gistdoinsert() precedent for this.

* Can we check GIST_PAGE_ID somewhere? I try to be as paranoid as
possible, adding almost any check that I can think of, provided it
hasn't got very high overhead. Note that gistcheckpage() doesn't do
this for you.

* Should we be concerned about the memory used by pushStackIfSplited()?

* How about a cross-check between IndexTupleSize() and
ItemIdGetLength(), like the B-Tree code? It's a bit unfortunate that
we have this redundancy, which wastes space, but we do, so we might as
well get some small benefit from it.

--
Peter Geoghegan


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: psql display of foreign keys
Следующее
От: "Bossart, Nathan"
Дата:
Сообщение: Re: Use durable_unlink for .ready and .done files for WAL segmentremoval