Re: Improve search for missing parent downlinks in amcheck

Поиск
Список
Период
Сортировка
От Peter Geoghegan
Тема Re: Improve search for missing parent downlinks in amcheck
Дата
Msg-id CAH2-Wzmv444fW0XCrXM=STS23p2XADufgEoH37SrO=v9HJv5_w@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Improve search for missing parent downlinks in amcheck  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
Список pgsql-hackers
On Sat, Apr 27, 2019 at 5:13 PM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> Yes, increasing of Bloom filter size also helps.  But my intention was
> to make non-lossy check here.

I agree that that might be a good goal, but I am interested in knowing
if there is something naive about how the downlinkfilter Bloom filter
is sized. I think that we could probably do better than this without
much work:

            /*
             * Extra readonly downlink check.
             *
             * In readonly case, we know that there cannot be a concurrent
             * page split or a concurrent page deletion, which gives us the
             * opportunity to verify that every non-ignorable page had a
             * downlink one level up.  We must be tolerant of interrupted page
             * splits and page deletions, though.  This is taken care of in
             * bt_downlink_missing_check().
             */
            total_pages = (int64) state->rel->rd_rel->relpages;
            state->downlinkfilter = bloom_create(total_pages, work_mem, seed);

Maybe we could use "smgrnblocks(index->rd_smgr, MAIN_FORKNUM))"
instead of relpages, for example.

> It wouldn't be really wasteful, because the same children were
> accessed recently.  So, they are likely not yet evicted from
> shared_buffers.  I think we can fit both checks into one loop over
> downlinks instead of two.

I see your point, but if we're going to treat this as a bug then I
would prefer a simple fix.

> Yes, we can do more checks with bloom filter.  But assuming that we
> anyway iterate over children for each non-leaf page, can we do:
> 1) All the checks, which bt_downlink_check() does for now
> 2) Check there are no missing downlinks
> 3) Check hikeys
> in one pass, can't we?

We can expect every high key in a page to have a copy contained within
its parent, either as one of the keys, or as parent's own high key
(assuming no concurrent or interrupted page splits or page deletions).
This is true today, even though we truncate negative infinity items in
internal pages.

I think that the simple answer to your question is yes. It would be
more complicated that way, and the only extra check would be the check
of high keys against their parent, but overall this does seem
possible.

-- 
Peter Geoghegan



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Alexander Korotkov
Дата:
Сообщение: Re: Improve search for missing parent downlinks in amcheck
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Race conditions with checkpointer and shutdown