Re: Non-deterministic IndexTuple toast compression from index_form_tuple() + amcheck false positives

From	Peter Geoghegan
Subject	Re: Non-deterministic IndexTuple toast compression from index_form_tuple() + amcheck false positives
Date	February 2, 2019, 05:27:51
Msg-id	CAH2-WznJZXUb_4ZN+e_W7U3rCHUne0TCzX4hK0dY6+VoD6onMw@mail.gmail.com
In response to	Re: Non-deterministic IndexTuple toast compression from index_form_tuple() + amcheck false positives (Peter Geoghegan <pg@bowt.ie>)
Responses	Re: Non-deterministic IndexTuple toast compression from index_form_tuple() + amcheck false positives
List	pgsql-hackers

On Wed, Jan 23, 2019 at 10:59 AM Peter Geoghegan <pg@bowt.ie> wrote:
> > The fix here must be to normalize index tuples that are compressed
> > within amcheck, both during initial fingerprinting, and during
> > subsequent probes of the Bloom filter in bt_tuple_present_callback().
>
> I happened to talk to Andres about this in person yesterday. He
> thought that there was reason to be concerned about the need for
> logical normalization beyond TOAST issues. Expression indexes were a
> particular concern, because they could in principle have a change in
> the on-disk representation without a change of logical values -- false
> positives could result. He suggested that the long term solution was
> to bring hash operator class hash functions into Bloom filter hashing,
> at least where available.

I think that the best way forward is to normalize to compensate for
inconsistent input datum TOAST state, and leave it at that. ISTM that
logical normalization beyond that (based on hashing, or anything else)
creates more problems than it solves. I am concerned about cases like
INCLUDE indexes (which may have datums that lack even a B-Tree
opclass), and about the logical-though-semantically-relevant facets of
some datatypes such as numeric's display scale. If I can get an
example from Andres of a case where further logical normalization is
necessary to avoid false positives with expression indexes, that may
change things. (BTW, I implemented another amcheck enhancement that
searches indexes from the root to find matches -- the code is a
trivial addition to the new patch series I'm working on, and seems
like a better way to do enhanced logical normalization if that proves
to be truly necessary.)

Attached draft patch fixes the bug by doing fairly simple
normalization. I think that TOAST compression of datums in indexes is
fairly rare in practice, so I'm not very worried about the fact that
this won't perform as well as it could with indexes that have a lot of
compressed datums. I think that the interface I've added might need to
be expanded for other things in the future (e.g., to make amcheck work
with nbtree-native duplicate compression), and not worrying about the
performance too much helps with that goal.
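
To illustrate the idea (this is not the code in the attached patch), here
is a minimal Python sketch of why fingerprinting the stored bytes can
produce false positives, and how normalizing first avoids them. Everything
in it is a hypothetical stand-in: zlib plays the role of TOAST compression,
a (flag, payload) pair plays the role of a datum, and SHA-256 plays the
role of the Bloom filter hash.

```python
import hashlib
import zlib


def store(value: bytes, compress: bool):
    """Hypothetical datum: a (is_compressed, payload) pair."""
    if compress:
        return (True, zlib.compress(value))
    return (False, value)


def normalize(datum) -> bytes:
    """Return the uncompressed bytes, whatever the storage state was."""
    is_compressed, payload = datum
    return zlib.decompress(payload) if is_compressed else payload


def fingerprint_raw(datum) -> str:
    # Hashes the bytes as stored, so the compression state leaks into the hash.
    _, payload = datum
    return hashlib.sha256(payload).hexdigest()


def fingerprint_normalized(datum) -> str:
    # Hashes the logical value only.
    return hashlib.sha256(normalize(datum)).hexdigest()


value = b"x" * 4000
index_side = store(value, compress=True)   # datum as it ended up in the index
heap_side = store(value, compress=False)   # same value, re-formed from the heap

# Without normalization the probe misses even though the values are equal.
raw_mismatch = fingerprint_raw(index_side) != fingerprint_raw(heap_side)

# With normalization on both sides, the fingerprints agree.
normalized_match = (
    fingerprint_normalized(index_side) == fingerprint_normalized(heap_side)
)
```

The point of the sketch is only that normalization has to be applied on
both sides, when fingerprinting and when probing, for the comparison to be
meaningful.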

I'll pick this up next week, and likely commit a fix by Wednesday or
Thursday if there are no objections. I'm not sure if the test case is
worth including.

-- 
Peter Geoghegan

Attachments

0001-Avoid-amcheck-TOAST-compression-inconsistencies.patch
