Re: Index tuple deduplication limitations in pg13

From Peter Geoghegan
Subject Re: Index tuple deduplication limitations in pg13
Date
Msg-id CAH2-WznUCG6H-xkq5hLxpLu8+61n70n4Ls0RuFmfLc1vDXa0wg@mail.gmail.com
In response to Index tuple deduplication limitations in pg13  (Matthias van de Meent <matthias.vandemeent@cofano.nl>)
List pgsql-general
On Tue, Aug 18, 2020 at 11:52 AM Matthias van de Meent
<matthias.vandemeent@cofano.nl> wrote:
> Deduplication does not need to destroy semantic differences? 'equal'
> can (in my book) mean:
> - 'opclass-equal', that is the opclass returns true for an equality check
> - 'binary equal' or 'datum-equal' (? maybe incorrect term), that is
> the unprocessed on-disk representations (datum image is the right term
> I believe?) of the compared values are indistinguishable.
>
> Runs of 'binary equal' datums can be freely deduplicated [0] when found.

> [0]
> Inserting a row in a deduplicated index with in, with TID ntid, can
> encounter a posting list of an opclass-equal but not datum image-equal
> tuples where the lowest TID of the posting list is less than ntid, and
> ntid is less than the highest TID of the posting list. This would
> require a posting list split to accommodate the new tuple's index entry
> in order to not lose data.

But you can't do that easily, because it breaks subtle assumptions
about posting list splits and space utilization. In particular, it
means that you can no longer think of a posting list split as
rewriting an incoming new item such that you can more or less pretend
that there was no overlap in the first place -- code like _bt_split
and nbtsplitloc.c relies on this. Introducing new special cases to
nbtsplitloc.c is very unappealing.

More concretely, if you introduce a posting list split like this then
you need three copies of the key -- the original, the new, and a
second copy of the original. That's much more complicated.
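
To make the difference concrete, here is a toy model in Python. It is
only a sketch and not nbtree code: the function names, the use of plain
integers as TIDs, and the use of strings as key images are all
assumptions made for illustration. The first function models the
existing kind of posting list split, where the incoming TID is swapped
with the highest TID in the posting list so that the new item can be
treated as if it never overlapped. The second models the split that the
proposal would need when the incoming key is opclass-equal but not
datum image-equal (numeric 1.0 versus 1.00 is the usual example of such
a pair).

```python
from typing import List, Tuple

# Toy index tuple: (key image, sorted heap TIDs). Hypothetical representation.
IndexTuple = Tuple[str, List[int]]


def swap_split(posting: IndexTuple, new_tid: int) -> List[IndexTuple]:
    """Image-equal case: the key bytes are interchangeable, so the new TID
    takes a slot inside the posting list and the old highest TID becomes
    the TID of the incoming item. Result: two tuples, no overlap."""
    key, tids = posting
    assert tids[0] < new_tid < tids[-1], "new TID must fall inside the range"
    rewritten = sorted(tids[:-1] + [new_tid])
    return [(key, rewritten), (key, [tids[-1]])]


def three_way_split(posting: IndexTuple, new_key: str,
                    new_tid: int) -> List[IndexTuple]:
    """Opclass-equal but not image-equal case: the new key image must be
    kept, so the posting list is cut in two around the new tuple.
    Result: three tuples carrying three copies of a key."""
    key, tids = posting
    assert tids[0] < new_tid < tids[-1], "new TID must fall inside the range"
    left = [t for t in tids if t < new_tid]
    right = [t for t in tids if t > new_tid]
    return [(key, left), (new_key, [new_tid]), (key, right)]


if __name__ == "__main__":
    existing = ("1.0", [10, 20, 30])
    print(swap_split(existing, 25))
    print(three_way_split(existing, "1.00", 25))
```

In the first case the page gains one small tuple and the posting list
keeps its size. In the second case one tuple becomes three, each with
its own key, which is the space accounting that code such as
nbtsplitloc.c does not currently have to reason about.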

--
Peter Geoghegan


