Re: Making all nbtree entries unique by having heap TIDs participatein comparisons

Поиск
Список
Период
Сортировка
От Robert Haas
Тема Re: Making all nbtree entries unique by having heap TIDs participatein comparisons
Дата
Msg-id CA+TgmoZEQDjuCW2R7qi8R8So-_4KAEEx951oEh4o7usahbtHkQ@mail.gmail.com
обсуждение исходный текст
Ответ на Making all nbtree entries unique by having heap TIDs participate in comparisons  (Peter Geoghegan <pg@bowt.ie>)
Ответы Re: Making all nbtree entries unique by having heap TIDs participatein comparisons  (Peter Geoghegan <pg@bowt.ie>)
Список pgsql-hackers
On Thu, Jun 14, 2018 at 2:44 PM, Peter Geoghegan <pg@bowt.ie> wrote:
> I've been thinking about using heap TID as a tie-breaker when
> comparing B-Tree index tuples for a while now [1]. I'd like to make
> all tuples at the leaf level unique, as assumed by L&Y. This can
> enable "retail index tuple deletion", which I think we'll probably end
> up implementing in some form or another, possibly as part of the zheap
> project. It's also possible that this work will facilitate GIN-style
> deduplication based on run length encoding of TIDs, or storing
> versioned heap TIDs in an out-of-line nbtree-versioning structure
> (unique indexes only). I can see many possibilities, but we have to
> start somewhere.

Yes, retail index deletion is essential for the delete-marking
approach that is proposed for zheap.

It could also be extremely useful in some workloads with the regular
heap.  If the indexes are large -- say, 100GB -- and the number of
tuples that vacuum needs to kill is small -- say, 5 -- scanning them
all to remove the references to those tuples is really inefficient.
If we had retail index deletion, then we could make a cost-based
decision about which approach to use in a particular case.

> mind now, while it's still swapped into my head. I won't do any
> serious work on this project unless and until I see a way to implement
> retail index tuple deletion, which seems like a multi-year project
> that requires the buy-in of multiple senior community members.

Can you enumerate some of the technical obstacles that you see?

> On its
> own, my patch regresses performance unacceptably in some workloads,
> probably due to interactions with kill_prior_tuple()/LP_DEAD hint
> setting, and interactions with page space management when there are
> many "duplicates" (it can still help performance in some pgbench
> workloads with non-unique indexes, though).

I think it would be helpful if you could talk more about these
regressions (and the wins).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Robert Haas
Дата:
Сообщение: Re: SCRAM with channel binding downgrade attack
Следующее
От: Vimalraj A
Дата:
Сообщение: pg_upgrade from 9.4 to 10.4