Re: "Write amplification" is made worse by "getting tired" while inserting into nbtree secondary indexes (Was: Why B-Tree suffix truncation matters)

From	Peter Geoghegan
Subject	Re: "Write amplification" is made worse by "getting tired" while inserting into nbtree secondary indexes (Was: Why B-Tree suffix truncation matters)
Date	3 August 2018, 02:32:46
Msg-id	CAH2-WzkL8dE2AYvPQ4LVmj7Q_9U1dyHqn7Ope7J75sjOJWPZrQ@mail.gmail.com
In reply to	Re: "Write amplification" is made worse by "getting tired" while inserting into nbtree secondary indexes (Was: Why B-Tree suffix truncation matters) (Simon Riggs <simon@2ndquadrant.com>)
Responses	Re: "Write amplification" is made worse by "getting tired" while inserting into nbtree secondary indexes (Was: Why B-Tree suffix truncation matters); Re: "Write amplification" is made worse by "getting tired" while inserting into nbtree secondary indexes (Was: Why B-Tree suffix truncation matters)
List	pgsql-hackers

On Tue, Jul 17, 2018 at 10:42 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> If we knew that we were never going to do deletes/non-HOT updates from
> the table we could continue to use the existing mechanism, otherwise
> we will be better off to use sorted index entries. However, it does
> appear as if keeping entries in sorted order would be a win on
> concurrency from reduced block contention on the first few blocks of
> the index key, so it may also be a win in cases where there are heavy
> concurrent inserts but no deletes.

I think so too. I also expect a big reduction in the number of FPIs in
the event of many duplicates.

> I hope we can see a patch that just adds the sorting-by-TID property
> so we can evaluate that aspect before we try to add other more
> advanced index ideas.

I can certainly see why that's desirable. Unfortunately, it isn't that simple.

If I want to sort on heap TID as a tie-breaker, I cannot cut any
corners. That is, it has to be just another column, at least as far as
the implementation is concerned (heap TID will have a different
representation in internal pages and leaf high keys, but nonetheless
it's essentially just another column in the keyspace). This means that
if I don't have suffix truncation, I'll regress performance in many
important cases that have no compensating benefit (e.g. pgbench).
There is no point in trying to assess that.
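
As a rough illustration of "heap TID is just another column", here is a minimal Python sketch, not nbtree code. The names `HeapTid`, `IndexTuple`, and `insert_sorted` are hypothetical; the only point is that duplicates of the same user key end up ordered by heap TID because the TID is compared as a trailing attribute.

```python
from bisect import insort
from typing import NamedTuple, Tuple

class HeapTid(NamedTuple):
    block: int   # heap block number
    offset: int  # line pointer offset within the block

class IndexTuple(NamedTuple):
    key: Tuple    # user-defined key attributes
    tid: HeapTid  # heap TID, compared last as a tie-breaker

def insert_sorted(leaf, tup):
    """Insert into a toy leaf page, keeping (key, tid) order.

    NamedTuples compare field by field, so the heap TID only matters
    when the user key attributes are all equal.
    """
    insort(leaf, tup)

leaf = []
for key, block, offset in [("b", 4, 2), ("a", 9, 1), ("a", 3, 7), ("a", 3, 2)]:
    insert_sorted(leaf, IndexTuple((key,), HeapTid(block, offset)))

# Heap TIDs of the duplicates of key "a", in index order.
duplicate_tids = [tuple(t.tid) for t in leaf if t.key == ("a",)]
```

With this ordering, a new duplicate has exactly one legal position on the leaf level, rather than any position among its equal-keyed neighbours.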

It is true that I could opt to only "logically truncate" the heap TID
attribute during a leaf page split (i.e. there'd only be "logical
truncation", which is to say there'd only be the avoidance of adding a
heap TID to the new high key, and never true physical truncation of
user attributes). But doing only that much saves very little code,
since the logic for assessing whether or not it's safe to avoid adding
a new heap attribute (whether or not we logically truncate) still has
to involve an insertion scankey. It seems much more natural to do
everything at once. Again, the heap TID attribute is more or less just
another attribute. Also, the code for doing physical suffix truncation
already exists from the covering/INCLUDE index commit.
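
To make the distinction concrete, here is a hedged Python sketch of building a separator key during a leaf split. It is a toy model under stated assumptions, not the patch: tuples are `(key_tuple, tid)` pairs, a truncated attribute is assumed to sort as "minus infinity", and the separator is only required to sort after the last tuple on the left and no later than the first tuple on the right. The function name `make_separator` and the `physical` flag are hypothetical.

```python
def make_separator(last_left, first_right, physical=True):
    """Return (user attributes, heap TID or None) for a new separator key.

    physical=True  : drop user attributes after the first one that differs.
    physical=False : "logical truncation" only, i.e. keep every user
                     attribute and merely omit the heap TID when possible.
    Either way, the same attribute-by-attribute comparison is needed.
    """
    left_key, _ = last_left
    right_key, right_tid = first_right
    for n, (left_attr, right_attr) in enumerate(zip(left_key, right_key), start=1):
        if left_attr != right_attr:
            kept = right_key[:n] if physical else right_key
            return kept, None  # user attributes alone separate the halves
    # All user attributes are equal: the heap TID must be kept.
    return right_key, right_tid

distinct = make_separator((("a", 1), (3, 2)), (("b", 5), (3, 7)))
logical = make_separator((("a", 1), (3, 2)), (("b", 5), (3, 7)), physical=False)
duplicates = make_separator((("a", 1), (3, 2)), (("a", 1), (3, 7)))
```

In this sketch the two modes differ by a single slice, which matches the observation that doing only logical truncation saves very little code.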

I'm currently improving the logic for picking a page split in light of
suffix truncation, which I've been working on for weeks now. I had
something that did quite well with the initial index sizes for TPC-C
and TPC-H, but then realized I'd totally regressed the motivating
example with many duplicates that I started this thread with. I now
have something that does both things well, which I'm trying to
simplify. Another thing to bear in mind is that page split logic for
suffix truncation also helps space utilization on the leaf level. I
can get the main TPC-C order_line pkey about 7% smaller with true
suffix truncation, even though the internal page index tuples can
never be any smaller due to alignment, and even though there are no
duplicates that would otherwise make the implementation "get tired".
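
The general idea of letting suffix truncation influence the split point can be sketched as follows. This is an illustrative assumption about what such logic might look like, not the algorithm described in this message: the helper names `attrs_needed` and `choose_split`, and the `window` parameter, are hypothetical. Among candidate split points near the middle of the page, it prefers the one whose separator needs the fewest attributes, counting the heap TID as one more attribute.

```python
def attrs_needed(last_left, first_right):
    """Number of attributes a separator between two adjacent tuples must keep."""
    left_key, _ = last_left
    right_key, _ = first_right
    for n, (left_attr, right_attr) in enumerate(zip(left_key, right_key), start=1):
        if left_attr != right_attr:
            return n
    return len(right_key) + 1  # all user attributes equal: heap TID needed too

def choose_split(page, window=2):
    """Pick index i so that page[:i] goes left and page[i:] goes right.

    Candidates lie within `window` positions of the middle; ties on
    separator size are broken by distance from the middle.
    """
    mid = len(page) // 2
    lo = max(1, mid - window)
    hi = min(len(page) - 1, mid + window)
    return min(
        range(lo, hi + 1),
        key=lambda i: (attrs_needed(page[i - 1], page[i]), abs(i - mid)),
    )

# Toy page of (key_tuple, tid) pairs, already in index order.
page = [
    (("a", 1), (1, 1)), (("a", 2), (1, 2)), (("a", 3), (1, 3)),
    (("b", 1), (1, 4)), (("b", 2), (1, 5)), (("b", 3), (1, 6)),
    (("b", 4), (1, 7)), (("b", 5), (1, 8)),
]
split_at = choose_split(page)
```

Here the split lands between the last "a" tuple and the first "b" tuple, where a one-attribute separator suffices, instead of exactly in the middle.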

Can I really fix space utilization in a piecemeal fashion? I strongly doubt it.

-- 
Peter Geoghegan
