Re: Feature request for adoptive indexes

Поиск
Список
Период
Сортировка
От Robert Haas
Тема Re: Feature request for adoptive indexes
Дата
Msg-id CA+Tgmoas=YhvX7uSNnSt27wxZwGD2RfZd7hNLZ166YeFzO7MBA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Feature request for adoptive indexes  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Ответы Re: Feature request for adoptive indexes  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Список pgsql-hackers
On Tue, Oct 26, 2021 at 11:11 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
> If I get what you propose, you want to have a "top" tree for (job, nlp,
> year), which "splits" the data set into subsets of ~5000-7000 rows. And
> then for each subset you want a separate "small" trees on each of the
> other columns, so in this case three trees.
>
> Well, the problem with this is pretty obvious - each of the small trees
> requires separate copies of the leaf pages. And remember, in a btree the
> internal pages are usually less than 1% of the index, so this pretty
> much triples the size of the index. And if you insert a row into the
> index, it has to insert the item pointer into each of the small trees,
> likely requiring a separate I/O for each.
>
> So I'd bet this is not any different from just having three separate
> indexes - it doesn't save space, doesn't save I/O, nothing.

I agree. In a lot of cases, it's actually useful to define the index
on fewer columns, like (job, nlp, year) or even just (job, nlp) or
even just (job) because it makes the index so much smaller and that's
pretty important. If you have enough duplicate entries in a (job, nlp,
year) index to justify create a (job, nlp, year, sequence) index, the
number of distinct (job, nlp, year) tuples has to be small compared to
the number of (job, nlp, year, sequence) tuples - and that means that
you wouldn't actually save much by combining your (job, nlp, year,
sequence) index with a (job, nlp, year, other-stuff) index. As you
say, the internal pages aren't the problem.

I don't intend to say that there's no possible use for this kind of
technology. Peter G. says that people are writing research papers
about that and they probably wouldn't be doing that unless they'd
found some case where it's a big win. But this example seems extremely
unconvincing.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Stephen Frost
Дата:
Сообщение: Re: Delegating superuser tasks to new security roles (Was: Granting control of SUSET gucs to non-superusers)
Следующее
От: Jesper Pedersen
Дата:
Сообщение: Re: Portability report: ninja