Re: SELECT DISTINCT chooses parallel seqscan instead of indexscan on huge table with 1000 partitions

From David Rowley
Subject Re: SELECT DISTINCT chooses parallel seqscan instead of indexscan on huge table with 1000 partitions
Msg-id CAApHDvrtTKfh7HgAyXBd3KN0s-jxiHzW7sWdm-sFEjP6fGPCkg@mail.gmail.com
In response to Re: SELECT DISTINCT chooses parallel seqscan instead of indexscan on huge table with 1000 partitions  (Dimitrios Apostolou <jimis@gmx.net>)
Responses Re: SELECT DISTINCT chooses parallel seqscan instead of indexscan on huge table with 1000 partitions
List pgsql-general
On Sat, 11 May 2024 at 13:11, Dimitrios Apostolou <jimis@gmx.net> wrote:
> Indeed that's an awful estimate, the table has more than 1M of unique
> values in that column. Looking into pg_stat_user_tables, I can't see the
> partitions having been vacuum'd or analyzed at all. I think they should
> have been auto-analyzed, since they get a ton of INSERTs
> (no deletes/updates though) and I have the default autovacuum settings.
> Could it be that autovacuum starts, but never
> finishes? I can't find something in the logs.

It's not the partitions getting analyzed you need to worry about for
an ndistinct estimate on the partitioned table. It's auto-analyze or
ANALYZE on the partitioned table itself that you should care about.

If you look at [1], it says "Tuples changed in partitions and
inheritance children do not trigger analyze on the parent table."
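
As a minimal sketch of what that means in practice, assuming a hypothetical partitioned table named measurements with a column sensor_id (neither name comes from this thread), you could analyze the parent by hand and then check which ndistinct estimate the planner ends up with:

```sql
-- Hypothetical names: "measurements" is the partitioned (parent) table,
-- "sensor_id" is the column used in SELECT DISTINCT.
ANALYZE measurements;

-- Statistics covering the partitioned table as a whole are stored with
-- inherited = true. A negative n_distinct is a fraction of the row count,
-- not an absolute number of distinct values.
SELECT tablename, attname, inherited, n_distinct
FROM pg_stats
WHERE tablename = 'measurements'
  AND attname = 'sensor_id';
```

Since changes in the partitions won't trigger this for you, it would have to be run manually or on a schedule. Expect it to take a while on a table with 1000 partitions.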

> In any case, even after the planner decides to execute the terrible plan
> with the parallel seqscans, why doesn't it finish right when it finds 10
> distinct values?

It will. It's just that a Sort has to fetch everything from its subnode before it can return its first row.
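
To see which case a given query falls into, something like the following would show the plan shape. The table and column names are placeholders, not taken from your schema:

```sql
-- Hypothetical names; substitute the real partitioned table and column.
EXPLAIN
SELECT DISTINCT sensor_id
FROM measurements
LIMIT 10;
```

If a Sort sits below the Unique and Limit nodes, the Limit can only cut things short once the Sort has read all of its input. If an ordered index scan feeds Unique directly, there is no Sort and the scan can stop as soon as 10 distinct values have been produced.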

David

[1] https://www.postgresql.org/docs/16/routine-vacuuming.html#VACUUM-FOR-STATISTICS


