Re: postgres chooses objectively wrong index

Поиск

Список

Период

Сортировка

От	Alexey Ermakov
Тема	Re: postgres chooses objectively wrong index
Дата	18 марта 08:27:26
Msg-id	381e7d47-8021-458d-9a64-0cdb80b004f0@gmail.com обсуждение
Ответ на	Re: postgres chooses objectively wrong index (Merlin Moncure <mmoncure@gmail.com>)
Ответы	Re: postgres chooses objectively wrong index
Список	pgsql-performance

Дерево обсуждения

On 2026-03-18 04:52, Merlin Moncure wrote:

On Tue, Mar 17, 2026 at 4:16 PM Alexey Ermakov <alexius.work@gmail.com> wrote:
On 2026-03-18 03:01, Merlin Moncure wrote:
I've been maintaining an airflow style orchestrator in pl/pgsql, and it's revealed a performance issue I just can't solve. There is a table, task, which may normally contain billions of rows, but only a tiny portion is interesting for specific reasons—a common pattern in task-type systems.

...

I'm wondering if there are other tricks that might apply here, for example, multi column index statistics...curious if anyone has thoughts on that.

Any suggestions?

merlin

Hello. I think planner doesn't have information about distribution of async.task_execution_state(task) unless it's part of any full index. I would try to give that with extended statistics (postgresql 14+):
create statistics (mcv) task_task_execution_state_stat on ((async.task_execution_state(task))) from async.task;
analyze async.task;
If that won't help - please show distribution from pg_stats_ext view for extended statistic above.
This unfortunately fails, probably because the table type includes system columns (despite not using them).

orchestrator_service_user@orchestrator=> create statistics task_stats (mcv) on (async.task_execution_state(task)) from async.task;
ERROR: statistics creation on system columns is not supported

This would require some refactoring to fix.

Interesting... In that case functional index should help (as it also makes statistic for the planner):

create index concurrently on task_task_execution_state_idx async.task using btree ((async.task_execution_state(task)));

analyze async.task;

Perhaps multicolumn index will also help for queries but hard to say without knowing distributions. We could check state distribution info after index creation and analyze with query like this:

select * from pg_stats where tablename = 'task_task_execution_state_idx' \gx

Alexey Ermakov

В списке pgsql-performance по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: postgres chooses objectively wrong index