Re: BUG #16280: dead tuples (probably) effect plan and query performance

From: Serbin, Ilya
Subject: Re: BUG #16280: dead tuples (probably) effect plan and query performance
Date:
Msg-id: CALTXVijBNY+yqvDfRCmL_2ZAVmJN_hEvJvWLJ2mxJiFoTD=oZA@mail.gmail.com
In reply to: Re: BUG #16280: dead tuples (probably) effect plan and query performance  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: BUG #16280: dead tuples (probably) effect plan and query performance  ("Serbin, Ilya" <iserbin@bostonsd.ru>)
List: pgsql-bugs
Hello, Tom!
Thanks for your answer. My concern is that the plan changes after a relatively small number of dead tuples. The bad plan is generated once table1 contains 300-400 dead tuples, which is only 0.07%-0.08% of the whole table (409k+ entries).
In addition, the table is growing and currently holds 425k entries. However, even at 425k entries the plan stays good until the number of dead tuples reaches 300-400; after that the plan changes to the bad one.
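For context, these counts come from pg_stat_user_tables; a minimal check, with the table name assumed, looks like this:

    -- live vs. dead tuple counts and last vacuum times (table name assumed)
    SELECT n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
    FROM pg_stat_user_tables
    WHERE relname = 'table1';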
As I said, I tried analyzing the table with various default_statistics_target values (100-1000 in steps of 100) - the plan stays bad. I also tried setting random_page_cost=0.1 and seq_page_cost=1 (2, 3, 4, etc.). The plan changed to the good one only starting from random_page_cost=0.1 and seq_page_cost=8. However, once I ran vacuum, the plan changed to the good one and stayed good even when I set random_page_cost=30 and seq_page_cost=1.
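The experiments were along these lines (a sketch; I cycled through the exact values described above):

    -- session-level experiments (sketch; values illustrative)
    SET default_statistics_target = 1000;  -- tried 100 through 1000 in steps of 100
    ANALYZE table1;
    SET random_page_cost = 0.1;
    SET seq_page_cost = 8;                 -- the first combination that produced the good plan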
I realize that I could set per-table autovacuum thresholds to trigger it once dead tuples reach 300, but that doesn't seem right to me (that many changes accumulate in something like 5 minutes, and the table is ~2GB in size as of now).
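What I mean is a per-table override along these lines (a sketch, with the threshold taken from the numbers above):

    -- make autovacuum fire on table1 after roughly 300 dead tuples (sketch)
    ALTER TABLE table1 SET (
        autovacuum_vacuum_threshold = 300,
        autovacuum_vacuum_scale_factor = 0
    );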

Why does such a small share (0.07%) of dead tuples change the cost estimates so dramatically? Or am I missing something, and dead tuples have nothing to do with it?

Thanks in advance,
Ilya

On Thu, Feb 27, 2020 at 18:41, Tom Lane <tgl@sss.pgh.pa.us> wrote:
PG Bug reporting form <noreply@postgresql.org> writes:
> Issue description:
> After some minimal activity in the database, the first plan changes to the second.
> Analyze on table1 does not help (tried with various default_statistics_target
> values).
> content_idx index recreation helps for some time, but several minutes later
> the plan degrades back to the second one.
> The only thing that helped (surprisingly) is vacuum. It also helps for some time,
> but once the number of dead tuples reaches something like 300-500, the plan
> changes back to the second one.

Surely it is *not* a bug that dead tuples affect the plan choice.
The density of live tuples is an important factor in the relative
costs of different table scan techniques.

In the case at hand, I wonder why your rowcount estimate is off
by a factor of 50:

->  Bitmap Index Scan on content_idx  (cost=0.00..155.07 rows=409 width=0) (actual time=4.932..4.932 rows=21952 loops=1)
       Index Cond: (content @> '{"anotherjsonkey": {"values": ["13"]}}'::jsonb)
       Buffers: shared hit=48

If you can't improve that you're not likely to get a good plan, and
futzing around with cost factors to make this particular query do
"the right thing" anyway is inevitably going to make things worse
for other queries.  Maybe a larger stats target for the content column
would help, but I fear that this @> condition is just beyond
the ability of the planner to estimate.  You might need to redesign
the data representation to make it a bit more SQL-friendly.

                        regards, tom lane
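
Tom's suggestion of a larger stats target for the content column would translate into something like this (a sketch; the table and column names are assumed from the plan above):

    -- raise the per-column statistics target, then refresh the statistics (sketch)
    ALTER TABLE table1 ALTER COLUMN content SET STATISTICS 1000;
    ANALYZE table1;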


--
Ilya Serbin | Senior Database Administrator
iserbin@bostonsd.ru
Mobile: 8-918-895-05-96 | Telegram: @iserbin_61
DBI 24/7 contact: 8-800-333-65-79 | Website: http://dbi.ru/
