Re: BUG #16280: dead tuples (probably) effect plan and query performance

From: Serbin, Ilya
Subject: Re: BUG #16280: dead tuples (probably) effect plan and query performance
Date:
Msg-id: CALTXVijBNY+yqvDfRCmL_2ZAVmJN_hEvJvWLJ2mxJiFoTD=oZA@mail.gmail.com
In reply to: Re: BUG #16280: dead tuples (probably) effect plan and query performance  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: BUG #16280: dead tuples (probably) effect plan and query performance  ("Serbin, Ilya" <iserbin@bostonsd.ru>)
List: pgsql-bugs
Hello, Tom!
Thanks for your answer. My concern is that the plan changes after a relatively small number of dead tuples. The bad plan is generated once table1 contains 300-400 dead tuples, which is only 0.07%-0.08% of the whole table (409k+ entries).
In addition, the table is growing and currently holds 425k entries. However, even at 425k entries the plan stays good until the number of dead tuples reaches 300-400; after that the plan changes to the bad one.
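For context, these counts come from pg_stat_user_tables; a minimal check, with the table name assumed, looks like this:

    -- live vs. dead tuple counts and last vacuum times (table name assumed)
    SELECT n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
    FROM pg_stat_user_tables
    WHERE relname = 'table1';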
As I said, I tried analyzing the table with various default_statistics_target values (100-1000 in steps of 100) - the plan stays bad. I also tried setting random_page_cost=0.1 and seq_page_cost=1 (2, 3, 4, etc.). The plan changed to the good one only starting from random_page_cost=0.1 and seq_page_cost=8. However, once I ran vacuum, the plan changed to the good one and stayed good even when I set random_page_cost=30 and seq_page_cost=1.
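The experiments were along these lines (a sketch; I cycled through the exact values described above):

    -- session-level experiments (sketch; values illustrative)
    SET default_statistics_target = 1000;  -- tried 100 through 1000 in steps of 100
    ANALYZE table1;
    SET random_page_cost = 0.1;
    SET seq_page_cost = 8;                 -- the first combination that produced the good plan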
I realize that I could set per-table autovacuum thresholds to trigger it once dead tuples reach 300, but that doesn't seem right to me (that many changes accumulate in something like 5 minutes, and the table is ~2GB in size as of now).
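What I mean is a per-table override along these lines (a sketch, with the threshold taken from the numbers above):

    -- make autovacuum fire on table1 after roughly 300 dead tuples (sketch)
    ALTER TABLE table1 SET (
        autovacuum_vacuum_threshold = 300,
        autovacuum_vacuum_scale_factor = 0
    );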

Why does such a small share (0.07%) of dead tuples change the cost estimates so dramatically? Or am I missing something, and dead tuples have nothing to do with it?

Thanks in advance,
Ilya

On Thu, Feb 27, 2020 at 18:41, Tom Lane <tgl@sss.pgh.pa.us> wrote:
PG Bug reporting form <noreply@postgresql.org> writes:
> Issue description:
> After some minimal activity in the database, the first plan changes to the second.
> Analyze on table1 does not help (tried with various default_statistics_target
> values).
> content_idx index recreation helps for some time, but several minutes later
> the plan degrades back to the second one.
> The only thing that helped (surprisingly) is vacuum. It also helps for some time,
> but once the number of dead tuples reaches something like 300-500, the plan
> changes back to the second one.

Surely it is *not* a bug that dead tuples affect the plan choice.
The density of live tuples is an important factor in the relative
costs of different table scan techniques.

In the case at hand, I wonder why your rowcount estimate is off
by a factor of 50:

->  Bitmap Index Scan on content_idx  (cost=0.00..155.07 rows=409 width=0) (actual time=4.932..4.932 rows=21952 loops=1)
       Index Cond: (content @> '{"anotherjsonkey": {"values": ["13"]}}'::jsonb)
       Buffers: shared hit=48

If you can't improve that you're not likely to get a good plan, and
futzing around with cost factors to make this particular query do
"the right thing" anyway is inevitably going to make things worse
for other queries.  Maybe a larger stats target for the content column
would help, but I fear that this @> condition is just beyond
the ability of the planner to estimate.  You might need to redesign
the data representation to make it a bit more SQL-friendly.

                        regards, tom lane
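
Tom's suggestion of a larger stats target for the content column would translate into something like this (a sketch; the table and column names are assumed from the plan above):

    -- raise the per-column statistics target, then refresh the statistics (sketch)
    ALTER TABLE table1 ALTER COLUMN content SET STATISTICS 1000;
    ANALYZE table1;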


--
Ilya Serbin | Senior Database Administrator
iserbin@bostonsd.ru
Mobile: 8-918-895-05-96 | Telegram: @iserbin_61
DBI 24/7 contact: 8-800-333-65-79 | Website: http://dbi.ru/
