Re: BUG #16280: dead tuples (probably) effect plan and query performance

Поиск
Список
Период
Сортировка
От James Coleman
Тема Re: BUG #16280: dead tuples (probably) effect plan and query performance
Дата
Msg-id CAAaqYe96jM14OuvjYh2bb2EyR=bCjWGwB4sVEFFM_sVMhHfB6g@mail.gmail.com
обсуждение исходный текст
Ответ на Re: BUG #16280: dead tuples (probably) effect plan and query performance  ("Serbin, Ilya" <iserbin@bostonsd.ru>)
Ответы Re: BUG #16280: dead tuples (probably) effect plan and query performance  ("Serbin, Ilya" <iserbin@bostonsd.ru>)
Список pgsql-bugs
On Wed, Mar 11, 2020 at 6:51 AM Serbin, Ilya <iserbin@bostonsd.ru> wrote:
>
> Hello,
> Asking again just in case someone can help to find out why is that happening in my case and if it is a bug.
>
> пт, 28 февр. 2020 г. в 12:25, Serbin, Ilya <iserbin@bostonsd.ru>:
>>
>> Hello, Tom!
>> Thanks for your answer. My concern is that plan changes after a relatively small number of dead tuples. Bad plan is
beinggenerated when table1 contain 300-400 dead tuples. It is only 0.07%-0.08% of the whole table (409k+ entries). 
>> In addition, table is growing and currently there are 425k of entries. However even on 425k size table plan stays
gooduntil number of dead tuples reaches 300-400, аfter that plan changes to the bad one. 
>> As I said, I tried analyzing table with various default_statistics_target (100-1000 with step of 100) - plan stays
bad.Tried setting random_page_cost=0.1 and seq_page_cost=1 (2, 3, 4, etc). Plan changed to good one only starting from
random_page_cost=0.1 and  seq_page_cost=8. However, once I ran vacuum - plan changed to good one and stayed the same
evenwhen I set random_page_cost=30;set seq_page_cost=1; 
>> I realize that I can set autovacuum thresholds for this table to trigger it once dead tuples reach 300, but it
doesn'tseem right to me (this number of changes happens in something like 5 minutes and tables is ~2GB size as of now). 
>>
>> Why does such a small amount (0.07%) of dead tuples changes cost estimations so dramatically? Or am I missing
somethingand dead tuples has nothing to do with it? 

That looks like an almost impossible query to plan correctly -- the
planner would need some kind of stats on fields internal to the json
blob, but it can't have that since the blob is necessarily schema less
(at least from PG's perspective). That's why Tom is wondering about
making the representation map more closely to a SQL/relational table
model with discrete columns.

Alternatively have you tried adding function indexes on the (it looks
like) ~2 paths you're querying in the JSON field?

It's not impossible there's a bug in the planner here, but the
opaqueness of this query to the planner is why that's not anyone's
first assumption. To show otherwise someone would have to step through
the execution and show that it's not just reaching some cutoff (that
makes somewhat reasonable sense for how hard this query is to plan) or
that the cutoff is being calculated incorrectly.

James



В списке pgsql-bugs по дате отправления:

Предыдущее
От: "Serbin, Ilya"
Дата:
Сообщение: Re: BUG #16280: dead tuples (probably) effect plan and query performance
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Optimizer Doesn't Push Down Where Expressions on Rollups