Re: a wrong index choose when statistics is out of date

Поиск
Список
Период
Сортировка
От Andy Fan
Тема Re: a wrong index choose when statistics is out of date
Дата
Msg-id 87msra9vgo.fsf@163.com
обсуждение исходный текст
Ответ на Re: a wrong index choose when statistics is out of date  (David Rowley <dgrowleyml@gmail.com>)
Ответы Re: a wrong index choose when statistics is out of date  (David Rowley <dgrowleyml@gmail.com>)
Список pgsql-hackers
David Rowley <dgrowleyml@gmail.com> writes:

> On Wed, 6 Mar 2024 at 02:09, Andy Fan <zhihuifan1213@163.com> wrote:
>>     This patch introduces a new attoptions like this:
>>
>>     ALTER TABLE t ALTER COLUMN col set (force_generic=true);
>>
>>     Then selfunc.c realizes this and ignore the special Const value, then
>>     average selectivity is chosen. This fall into the weakness of generic
>>     plan, but this patch doesn't introduce any new weakness and we leave the
>>     decision to user which could resolve some problem. Also this logic only
>>     apply to eqsel since the ineqsel have the get_actual_variable_range
>>     mechanism which is helpful for index choose case at least.
>
> If you don't want the planner to use the statistics for the column why
> not just do the following?

Acutally I didn't want the planner to ignore the statistics totally, I
want the planner to treat the "Const" which probably miss optimizer part
average, which is just like what we did for generic plan for the blow
query.  

prepare s as SELECT * FROM t WHERE a = $1 and b = $2;
explain (costs off) execute s(109, 8);
           QUERY PLAN            
---------------------------------
 Index Scan using t_a_c_idx on t
   Index Cond: (a = 109)
   Filter: (b = 8) 

(3 rows)

custom plan, Wrong index due to we have a bad estimation for a = 109.


set plan_cache_mode to force_generic_plan ;
explain (costs off) execute s(109, 8);
              QUERY PLAN               
---------------------------------------
 Index Scan using t_a_b_idx on t
   Index Cond: ((a = $1) AND (b = $2))   -- Correct index.
(2 rows)

Generic plan - we use the average estimation for the missed optimizer
statistics part and *if the new value is not so different from existing
ones*, we can get a disired result. 

It is true that the "generic" way is not as exactly accurate as the
"custom" way since the later one can use the data in MCV, but that is
the cost we have to pay to make the missed optimizer statistics less
imporant and generic plan has the same issue as well. As for this
aspect, I think the way you proposed probably have a wider use case.

-- 
Best Regards
Andy Fan




В списке pgsql-hackers по дате отправления:

Предыдущее
От: Jelte Fennema-Nio
Дата:
Сообщение: Re: [EXTERNAL] Re: Add non-blocking version of PQcancel
Следующее
От: Dean Rasheed
Дата:
Сообщение: Re: bug report: some issues about pg_15_stable(8fa4a1ac61189efffb8b851ee77e1bc87360c445)