Re: Planning large IN lists

Поиск
Список
Период
Сортировка
От Atul Deopujari
Тема Re: Planning large IN lists
Дата
Msg-id 464CA680.7040904@enterprisedb.com
обсуждение исходный текст
Ответ на Re: Planning large IN lists  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: Planning large IN lists  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
Hi,

Tom Lane wrote:
> "Atul Deopujari" <atul.deopujari@enterprisedb.com> writes:
>> Hi,
>> Tom Lane wrote:
>>> That's the least of the problems.  We really ought to convert such cases
>>> into an IN (VALUES(...)) type of query, since often repeated indexscans
>>> aren't the best implementation.
>>>
>> I thought of giving this a shot and while I was working on it, it 
>> occurred to me that we need to decide on a threshold value of the IN 
>> list size above which such transformation should take place.
> 
> I see no good reason to suppose that there is/should be a constant
> threshold --- most likely it depends on size of table, availability of
> indexes, etc.  Having the planner try it both ways and compare costs
> would be best.
> 
Yes, letting the planner make its own decision would seem best (in 
accordance with what we do for different join paths). But for large IN 
lists, a substantial part of the planner is spent in estimating the 
selectivity of the ScalarArrayExpr by calling scalararraysel. If we are 
not eliminating this step in processing the IN list then we are not 
doing any optimization. Asking the planner to do scalararraysel and also 
compute cost of any other way and choose between the two is asking 
planner to do more work.

Factors such as size of table, availability of index etc. would affect 
both the ways similarly. So, if we see a gain in the execution of the IN 
list due to an external factor then we will also see a similar gain in 
the execution of the transformed IN (VALUES(...)) clause.

I agree that one value would not fit all cases. The problem with this 
approach is that for some cases, large IN list would perform better than 
the transformed IN (VALUES(...)) clause. But we know that the 
transformed IN (VALUES(...)) clause has almost a steady state behavior 
and it would not blow off the planner estimates. The error would be just 
marginal.

--
Atul

EnterpriseDB
www.enterprisedb.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: "Joshua D. Drake"
Дата:
Сообщение: Re: Patch queue triage
Следующее
От: "Joshua D. Drake"
Дата:
Сообщение: Async commands (like drop index)