Re: Cost estimates for parameterized paths

From	Robert Haas
Subject	Re: Cost estimates for parameterized paths
Date	10 November 2011, 01:06:13
Msg-id	CA+Tgmob+ba4BEbV1yCH2yrQpdpj9=jTf3nP4jMrgy+Q9AFt7ow@mail.gmail.com
In reply to	Re: Cost estimates for parameterized paths (Tom Lane <tgl@sss.pgh.pa.us>)
List	pgsql-hackers

On Wed, Nov 9, 2011 at 5:12 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> More than a year ago, I wrote in
> http://archives.postgresql.org/message-id/14624.1283463072@sss.pgh.pa.us
>
>> Awhile back I ranted about replacing the planner's concept of inner
>> indexscans with a more generalized notion of "parameterized paths":
>> http://archives.postgresql.org/pgsql-hackers/2009-10/msg00994.php
>
>> The executor fixes for that are done, and now I'm grappling with getting
>> the planner to do something useful with it.  The biggest problem I've run
>> into is that a parameterized path can't really be assigned a fixed cost
>> in the same way that a normal path can.  The current implementation of
>> cost_index() depends on knowing the size of the outer relation --- that
>> is, the expected number of execution loops for the indexscan --- in order
>> to account for cache effects sanely while estimating the average cost of
>> any one inner indexscan.
>
> Since this project has been stalled for so long, I am thinking that what
> I need to do to make some progress is to punt on the repeated-execution
> cache effects problem, at least for the first cut.  I propose costing
> parameterized inner paths on the "worst case" basis that they're only
> executed once, and don't get any benefit from caching across repeated
> executions.  This seems like a reasonably sane first-order approximation
> on two grounds:
>
> 1. In most of the cases where such a plan is of interest, the outer
> relation for the nestloop actually does provide only one or a few rows.
> If it generates a lot of rows, you probably don't want a nestloop
> anyhow.
>
> 2. In the cases where we really care, the alternatives are so much worse
> that the parameterized nestloop will win even if it's estimated very
> conservatively.

I agree, on all counts.  Errors that make new planner possibilities
look unduly expensive aren't as serious as those that go the other
way, because the alternative is that you can never generate the new
plan at all.  That's why I'm sweating the costing of index-only
scans a bit.
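
To make the "costed as if executed once" idea concrete, here is a
rough sketch, assuming the Mackert-Lohman page-fetch approximation of
the kind the planner's index cost code relies on.  The function and
parameter names are hypothetical, and cache_pages stands in for
however much cache the planner assumes is available; this is not the
actual cost_index() code.

```python
def pages_fetched(tuples_fetched, table_pages, cache_pages):
    """Mackert-Lohman estimate of page fetches for tuples_fetched
    tuples drawn from a table of table_pages pages, with room for
    cache_pages pages in cache (all names are illustrative)."""
    T = max(float(table_pages), 1.0)
    b = max(float(cache_pages), 1.0)
    Ns = float(tuples_fetched)
    if Ns <= 0:
        return 0.0
    if T <= b:
        # Whole table fits in cache: never more than T fetches.
        return min(2 * T * Ns / (2 * T + Ns), T)
    lim = 2 * T * b / (2 * T - b)
    if Ns <= lim:
        return 2 * T * Ns / (2 * T + Ns)
    # Cache is full; further fetches miss with probability (T - b) / T.
    return b + (Ns - lim) * (T - b) / T


def per_scan_pages(tuples_per_scan, loop_count, table_pages, cache_pages):
    """Average page fetches per inner scan when the scan is repeated
    loop_count times and the repetitions share the cache."""
    total = pages_fetched(tuples_per_scan * loop_count,
                          table_pages, cache_pages)
    return total / loop_count


if __name__ == "__main__":
    for loops in (1, 10, 100, 1000):
        print(loops, per_scan_pages(10, loops, 1000, 10000))
```

Because the estimate grows more slowly than linearly in the number of
tuples fetched, the per-scan figure with loop_count = 1 is never lower
than with a larger loop count, so assuming a single execution can only
overstate the cost of the parameterized path.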

> Another thing in the back of my mind is that the whole issue of cache
> effects is something we know we don't model very well, so putting large
> amounts of time into enlarging the present approach to handle more
> complicated plan structures may be misplaced effort anyway.

True.  And I think that might not even be the highest priority project
to tackle anyway.  The things that are hurting people most routinely
and hardest to fix with existing tools seem to be things like
cross-column correlation, and other selectivity estimation errors.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
