Re: Common Table Expressions applied; some issues remain

Поиск
Список
Период
Сортировка
От Robert Haas
Тема Re: Common Table Expressions applied; some issues remain
Дата
Msg-id 603c8f070810062118u5fad359dlbb1bfc2d129ace2a@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Common Table Expressions applied; some issues remain  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
> If the planner chooses to do this as a nestloop with table2 on the
> inside, then expensive_function() can get evaluated multiple times on
> the same row of table2.  We really don't make very many guarantees about
> what will happen with functions inside inlined views, even with "offset
> 0" as an optimization fence.  So I was thinking that taking a strong
> reading of the spec's wording about single evaluation of WITH clauses
> might provide useful leverage for people who need to control evaluation
> of expensive or volatile functions better than they can now.

+1 for a strong reading. I think the ability to prevent multiple
evaluations of expensive functions is key here.

> Another possibility that we could think about is: if a CTE is only
> referenced once, then push down any restriction clauses that are
> available at the single call site, but still execute it using the
> CteScan materialization logic.  The evaluation guarantee would then
> look like "no row of the CTE's result is evaluated twice, but some rows
> might not be evaluated at all".

Assuming a perfectly intelligent optimizer, the only advantage of the
=1 guarantee over the <=1 guarantee is that you can evaluate the
entire CTE for side-effects and then fetch back only a subset of the
data to return to the user.  This seems likely to be a pretty rare use
case, though, and the rest of the time you'd presumably prefer for
performance reasons to have as little of the CTE as possible
executed.... so +1 for <=1.

> What we'd pay for this is that the CTE
> could not be the inside of a nestloop with inner indexscan using a join
> condition, since we don't have any way to keep track of which rows were
> already fetched in that case.

Is it not possible to consider both plans?  That is, compare the cost
of evaluating every row and then doing a nestloop with inner indexscan
versus using some other plan and evaluating only the rows meeting the
quals?

As a side note, in theory, I think you could generalize this to CTEs
with multiple call sites by taking the logical OR of the available
quals.  This might not be worth it, though unless the quals are highly
selective.

...Robert


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Common Table Expressions applied; some issues remain
Следующее
От: Greg Smith
Дата:
Сообщение: Re: Shouldn't pg_settings.enumvals be array of text?