Re: CTE vs Subquery

Поиск
Список
Период
Сортировка
От Merlin Moncure
Тема Re: CTE vs Subquery
Дата
Msg-id CAHyXU0z7qj=6G-B6aJ+-D0+kQtV7rBg3-J5=iFuhQEg_0WmoEA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: CTE vs Subquery  (Linos <info@linos.es>)
Ответы Re: CTE vs Subquery  (Linos <info@linos.es>)
Список pgsql-performance
On Wed, Oct 26, 2011 at 4:00 AM, Linos <info@linos.es> wrote:
> El 25/10/11 19:11, Merlin Moncure escribió:
>> On Tue, Oct 25, 2011 at 11:47 AM, Linos <info@linos.es> wrote:
>>> El 25/10/11 18:43, Tom Lane escribió:
>>>> Linos <info@linos.es> writes:
>>>>>     i am having any problems with performance of queries that uses CTE, can the
>>>>> join on a CTE use the index of the original table?
>>>>
>>>> CTEs act as optimization fences.  This is a feature, not a bug.  Use
>>>> them when you want to isolate the evaluation of a subquery.
>>>>
>>>>                       regards, tom lane
>>>>
>>>
>>> The truth it is that complex queries seems more readable using them (maybe a
>>> personal preference no doubt).
>>>
>>> Do have other popular databases the same behavior? SQL Server or Oracle for example?
>>
>> In my experience, SQL server also materializes them -- basically CTE
>> is short hand for 'CREATE TEMP TABLE foo AS SELECT...' then joining to
>> foo.  If you want join behavior, use a join (by the way IIRC SQL
>> Server is a lot more restrictive about placement of ORDER BY).
>>
>> I like CTE current behavior -- the main place I find it awkward is in
>> use of recursive queries because the CTE fence forces me to abstract
>> the recursion behind a function, not a view since pushing the view
>> qual down into the CTE is pretty horrible:
>>
>> postgres=# explain select foo.id, (with bar as (select id from foo f
>> where f.id = foo.id) select * from bar) from foo where foo.id = 11;
>>                                      QUERY PLAN
>> -------------------------------------------------------------------------------------
>>  Index Scan using foo_idx on foo  (cost=0.00..16.57 rows=1 width=4)
>>    Index Cond: (id = 11)
>>    SubPlan 2
>>      ->  CTE Scan on bar  (cost=8.28..8.30 rows=1 width=4)
>>            CTE bar
>>              ->  Index Scan using foo_idx on foo f  (cost=0.00..8.28
>> rows=1 width=4)
>>                    Index Cond: (id = $0)
>> (7 rows)
>>
>> whereas for function you can inject your qual inside the CTE pretty
>> easily.  this is a different problem than the one you're describing
>> though.  for the most part, CTE execution fence is a very good thing,
>> since it enforces restrictions that other features can leverage, for
>> example 'data modifying with' queries (by far my all time favorite
>> postgres enhancement).
>>
>> merlin
>>
>
> ok, i get the idea, but i still don't understand what Tom says about isolate
> evaluation, apart from the performance and the readability, if i am not using
> writable CTE or recursive CTE, what it is the difference in evaluation (about
> being isolate) of a subquery vs CTE with the same text inside.
>
> I have been using this form lately:
>
> WITH inv (SELECT item_id,
>                               SUM(units) AS units
>                 FROM invoices),
>
> quo AS (SELECT item_id,
>                             SUM(units) AS units
>              FROM quotes)
>
> SELECT items.item_id,
>              CASE WHEN inv.units IS NOT NULL THEN inv.units ELSE 0 END AS
> units_invoices,
>              CASE WHEN quo.units IS NOT NULL THEN quo.units ELSE 0 END AS
> units_quotes
>
> FROM items
>    LEFT JOIN inv ON inv.item_id = items.item_id
>    LEFT JOIN quo ON quo.item_id = items.item_id
>
> Well this is oversimplified because i use much more tables and filter based on
> dates, but you get the idea, it seems that this type of query should use
> subqueries, no?

Think about a query like this:
with foo as
(
  select id, volatile_func() from bar
)
select * from baz join foo using (id) join bla using(id) limit 10;

How many times does volatile_func() get called?  How many times in the
JOIN version?  The answers are different...

One of the key features of CTEs is controlling how/when query
operations occur so you can do things like control side effects and
force query plans that the server would not otherwise choose (although
this is typically an unoptimization).

merlin

В списке pgsql-performance по дате отправления:

Предыдущее
От: "Jens Reufsteck"
Дата:
Сообщение: Anti join miscalculates row number?
Следующее
От: "Cezariusz Marek"
Дата:
Сообщение: Slow cursor