Обсуждение: Poor performance using CTE

Поиск
Список
Период
Сортировка

Poor performance using CTE

От
David Greco
Дата:

Have a query using a CTE that is performing very poorly. The equivalent query against the same data in an Oracle database runs in under 1 second, in Postgres  it takes 2000 seconds.

 

The smp_pkg.get_invoice_charges queries fedexinvoices for some data and normalizes it into a SETOF some record type. It is declared to be STABLE. Fedexinvoices consists of about 1.3M rows of medium width. Fedexinvoices.id is the primary key on that table, and trim(fedexinvoices.trackno) is indexed via the function trim.

 

The plan for the equivalent query in Oracle is much smaller and simpler. No sequential (or full table) scans on fedexinvoices.

 

 

 

WITH charges as (

                SELECT fi2.id, smp_pkg.get_invoice_charges(fi2.id) charge_info from fedexinvoices fi2

)

select fedexinvoices.* from

fedexinvoices

inner join charges on charges.id = fedexinvoices.id AND (charges.charge_info).charge_name IN ('ADDRESS CORRECTION CHARGE','ADDRESS CORRECTION')

where

trim(fedexinvoices.trackno)='799159791643'

;

 

Explain Analyze output, I abbreviated some of the column lists for brevity:

 

Nested Loop  (cost=457380.38..487940.77 rows=1 width=1024) (actual time=1978019.858..1978019.858 rows=0 loops=1)

  Output: fedexinvoices.id, ………

  Join Filter: (fedexinvoices.id = charges.id)

  Buffers: shared hit=20387611, temp written=94071

  CTE charges

    ->  Seq Scan on hits.fedexinvoices fi2  (cost=0.00..457380.38 rows=1350513 width=8) (actual time=0.613..1964632.763 rows=9007863 loops=1)

          Output: fi2.id, smp_pkg.get_invoice_charges(fi2.id, NULL::character varying)

          Buffers: shared hit=20387606

  ->  Index Scan using fedexinvoices_trim_track_idx on hits.fedexinvoices  (cost=0.00..5.46 rows=1 width=1024) (actual time=0.024..0.026 rows=1 loops=1)

        Output: fedexinvoices.id, ………

        Index Cond: (btrim((fedexinvoices.trackno)::text) = '799159791643'::text)

        Buffers: shared hit=5

  ->  CTE Scan on charges  (cost=0.00..30386.54 rows=13471 width=8) (actual time=1978019.827..1978019.827 rows=0 loops=1)

        Output: charges.id, charges.charge_info

        Filter: (((charges.charge_info).charge_name)::text = ANY ('{"ADDRESS CORRECTION CHARGE","ADDRESS CORRECTION"}'::text[]))

        Buffers: shared hit=20387606, temp written=94071

Total runtime: 1978214.743 ms

 

 

Re: Poor performance using CTE

От
Merlin Moncure
Дата:
On Tue, Nov 13, 2012 at 2:57 PM, David Greco
<David_Greco@harte-hanks.com> wrote:
> Have a query using a CTE that is performing very poorly. The equivalent
> query against the same data in an Oracle database runs in under 1 second, in
> Postgres  it takes 2000 seconds.
>
>
>
> The smp_pkg.get_invoice_charges queries fedexinvoices for some data and
> normalizes it into a SETOF some record type. It is declared to be STABLE.
> Fedexinvoices consists of about 1.3M rows of medium width. Fedexinvoices.id
> is the primary key on that table, and trim(fedexinvoices.trackno) is indexed
> via the function trim.
>
>
>
> The plan for the equivalent query in Oracle is much smaller and simpler. No
> sequential (or full table) scans on fedexinvoices.
>
>
>
>
>
>
>
> WITH charges as (
>
>                 SELECT fi2.id, smp_pkg.get_invoice_charges(fi2.id)
> charge_info from fedexinvoices fi2
>
> )
>
> select fedexinvoices.* from
>
> fedexinvoices
>
> inner join charges on charges.id = fedexinvoices.id AND
> (charges.charge_info).charge_name IN ('ADDRESS CORRECTION CHARGE','ADDRESS
> CORRECTION')
>
> where
>
> trim(fedexinvoices.trackno)='799159791643'
>
> ;
>
>
>
> Explain Analyze output, I abbreviated some of the column lists for brevity:
>
>
>
> Nested Loop  (cost=457380.38..487940.77 rows=1 width=1024) (actual
> time=1978019.858..1978019.858 rows=0 loops=1)
>
>   Output: fedexinvoices.id, ………
>
>   Join Filter: (fedexinvoices.id = charges.id)
>
>   Buffers: shared hit=20387611, temp written=94071
>
>   CTE charges
>
>     ->  Seq Scan on hits.fedexinvoices fi2  (cost=0.00..457380.38
> rows=1350513 width=8) (actual time=0.613..1964632.763 rows=9007863 loops=1)
>
>           Output: fi2.id, smp_pkg.get_invoice_charges(fi2.id,
> NULL::character varying)
>
>           Buffers: shared hit=20387606
>
>   ->  Index Scan using fedexinvoices_trim_track_idx on hits.fedexinvoices
> (cost=0.00..5.46 rows=1 width=1024) (actual time=0.024..0.026 rows=1
> loops=1)
>
>         Output: fedexinvoices.id, ………
>
>         Index Cond: (btrim((fedexinvoices.trackno)::text) =
> '799159791643'::text)
>
>         Buffers: shared hit=5
>
>   ->  CTE Scan on charges  (cost=0.00..30386.54 rows=13471 width=8) (actual
> time=1978019.827..1978019.827 rows=0 loops=1)
>
>         Output: charges.id, charges.charge_info
>
>         Filter: (((charges.charge_info).charge_name)::text = ANY ('{"ADDRESS
> CORRECTION CHARGE","ADDRESS CORRECTION"}'::text[]))
>
>         Buffers: shared hit=20387606, temp written=94071
>
> Total runtime: 1978214.743 ms

The problem here is very clear.  Oracle is optimizing through the CTE.
 PostgreSQL does not do this by design -- CTE's are used as a forced
materialization step.

merlin


Re: Poor performance using CTE

От
Claudio Freire
Дата:
On Tue, Nov 20, 2012 at 12:04 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> The problem here is very clear.  Oracle is optimizing through the CTE.
>  PostgreSQL does not do this by design -- CTE's are used as a forced
> materialization step.

While I love that design (it lets me solve lots of problems for huge
queries), wouldn't pushing constraints into the CTE be a rather safe
optimization?


Re: Poor performance using CTE

От
Merlin Moncure
Дата:
On Tue, Nov 20, 2012 at 9:10 AM, Claudio Freire <klaussfreire@gmail.com> wrote:
> On Tue, Nov 20, 2012 at 12:04 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> The problem here is very clear.  Oracle is optimizing through the CTE.
>>  PostgreSQL does not do this by design -- CTE's are used as a forced
>> materialization step.
>
> While I love that design (it lets me solve lots of problems for huge
> queries), wouldn't pushing constraints into the CTE be a rather safe
> optimization?

sure, or rewrite query as classic join.

merlin


Re: Poor performance using CTE

От
Claudio Freire
Дата:
On Tue, Nov 20, 2012 at 12:23 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Tue, Nov 20, 2012 at 9:10 AM, Claudio Freire <klaussfreire@gmail.com> wrote:
>> On Tue, Nov 20, 2012 at 12:04 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
>>> The problem here is very clear.  Oracle is optimizing through the CTE.
>>>  PostgreSQL does not do this by design -- CTE's are used as a forced
>>> materialization step.
>>
>> While I love that design (it lets me solve lots of problems for huge
>> queries), wouldn't pushing constraints into the CTE be a rather safe
>> optimization?
>
> sure, or rewrite query as classic join.

I meant for postgres to do automatically. Rewriting as a join wouldn't
work as an optimization fence the way we're used to, but pushing
constraints upwards can only help (especially if highly selective).


Re: Poor performance using CTE

От
Craig Ringer
Дата:
On 11/21/2012 12:06 AM, Claudio Freire wrote:
> I meant for postgres to do automatically. Rewriting as a join wouldn't
> work as an optimization fence the way we're used to, but pushing
> constraints upwards can only help (especially if highly selective).
Because people are now used to using CTEs as query hints, it'd probably
cause performance regressions in working queries. Perhaps more
importantly, Pg would have to prove that doing so didn't change queries
that invoked functions with side-effects to avoid changing the results
of currently valid queries.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Poor performance using CTE

От
Claudio Freire
Дата:
On Tue, Nov 20, 2012 at 8:38 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
> On 11/21/2012 12:06 AM, Claudio Freire wrote:
>> I meant for postgres to do automatically. Rewriting as a join wouldn't
>> work as an optimization fence the way we're used to, but pushing
>> constraints upwards can only help (especially if highly selective).
> Because people are now used to using CTEs as query hints, it'd probably
> cause performance regressions in working queries. Perhaps more
> importantly, Pg would have to prove that doing so didn't change queries
> that invoked functions with side-effects to avoid changing the results
> of currently valid queries.

Fair point. Will look into it a bit.


Re: Poor performance using CTE

От
Tom Lane
Дата:
Craig Ringer <craig@2ndQuadrant.com> writes:
> On 11/21/2012 12:06 AM, Claudio Freire wrote:
>> I meant for postgres to do automatically. Rewriting as a join wouldn't
>> work as an optimization fence the way we're used to, but pushing
>> constraints upwards can only help (especially if highly selective).

> Because people are now used to using CTEs as query hints, it'd probably
> cause performance regressions in working queries. Perhaps more
> importantly, Pg would have to prove that doing so didn't change queries
> that invoked functions with side-effects to avoid changing the results
> of currently valid queries.

We could trivially arrange to keep the current semantics if the CTE
query contains any volatile functions (or of course if it's
INSERT/UPDATE/DELETE).  I think we'd also need to not optimize if
it's invoked from more than one place in the outer query.

I think the more interesting question is what cases wouldn't be covered
by such a rule.  Typically you need to use OFFSET 0 in situations where
the planner has guessed wrong about costs or rowcounts, and I think
people are likely using WITH for that as well.  Should we be telling
people that they ought to insert OFFSET 0 in WITH queries if they want
to be sure there's an optimization fence?

            regards, tom lane


Re: Poor performance using CTE

От
Heikki Linnakangas
Дата:
On 21.11.2012 01:53, Tom Lane wrote:
> I think the more interesting question is what cases wouldn't be covered
> by such a rule.  Typically you need to use OFFSET 0 in situations where
> the planner has guessed wrong about costs or rowcounts, and I think
> people are likely using WITH for that as well.  Should we be telling
> people that they ought to insert OFFSET 0 in WITH queries if they want
> to be sure there's an optimization fence?

Yes, I strongly feel that we should. Writing a query using WITH often
makes it more readable. It would be a shame if people have to refrain
from using it, because the planner treats it as an optimization fence.

- Heikki


Re: Poor performance using CTE

От
Peter Geoghegan
Дата:
On 21 November 2012 13:04, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
> Yes, I strongly feel that we should. Writing a query using WITH often makes
> it more readable. It would be a shame if people have to refrain from using
> it, because the planner treats it as an optimization fence.

+1

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services


Re: Poor performance using CTE

От
Andrew Dunstan
Дата:
On 11/21/2012 08:04 AM, Heikki Linnakangas wrote:
> On 21.11.2012 01:53, Tom Lane wrote:
>> I think the more interesting question is what cases wouldn't be covered
>> by such a rule.  Typically you need to use OFFSET 0 in situations where
>> the planner has guessed wrong about costs or rowcounts, and I think
>> people are likely using WITH for that as well.  Should we be telling
>> people that they ought to insert OFFSET 0 in WITH queries if they want
>> to be sure there's an optimization fence?
>
> Yes, I strongly feel that we should. Writing a query using WITH often
> makes it more readable. It would be a shame if people have to refrain
> from using it, because the planner treats it as an optimization fence.
>
>



If we're going to do it can we please come up with something more
intuitive and much, much more documented than "OFFSET 0"? And if/when we
do this we'll need to have big red warnings all over then release notes,
since a lot of people I know will need to do some extensive remediation
before moving to such a release.

cheers

andrew



Re: Poor performance using CTE

От
Tom Lane
Дата:
Andrew Dunstan <andrew@dunslane.net> writes:
> If we're going to do it can we please come up with something more
> intuitive and much, much more documented than "OFFSET 0"? And if/when we
> do this we'll need to have big red warnings all over then release notes,
> since a lot of people I know will need to do some extensive remediation
> before moving to such a release.

The probability that we would actually *remove* that behavior of OFFSET
0 is not distinguishable from zero.  I'm not terribly excited about
having an alternate syntax to specify an optimization fence, but even
if we do create such a thing, there's no need to break the historical
usage.

            regards, tom lane


Re: Poor performance using CTE

От
Andrew Dunstan
Дата:
On 11/21/2012 09:59 AM, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> If we're going to do it can we please come up with something more
>> intuitive and much, much more documented than "OFFSET 0"? And if/when we
>> do this we'll need to have big red warnings all over then release notes,
>> since a lot of people I know will need to do some extensive remediation
>> before moving to such a release.
> The probability that we would actually *remove* that behavior of OFFSET
> 0 is not distinguishable from zero.  I'm not terribly excited about
> having an alternate syntax to specify an optimization fence, but even
> if we do create such a thing, there's no need to break the historical
> usage.
>

I wasn't talking about removing it. My point was that if the
optimization fence around CTEs is removed a lot of people will need to
rework apps where they have used them for that purpose. And I continue
to think that spelling it "OFFSET 0" is horribly obscure.

cheers

andrew


Re: Poor performance using CTE

От
Andres Freund
Дата:
On 2012-11-21 10:21:16 -0500, Andrew Dunstan wrote:
>
> On 11/21/2012 09:59 AM, Tom Lane wrote:
> >Andrew Dunstan <andrew@dunslane.net> writes:
> >>If we're going to do it can we please come up with something more
> >>intuitive and much, much more documented than "OFFSET 0"? And if/when we
> >>do this we'll need to have big red warnings all over then release notes,
> >>since a lot of people I know will need to do some extensive remediation
> >>before moving to such a release.
> >The probability that we would actually *remove* that behavior of OFFSET
> >0 is not distinguishable from zero.  I'm not terribly excited about
> >having an alternate syntax to specify an optimization fence, but even
> >if we do create such a thing, there's no need to break the historical
> >usage.
> >
>
> I wasn't talking about removing it. My point was that if the optimization
> fence around CTEs is removed a lot of people will need to rework apps where
> they have used them for that purpose. And I continue to think that spelling
> it "OFFSET 0" is horribly obscure.

+1

WITH foo AS (SELECT ...) (barrier=on|off)?

9.3 introduces the syntax, defaulting to on
9.4 switches the default to off.

Greetings,

Andres Freund


Re: Poor performance using CTE

От
Peter Geoghegan
Дата:
On 21 November 2012 15:21, Andrew Dunstan <andrew@dunslane.net> wrote:
> And I continue to think that spelling it "OFFSET 0" is horribly obscure.

I'm not sure that it's any more obscure than the very idea of an
optimisation fence.

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services


Re: Poor performance using CTE

От
Heikki Linnakangas
Дата:
On 21.11.2012 17:42, Gavin Flower wrote:
> On 22/11/12 04:32, Andres Freund wrote:
>> On 2012-11-21 10:21:16 -0500, Andrew Dunstan wrote:
>>> I wasn't talking about removing it. My point was that if the
>>> optimization
>>> fence around CTEs is removed a lot of people will need to rework apps
>>> where
>>> they have used them for that purpose. And I continue to think that
>>> spelling
>>> it "OFFSET 0" is horribly obscure.
>> +1

FWIW, I'm happy with "OFFSET 0". Granted, it's pretty obscure, but
that's what we've historically recommended, and it's pretty ugly to have
to specify a fence like that in the first place. Whenever you have to
resort to it, you ought have a comment in the query explaining why you
need to force the planner like that, anyway.

>> WITH foo AS (SELECT ...) (barrier=on|off)?
>>
>> 9.3 introduces the syntax, defaulting to on
>> 9.4 switches the default to off.
>
> WITH foo AS (SELECT ...) (fence=on|off)?
>
> WITH foo AS (SELECT ...) (optimisation_fence=on|off)?

If we are to invent a new syntax for this, can we please come up with
something that's more widely applicable than just the WITH syntax.
Something that you could use to replace OFFSET 0 in a subquery, too.

- Heikki


Re: Poor performance using CTE

От
Claudio Freire
Дата:
On Wed, Nov 21, 2012 at 12:32 PM, Andres Freund <andres@anarazel.de> wrote:
> +1
>
> WITH foo AS (SELECT ...) (barrier=on|off)?
>
> 9.3 introduces the syntax, defaulting to on
> 9.4 switches the default to off.

Why syntax? What about a guc?

collapse_cte_limit?


Re: Poor performance using CTE

От
Andres Freund
Дата:
On 2012-11-21 13:16:25 -0300, Claudio Freire wrote:
> On Wed, Nov 21, 2012 at 12:32 PM, Andres Freund <andres@anarazel.de> wrote:
> > +1
> >
> > WITH foo AS (SELECT ...) (barrier=on|off)?
> >
> > 9.3 introduces the syntax, defaulting to on
> > 9.4 switches the default to off.
>
> Why syntax? What about a guc?
>
> collapse_cte_limit?

Because there are very good reasons to want to current behaviour. A guc
is a global either/or so I don't see it helping much.

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Poor performance using CTE

От
Claudio Freire
Дата:
On Wed, Nov 21, 2012 at 1:24 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2012-11-21 13:16:25 -0300, Claudio Freire wrote:
>> On Wed, Nov 21, 2012 at 12:32 PM, Andres Freund <andres@anarazel.de> wrote:
>> > +1
>> >
>> > WITH foo AS (SELECT ...) (barrier=on|off)?
>> >
>> > 9.3 introduces the syntax, defaulting to on
>> > 9.4 switches the default to off.
>>
>> Why syntax? What about a guc?
>>
>> collapse_cte_limit?
>
> Because there are very good reasons to want to current behaviour. A guc
> is a global either/or so I don't see it helping much.

set collapse_cte_limit=8;
with blah as (blah) select blah;

Not global at all.


Re: Poor performance using CTE

От
Andres Freund
Дата:
On 2012-11-21 13:32:45 -0300, Claudio Freire wrote:
> On Wed, Nov 21, 2012 at 1:24 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> > On 2012-11-21 13:16:25 -0300, Claudio Freire wrote:
> >> On Wed, Nov 21, 2012 at 12:32 PM, Andres Freund <andres@anarazel.de> wrote:
> >> > +1
> >> >
> >> > WITH foo AS (SELECT ...) (barrier=on|off)?
> >> >
> >> > 9.3 introduces the syntax, defaulting to on
> >> > 9.4 switches the default to off.
> >>
> >> Why syntax? What about a guc?
> >>
> >> collapse_cte_limit?
> >
> > Because there are very good reasons to want to current behaviour. A guc
> > is a global either/or so I don't see it helping much.
>
> set collapse_cte_limit=8;
> with blah as (blah) select blah;
>
> Not global at all.

Not very manageable though. And it doesn't help if you need both in a
query which isn't actually that unlikely.

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Poor performance using CTE

От
Andrew Dunstan
Дата:
On 11/21/2012 11:32 AM, Claudio Freire wrote:
> On Wed, Nov 21, 2012 at 1:24 PM, Andres Freund <andres@2ndquadrant.com> wrote:
>> On 2012-11-21 13:16:25 -0300, Claudio Freire wrote:
>>> On Wed, Nov 21, 2012 at 12:32 PM, Andres Freund <andres@anarazel.de> wrote:
>>>> +1
>>>>
>>>> WITH foo AS (SELECT ...) (barrier=on|off)?
>>>>
>>>> 9.3 introduces the syntax, defaulting to on
>>>> 9.4 switches the default to off.
>>> Why syntax? What about a guc?
>>>
>>> collapse_cte_limit?
>> Because there are very good reasons to want to current behaviour. A guc
>> is a global either/or so I don't see it helping much.
> set collapse_cte_limit=8;
> with blah as (blah) select blah;
>
> Not global at all.
>

Then you have to unset it again, which is ugly. You might even want it
applying to *part* of a query, not the whole thing, so this strikes me
as a dead end.

cheers

andrew


Re: Poor performance using CTE

От
Claudio Freire
Дата:
On Wed, Nov 21, 2012 at 1:35 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
>>>> Why syntax? What about a guc?
>>>>
>>>> collapse_cte_limit?
>>>
>>> Because there are very good reasons to want to current behaviour. A guc
>>> is a global either/or so I don't see it helping much.
>>
>> set collapse_cte_limit=8;
>> with blah as (blah) select blah;
>>
>> Not global at all.
>>
>
> Then you have to unset it again, which is ugly. You might even want it
> applying to *part* of a query, not the whole thing, so this strikes me as a
> dead end.

Really?

Because I've seen here people that want it generally (because
Oracle/MSSQL/your favourite db does it), and people that don't want it
(generally because they need it). I haven't seen any mention to mixing
fenced and unfenced usage.


Re: Poor performance using CTE

От
Tom Lane
Дата:
Claudio Freire <klaussfreire@gmail.com> writes:
> collapse_cte_limit?

The join collapse limits address a completely different problem (ie,
explosion of planning time with too many relations), and are pretty much
useless as a model for this.  As multiple people told you already,
optimization fences are typically wanted for only specific subqueries.

            regards, tom lane


Re: Poor performance using CTE

От
Andrew Dunstan
Дата:
On 11/21/2012 02:30 PM, Gavin Flower wrote:
> WITH FENCE foo AS (SELECT ...)
> default?
>
>
> WITHOUT FENCE foo AS (SELECT ...)     :-)
> Nah!
> I prefer this, but it is too specific to 'WITH',
> and very unSQL standardish!
>
> Alternatively one of the following
>
>  1. WITH UNFENCED foo AS (SELECT ...)
>  2. WITH NO FENCE foo AS (SELECT ...)
>  3. WITH NOT FENCE foo AS (SELECT ...)
>
> I loke the firsat variant, but the 3rd is
> most SQL standardish!
>

As Tom (I think) pointed out, we should not have a syntax tied to CTEs.

cheers

andrew



Re: Poor performance using CTE

От
Craig Ringer
Дата:
On 11/22/2012 03:30 AM, Gavin Flower wrote:
On 22/11/12 04:56, Heikki Linnakangas wrote:
On 21.11.2012 17:42, Gavin Flower wrote:
On 22/11/12 04:32, Andres Freund wrote:
On 2012-11-21 10:21:16 -0500, Andrew Dunstan wrote:
I wasn't talking about removing it. My point was that if the
optimization
fence around CTEs is removed a lot of people will need to rework apps
where
they have used them for that purpose. And I continue to think that
spelling
it "OFFSET 0" is horribly obscure.
+1

FWIW, I'm happy with "OFFSET 0". Granted, it's pretty obscure, but that's what we've historically recommended, and it's pretty ugly to have to specify a fence like that in the first place. Whenever you have to resort to it, you ought have a comment in the query explaining why you need to force the planner like that, anyway.

WITH foo AS (SELECT ...) (barrier=on|off)?

9.3 introduces the syntax, defaulting to on
9.4 switches the default to off.

WITH foo AS (SELECT ...) (fence=on|off)?

WITH foo AS (SELECT ...) (optimisation_fence=on|off)?

If we are to invent a new syntax for this, can we please come up with something that's more widely applicable than just the WITH syntax. Something that you could use to replace OFFSET 0 in a subquery, too.

- Heikki
WITH FENCE foo AS (SELECT ...)
default?
That doesn't bind tightly enough to a specific CTE term. Consider:

WITH
  FENCE foo AS (SELECT ...),
  bar AS (SELECT ...)
SELECT * FROM bar;

Are we fencing just foo? Or all expressions?


-- Craig Ringer                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services

Re: Poor performance using CTE

От
Craig Ringer
Дата:
On 11/22/2012 08:38 AM, Gavin Flower wrote:
> I suspect most people are blissfully unaware of CTE's being fenced, or
> at least not really sure what it means. So I suspect NOT FENCE would
> be the better default.
It's also probably more standard, and a better fit with what other DBs do.

Pg would still need to detect conditions like the use of functions with
side effects or (obviously) INSERT/UPDATE/DELETE wCTEs and not push
conditions down into them / pull conditions up from them, etc. That's
how I read the standard, though; must have the same effect as if the
queries were executed as written, so Pg is free to transform them so
long as it doesn't change the results.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Poor performance using CTE

От
Jeremy Harris
Дата:
On 22/11/2012 00:08, Craig Ringer wrote:
> WITH
>    FENCE foo AS (SELECT ...),
>    bar AS (SELECT ...)
> SELECT * FROM bar;
>
> Are we fencing just foo? Or all expressions?
>

WITH foo AS (FENCED SELECT ...),
      bar AS (SELECT ...),
SELECT ... ;

--
Jeremy




Re: Poor performance using CTE

От
Jon Nelson
Дата:
On Thu, Nov 22, 2012 at 7:42 AM, Jeremy Harris <jgh@wizmail.org> wrote:
> On 22/11/2012 00:08, Craig Ringer wrote:
>>
>> WITH
>>    FENCE foo AS (SELECT ...),
>>    bar AS (SELECT ...)
>> SELECT * FROM bar;
>>
>> Are we fencing just foo? Or all expressions?
>>
>
> WITH foo AS (FENCED SELECT ...),
>      bar AS (SELECT ...),
> SELECT ... ;

I would much rather see 'MATERIALIZE' instead of 'FENCED', unless the
by the latter you mean to forbid *all* optimizations, whereas with the
latter the meaning is pretty clear.

--
Jon


Re: Poor performance using CTE

От
Gavin Flower
Дата:
On 22/11/12 04:32, Andres Freund wrote:
> On 2012-11-21 10:21:16 -0500, Andrew Dunstan wrote:
>> On 11/21/2012 09:59 AM, Tom Lane wrote:
>>> Andrew Dunstan <andrew@dunslane.net> writes:
>>>> If we're going to do it can we please come up with something more
>>>> intuitive and much, much more documented than "OFFSET 0"? And if/when we
>>>> do this we'll need to have big red warnings all over then release notes,
>>>> since a lot of people I know will need to do some extensive remediation
>>>> before moving to such a release.
>>> The probability that we would actually *remove* that behavior of OFFSET
>>> 0 is not distinguishable from zero.  I'm not terribly excited about
>>> having an alternate syntax to specify an optimization fence, but even
>>> if we do create such a thing, there's no need to break the historical
>>> usage.
>>>
>> I wasn't talking about removing it. My point was that if the optimization
>> fence around CTEs is removed a lot of people will need to rework apps where
>> they have used them for that purpose. And I continue to think that spelling
>> it "OFFSET 0" is horribly obscure.
> +1
>
> WITH foo AS (SELECT ...) (barrier=on|off)?
>
> 9.3 introduces the syntax, defaulting to on
> 9.4 switches the default to off.
>
> Greetings,
>
> Andres Freund
>
>

WITH foo AS (SELECT ...) (fence=on|off)?


WITH foo AS (SELECT ...) (optimisation_fence=on|off)?




Re: Poor performance using CTE

От
Gavin Flower
Дата:
On 22/11/12 04:56, Heikki Linnakangas wrote:
On 21.11.2012 17:42, Gavin Flower wrote:
On 22/11/12 04:32, Andres Freund wrote:
On 2012-11-21 10:21:16 -0500, Andrew Dunstan wrote:
I wasn't talking about removing it. My point was that if the
optimization
fence around CTEs is removed a lot of people will need to rework apps
where
they have used them for that purpose. And I continue to think that
spelling
it "OFFSET 0" is horribly obscure.
+1

FWIW, I'm happy with "OFFSET 0". Granted, it's pretty obscure, but that's what we've historically recommended, and it's pretty ugly to have to specify a fence like that in the first place. Whenever you have to resort to it, you ought have a comment in the query explaining why you need to force the planner like that, anyway.

WITH foo AS (SELECT ...) (barrier=on|off)?

9.3 introduces the syntax, defaulting to on
9.4 switches the default to off.

WITH foo AS (SELECT ...) (fence=on|off)?

WITH foo AS (SELECT ...) (optimisation_fence=on|off)?

If we are to invent a new syntax for this, can we please come up with something that's more widely applicable than just the WITH syntax. Something that you could use to replace OFFSET 0 in a subquery, too.

- Heikki
WITH FENCE foo AS (SELECT ...)
default?


WITHOUT FENCE foo AS (SELECT ...)     :-)
Nah!
I prefer this, but it is too specific to 'WITH',
and very unSQL standardish!

Alternatively one of the following
  1. WITH UNFENCED foo AS (SELECT ...)
  2. WITH NO FENCE foo AS (SELECT ...)
  3. WITH NOT FENCE foo AS (SELECT ...)
I loke the firsat variant, but the 3rd is
most SQL standardish!

Cheers,
Gavin

Re: Poor performance using CTE

От
Gavin Flower
Дата:
On 22/11/12 08:42, Andrew Dunstan wrote:
>
> On 11/21/2012 02:30 PM, Gavin Flower wrote:
>> WITH FENCE foo AS (SELECT ...)
>> default?
>>
>>
>> WITHOUT FENCE foo AS (SELECT ...)     :-)
>> Nah!
>> I prefer this, but it is too specific to 'WITH',
>> and very unSQL standardish!
>>
>> Alternatively one of the following
>>
>>  1. WITH UNFENCED foo AS (SELECT ...)
>>  2. WITH NO FENCE foo AS (SELECT ...)
>>  3. WITH NOT FENCE foo AS (SELECT ...)
>>
>> I loke the firsat variant, but the 3rd is
>> most SQL standardish!
>>
>
> As Tom (I think) pointed out, we should not have a syntax tied to CTEs.
>
> cheers
>
> andrew
>
If other SQL constructs have a optimisation fence, then the FENCE & NOT
FENCE syntax can be used theire as well.

So what am I missing? (obviously WITHOUT FENCE would not make sense
elsewhere, but I wasn't really being serious when I suggested it!)



Re: Poor performance using CTE

От
Gavin Flower
Дата:
On 22/11/12 13:08, Craig Ringer wrote:
On 11/22/2012 03:30 AM, Gavin Flower wrote:
On 22/11/12 04:56, Heikki Linnakangas wrote:
On 21.11.2012 17:42, Gavin Flower wrote:
On 22/11/12 04:32, Andres Freund wrote:
On 2012-11-21 10:21:16 -0500, Andrew Dunstan wrote:
I wasn't talking about removing it. My point was that if the
optimization
fence around CTEs is removed a lot of people will need to rework apps
where
they have used them for that purpose. And I continue to think that
spelling
it "OFFSET 0" is horribly obscure.
+1

FWIW, I'm happy with "OFFSET 0". Granted, it's pretty obscure, but that's what we've historically recommended, and it's pretty ugly to have to specify a fence like that in the first place. Whenever you have to resort to it, you ought have a comment in the query explaining why you need to force the planner like that, anyway.

WITH foo AS (SELECT ...) (barrier=on|off)?

9.3 introduces the syntax, defaulting to on
9.4 switches the default to off.

WITH foo AS (SELECT ...) (fence=on|off)?

WITH foo AS (SELECT ...) (optimisation_fence=on|off)?

If we are to invent a new syntax for this, can we please come up with something that's more widely applicable than just the WITH syntax. Something that you could use to replace OFFSET 0 in a subquery, too.

- Heikki
WITH FENCE foo AS (SELECT ...)
default?
That doesn't bind tightly enough to a specific CTE term. Consider:

WITH
  FENCE foo AS (SELECT ...),
  bar AS (SELECT ...)
SELECT * FROM bar;

Are we fencing just foo? Or all expressions?


-- Craig Ringer                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Are we fencing or fooing??? :-)

Hmm...
How about:

(a) If we have lots of WITH SELECTS which mostly have one specific type of fencing, then we could specify the common fence value after the WITH and the exceptions after the AS:

WITH FENCE
  foo AS (SELECT ...),
  bar AS NOT FENCE (SELECT ...).
  baz AS (SELECT ...)
SELECT * FROM bar;

alternatively:
WITH NOT FENCE
  foo AS FENCE (SELECT ...),
  bar AS (SELECT ...).
  baz AS FENCE (SELECT ...)
SELECT * FROM bar;

(b) If we retain that FENCE is the default, then it would be simpler just to just allow a FENCE clause after the AS keyword.

WITH
  foo AS (SELECT ...),
  bar AS NOT FENCE (SELECT ...).
  baz AS (SELECT ...)
SELECT * FROM bar;

Obviously even for (a), we have to have one value of the FENCE clause as the default.  Either make the default FENCE, as now - or NOT FENCE if that is seen to be a better default, especially if that is easier for people coming from Oracle.

I suspect most people are blissfully unaware of CTE's being fenced, or at least not really sure what it means. So I suspect NOT FENCE would be the better default.

Alternative spellings might be better such as:
FENCED / NOT FENCED
or
FENCED / UNFENCED


Cheers,
Gavin