Обсуждение: CTE Changes in PostgreSQL 12, can we have a GUC to get old behavior

Поиск
Список
Период
Сортировка

CTE Changes in PostgreSQL 12, can we have a GUC to get old behavior

От
"Regina Obe"
Дата:
The CTE change in PostgreSQL 12 broke several of PostGIS regression tests
because many of our tests are negative tests that test to confirm we get
warnings in certain cases.  In the past, these would output 1 notice because
the CTE was materialized, now they output 1 for each column.

An example is as follows:

WITH data AS ( SELECT '#2911' l, ST_Metadata(ST_Rescale(  ST_AddBand(
ST_MakeEmptyRaster(10, 10, 0, 0, 1, -1, 0, 0, 0),   1, '8BUI', 0, 0  ),
2.0,  -2.0  )) m ) SELECT l, (m).* FROM data;

In prior versions this raster test would return one notice:

NOTICE:  Raster has default geotransform. Adjusting metadata for use of GDAL
Warp API

Now it returns 10 notices because the call is being done 10 times (1 for
each column)

NOTICE:  Raster has default geotransform. Adjusting metadata for use of GDAL
Warp API
NOTICE:  Raster has default geotransform. Adjusting metadata for use of GDAL
Warp API
NOTICE:  Raster has default geotransform. Adjusting metadata for use of GDAL
Warp API
NOTICE:  Raster has default geotransform. Adjusting metadata for use of GDAL
Warp API
NOTICE:  Raster has default geotransform. Adjusting metadata for use of GDAL
Warp API
NOTICE:  Raster has default geotransform. Adjusting metadata for use of GDAL
Warp API
NOTICE:  Raster has default geotransform. Adjusting metadata for use of GDAL
Warp API
NOTICE:  Raster has default geotransform. Adjusting metadata for use of GDAL
Warp API
NOTICE:  Raster has default geotransform. Adjusting metadata for use of GDAL
Warp API
NOTICE:  Raster has default geotransform. Adjusting metadata for use of GDAL
Warp API

The regression errors are easy enough to fix with OFFSET or subquery.  What
I'm more concerned about is that I expect we'll have performance
degradation.

Historically PostGIS functions haven't been costed right and can't be
because they rely on INLINING of sql functions which gets broken when too
high of cost is put on functions.  We have a ton of functions like these
that return composite objects and this above function is particularly
expensive so to have it call that 10 times is almost guaranteed to be a
performance killer.

I know there is a new MATERIALIZED keyword to get the old behavior, but
people are not going to be able to change their apps to introduce new
keywords, especially ones meant to be deployed by many versions of
PostgreSQL.

That said IS THERE or can there be a GUC  like  

set cte_materialized = on;

to get the old behavior?

Thanks,
Regina
PostGIS PSC member




Re: CTE Changes in PostgreSQL 12, can we have a GUC to get oldbehavior

От
Andres Freund
Дата:
Hi,

On 2019-02-22 15:33:08 -0500, Regina Obe wrote:
> The CTE change in PostgreSQL 12 broke several of PostGIS regression tests
> because many of our tests are negative tests that test to confirm we get
> warnings in certain cases.  In the past, these would output 1 notice because
> the CTE was materialized, now they output 1 for each column.
> 
> An example is as follows:
> 
> WITH data AS ( SELECT '#2911' l, ST_Metadata(ST_Rescale(  ST_AddBand(
> ST_MakeEmptyRaster(10, 10, 0, 0, 1, -1, 0, 0, 0),   1, '8BUI', 0, 0  ),
> 2.0,  -2.0  )) m ) SELECT l, (m).* FROM data;

> The regression errors are easy enough to fix with OFFSET or subquery.  What
> I'm more concerned about is that I expect we'll have performance
> degradation.
> 
> Historically PostGIS functions haven't been costed right and can't be
> because they rely on INLINING of sql functions which gets broken when too
> high of cost is put on functions.  We have a ton of functions like these
> that return composite objects and this above function is particularly
> expensive so to have it call that 10 times is almost guaranteed to be a
> performance killer.

I think there's a fair argument that we shouldn't inline in a way that
increases the number of function calls due to (foo).*. In fact, I'm
mildly surprised that we do that?


> That said IS THERE or can there be a GUC  like  
> 
> set cte_materialized = on;
> 
> to get the old behavior?

-incredibly many.  That'll just make it harder to understand what SQL
means.

Greetings,

Andres Freund


Re: CTE Changes in PostgreSQL 12, can we have a GUC to get old behavior

От
Robert Haas
Дата:
On Fri, Feb 22, 2019 at 3:33 PM Regina Obe <lr@pcorp.us> wrote:
> Historically PostGIS functions haven't been costed right and can't be
> because they rely on INLINING of sql functions which gets broken when too
> high of cost is put on functions.  We have a ton of functions like these
> that return composite objects and this above function is particularly
> expensive so to have it call that 10 times is almost guaranteed to be a
> performance killer.

This is good evidence that using the cost to decide on inlining is a
terrible idea and should be changed.

> I know there is a new MATERIALIZED keyword to get the old behavior, but
> people are not going to be able to change their apps to introduce new
> keywords, especially ones meant to be deployed by many versions of
> PostgreSQL.
>
> That said IS THERE or can there be a GUC  like
>
> set cte_materialized = on;
>
> to get the old behavior?

Behavior changing GUCs *really* suck.  If we add such a GUC, it will
affect not only PostGIS but everything run on the server -- and we
made this change because we believe it's going to improve things
overall.  I'm really reluctant to believe that it's right to encourage
people to go back in the opposite direction, especially because it
means there will be no consistency from one PostgreSQL system to the
next.

I think there are probably other ways of fixing this query that won't
have such dramatic effects; it doesn't really seem to need to use
WITH, and I bet you could also tweak the WITH query to prevent
inlining.  I also think Andres's question about why this gets inlined
in the first place is a good one; the (m).* seems like it ought to be
counted as a multiple reference.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: CTE Changes in PostgreSQL 12, can we have a GUC to get old behavior

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> On 2019-02-22 15:33:08 -0500, Regina Obe wrote:
>> The CTE change in PostgreSQL 12 broke several of PostGIS regression tests
>> because many of our tests are negative tests that test to confirm we get
>> warnings in certain cases.  In the past, these would output 1 notice because
>> the CTE was materialized, now they output 1 for each column.
>> An example is as follows:
>> WITH data AS ( SELECT '#2911' l, ST_Metadata(ST_Rescale(  ST_AddBand(
>> ST_MakeEmptyRaster(10, 10, 0, 0, 1, -1, 0, 0, 0),   1, '8BUI', 0, 0  ),
>> 2.0,  -2.0  )) m ) SELECT l, (m).* FROM data;

> I think there's a fair argument that we shouldn't inline in a way that
> increases the number of function calls due to (foo).*. In fact, I'm
> mildly surprised that we do that?

Well, is (foo).* all that much different from multiple written-out calls
to the function?  But yeah, I'd rather try to address this complaint by
tweaking the inlining rules.  It would certainly help if the function
had a realistic (high) cost estimate, because if it pretends to be cheap,
we really *should* be willing to inline it, one would think.

>> That said IS THERE or can there be a GUC  like
>> set cte_materialized = on;
>> to get the old behavior?

> -incredibly many.  That'll just make it harder to understand what SQL
> means.

Agreed.  Now that we bit that bullet, I don't want to make things even
messier with a GUC.

The thing to do if you want to force backwards-compatible behavior without
using new-in-v12 syntax is to insert good ol' OFFSET 0 in the CTE query.

            regards, tom lane


RE: CTE Changes in PostgreSQL 12, can we have a GUC to get old behavior

От
"Regina Obe"
Дата:
> I think there are probably other ways of fixing this query that won't have
> such dramatic effects; it doesn't really seem to need to use WITH, and I bet
> you could also tweak the WITH query to prevent inlining.

Yes I know I can change THIS QUERY.  I've changed other ones to work around this.
Normally I just use a LATERAL for this.

My point is lots of people use CTEs intentionally for this kind of thing because they know they are materialized.

It's going to make a lot of people hesitant to upgrade if they think they need to revisit every CTE (that they
intentionallywrote cause they thought it would be materialized) to throw in a MATERIALIZED keyword. 

> I also think
> Andres's question about why this gets inlined in the first place is a good one;
> the (m).* seems like it ought to be counted as a multiple reference.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL
> Company
Well if we can at least prevent the multiple reference thing from inlining that might be good enough to solve most
performanceregression issues that arise. 

Thanks,
Regina






Re: CTE Changes in PostgreSQL 12, can we have a GUC to get old behavior

От
Robert Haas
Дата:
On Fri, Feb 22, 2019 at 4:27 PM Regina Obe <lr@pcorp.us> wrote:
> It's going to make a lot of people hesitant to upgrade if they think they need to revisit every CTE (that they
intentionallywrote cause they thought it would be materialized) to throw in a MATERIALIZED keyword.
 

You might be right, but I hope you're wrong, because if you're right,
then it means that the recent change was ill-considered and ought to
be reverted.  And I think there are a lot of people who are REALLY
looking forward to that behavior change.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: CTE Changes in PostgreSQL 12, can we have a GUC to get oldbehavior

От
Andres Freund
Дата:
Hi,

On 2019-02-22 16:27:28 -0500, Regina Obe wrote:
> > I think there are probably other ways of fixing this query that won't have
> > such dramatic effects; it doesn't really seem to need to use WITH, and I bet
> > you could also tweak the WITH query to prevent inlining.  
> 
> Yes I know I can change THIS QUERY.  I've changed other ones to work around this.
> Normally I just use a LATERAL for this.
> 
> My point is lots of people use CTEs intentionally for this kind of thing because they know they are materialized.
> 
> It's going to make a lot of people hesitant to upgrade if they think they need to revisit every CTE (that they
intentionallywrote cause they thought it would be materialized) to throw in a MATERIALIZED keyword.
 

This was extensively discussed, in several threads about inlining
CTEs. But realistically, it doesn't actually solve the problem to offer
a GUC, because we'd not be able to remove it anytime soon.  I see
benefit in discussing how we can address regressions like your example
query here, but I don't feel there's any benefit in re-opening the whole
discussion about GUCs, defaults, and whatnot.

Greetings,

Andres Freund