Обсуждение: [HACKERS] Inlining functions with "expensive" parameters

Поиск
Список
Период
Сортировка

[HACKERS] Inlining functions with "expensive" parameters

От
Paul Ramsey
Дата:
All,

As we try and make PostGIS more "parallel sensitive" we have been added costs to our functions, so that their relative CPU cost is more accurately reflected in parallel plans. 

This has resulted in an odd side effect: some of our "wrapper" functions stop giving index scans in plans [1]. This is a problem!

An example of a "wrapper" is ST_Intersects(geom1, geom2). It combines an index operation (geom1 && geom2) with an exact spatial test (_ST_Intersects(geom1, geom2). This is primarily for user convenience, and has worked for us well for a decade and more. Having this construct stop working is definitely a problem.

As we add costs to our functions, the odds increase that one of the parameters to a wrapper might be a costed function. It's not uncommon to see:

ST_Interects(geom, ST_SetSRID('POLYGON(...)', 4326))

It's fair to say that we really do depend on our wrappers getting inlined basically all the time. They are simple functions, they do nothing other than 'SELECT func1() AND func2() AND arg1 && arg2'.

However, once costs are added to the parameters, the inlining can be turned off relatively quickly. Here's a PgSQL native example:

    -- Create data table and index. Analyze.
    DROP TABLE IF EXISTS boxen;
    CREATE TABLE boxen AS 
        SELECT row_number() OVER() As gid, 
        box(point(x, y),point(x+1, y+1)) AS b, x, y
        FROM generate_series(-100,100) As y, generate_series(-100,100) As x;
    CREATE INDEX idx_b_geom_gist ON boxen USING gist(b);
    ANALYZE boxen;

    -- An inlined function
    -- When set 'STRICT' it breaks index access
    -- However 'IMMUTABLE' doesn't seem to bother it
    CREATE OR REPLACE FUNCTION good_box(box, box)
        RETURNS boolean
        AS 'SELECT $1 OPERATOR(&&) $2 AND length(lseg(point($1),point($2))) < 3'
        LANGUAGE 'sql';

    -- Start with a low cost circle()
    ALTER FUNCTION circle(point, double precision) COST 1;

    -- [A] Query plan hits index
    EXPLAIN SELECT gid 
    FROM boxen
    WHERE good_box(
        boxen.b, 
        box(circle(point(20.5, 20.5), 2))
        );

    -- [B] Query plan hits index
    EXPLAIN SELECT gid 
    FROM boxen, 
        (SELECT x, y FROM boxen WHERE x < 0 and y < 0) AS c 
    WHERE good_box(
        boxen.b, 
        box(circle(point(c.x, c.y), 2))
        );

    -- Increase cost of circle
    ALTER FUNCTION circle(point, double precision) COST 100;

    -- [B] Query plan does not hit index!
    EXPLAIN SELECT gid 
    FROM boxen, 
        (SELECT x, y FROM boxen WHERE x < 0 and y < 0) AS c 
    WHERE good_box(
        boxen.b, 
        box(circle(point(c.x, c.y), 2))
        );
    
The inlining is getting tossed out on a test of how expensive the function parameters are [2]. As a result, we lose what is really the correct plan, and get a sequence scan instead of an index scan.

The test of parameter cost seems quite old (15+ years) and perhaps didn't anticipate highly variable individual function costs (or maybe it did). As it stands though, PostGIS is currently stuck choosing between having costs on our functions or having our inlined wrappers, because we cannot have both at the same time.

Any thoughts?

Thanks!

P.    
    
        

Re: Inlining functions with "expensive" parameters

От
Paul Ramsey
Дата:

On Thu, Nov 9, 2017 at 12:11 PM, Paul Ramsey <pramsey@cleverelephant.ca> wrote:
All,

As we try and make PostGIS more "parallel sensitive" we have been added costs to our functions, so that their relative CPU cost is more accurately reflected in parallel plans. 

This has resulted in an odd side effect: some of our "wrapper" functions stop giving index scans in plans [1]. This is a problem!

An example of a "wrapper" is ST_Intersects(geom1, geom2). It combines an index operation (geom1 && geom2) with an exact spatial test (_ST_Intersects(geom1, geom2). This is primarily for user convenience, and has worked for us well for a decade and more. Having this construct stop working is definitely a problem.

As we add costs to our functions, the odds increase that one of the parameters to a wrapper might be a costed function. It's not uncommon to see:

ST_Interects(geom, ST_SetSRID('POLYGON(...)', 4326))

It's fair to say that we really do depend on our wrappers getting inlined basically all the time. They are simple functions, they do nothing other than 'SELECT func1() AND func2() AND arg1 && arg2'.

However, once costs are added to the parameters, the inlining can be turned off relatively quickly. Here's a PgSQL native example:

    -- Create data table and index. Analyze.
    DROP TABLE IF EXISTS boxen;
    CREATE TABLE boxen AS 
        SELECT row_number() OVER() As gid, 
        box(point(x, y),point(x+1, y+1)) AS b, x, y
        FROM generate_series(-100,100) As y, generate_series(-100,100) As x;
    CREATE INDEX idx_b_geom_gist ON boxen USING gist(b);
    ANALYZE boxen;

    -- An inlined function
    -- When set 'STRICT' it breaks index access
    -- However 'IMMUTABLE' doesn't seem to bother it
    CREATE OR REPLACE FUNCTION good_box(box, box)
        RETURNS boolean
        AS 'SELECT $1 OPERATOR(&&) $2 AND length(lseg(point($1),point($2))) < 3'
        LANGUAGE 'sql';

    -- Start with a low cost circle()
    ALTER FUNCTION circle(point, double precision) COST 1;

    -- [A] Query plan hits index
    EXPLAIN SELECT gid 
    FROM boxen
    WHERE good_box(
        boxen.b, 
        box(circle(point(20.5, 20.5), 2))
        );

    -- [B] Query plan hits index
    EXPLAIN SELECT gid 
    FROM boxen, 
        (SELECT x, y FROM boxen WHERE x < 0 and y < 0) AS c 
    WHERE good_box(
        boxen.b, 
        box(circle(point(c.x, c.y), 2))
        );

    -- Increase cost of circle
    ALTER FUNCTION circle(point, double precision) COST 100;

    -- [B] Query plan does not hit index!
    EXPLAIN SELECT gid 
    FROM boxen, 
        (SELECT x, y FROM boxen WHERE x < 0 and y < 0) AS c 
    WHERE good_box(
        boxen.b, 
        box(circle(point(c.x, c.y), 2))
        );
    
The inlining is getting tossed out on a test of how expensive the function parameters are [2]. As a result, we lose what is really the correct plan, and get a sequence scan instead of an index scan.

The test of parameter cost seems quite old (15+ years) and perhaps didn't anticipate highly variable individual function costs (or maybe it did). As it stands though, PostGIS is currently stuck choosing between having costs on our functions or having our inlined wrappers, because we cannot have both at the same time.

I personally find it hard to make the case for not inlining all the time, but that's probably a lack of imagination. The functions in question pass all the other tests of "inlinability" the function cost one seems arbitrary.

 

Re: Inlining functions with "expensive" parameters

От
Andres Freund
Дата:
Hi,

On 2017-11-16 09:37:29 -0800, Paul Ramsey wrote:
> On Thu, Nov 9, 2017 at 12:11 PM, Paul Ramsey <pramsey@cleverelephant.ca>
> wrote:
> 
> > All,
> >
> > As we try and make PostGIS more "parallel sensitive" we have been added
> > costs to our functions, so that their relative CPU cost is more accurately
> > reflected in parallel plans.
> >
> > This has resulted in an odd side effect: some of our "wrapper" functions
> > stop giving index scans in plans [1]. This is a problem!
> >
> > An example of a "wrapper" is ST_Intersects(geom1, geom2). It combines an
> > index operation (geom1 && geom2) with an exact spatial test
> > (_ST_Intersects(geom1, geom2). This is primarily for user convenience, and
> > has worked for us well for a decade and more. Having this construct stop
> > working is definitely a problem.
> >
> > As we add costs to our functions, the odds increase that one of the
> > parameters to a wrapper might be a costed function. It's not uncommon to
> > see:
> >
> > ST_Interects(geom, ST_SetSRID('POLYGON(...)', 4326))
> >
> > It's fair to say that we really do depend on our wrappers getting inlined
> > basically all the time. They are simple functions, they do nothing other
> > than 'SELECT func1() AND func2() AND arg1 && arg2'.
> >
> > However, once costs are added to the parameters, the inlining can be
> > turned off relatively quickly. Here's a PgSQL native example:
> >
> >     -- Create data table and index. Analyze.
> >     DROP TABLE IF EXISTS boxen;
> >     CREATE TABLE boxen AS
> >         SELECT row_number() OVER() As gid,
> >         box(point(x, y),point(x+1, y+1)) AS b, x, y
> >         FROM generate_series(-100,100) As y, generate_series(-100,100) As
> > x;
> >     CREATE INDEX idx_b_geom_gist ON boxen USING gist(b);
> >     ANALYZE boxen;
> >
> >     -- An inlined function
> >     -- When set 'STRICT' it breaks index access
> >     -- However 'IMMUTABLE' doesn't seem to bother it
> >     CREATE OR REPLACE FUNCTION good_box(box, box)
> >         RETURNS boolean
> >         AS 'SELECT $1 OPERATOR(&&) $2 AND length(lseg(point($1),point($2)))
> > < 3'
> >         LANGUAGE 'sql';
> >
> >     -- Start with a low cost circle()
> >     ALTER FUNCTION circle(point, double precision) COST 1;
> >
> >     -- [A] Query plan hits index
> >     EXPLAIN SELECT gid
> >     FROM boxen
> >     WHERE good_box(
> >         boxen.b,
> >         box(circle(point(20.5, 20.5), 2))
> >         );
> >
> >     -- [B] Query plan hits index
> >     EXPLAIN SELECT gid
> >     FROM boxen,
> >         (SELECT x, y FROM boxen WHERE x < 0 and y < 0) AS c
> >     WHERE good_box(
> >         boxen.b,
> >         box(circle(point(c.x, c.y), 2))
> >         );
> >
> >     -- Increase cost of circle
> >     ALTER FUNCTION circle(point, double precision) COST 100;
> >
> >     -- [B] Query plan does not hit index!
> >     EXPLAIN SELECT gid
> >     FROM boxen,
> >         (SELECT x, y FROM boxen WHERE x < 0 and y < 0) AS c
> >     WHERE good_box(
> >         boxen.b,
> >         box(circle(point(c.x, c.y), 2))
> >         );
> >
> > The inlining is getting tossed out on a test of how expensive the function
> > parameters are [2]. As a result, we lose what is really the correct plan,
> > and get a sequence scan instead of an index scan.
> >
> > The test of parameter cost seems quite old (15+ years) and perhaps didn't
> > anticipate highly variable individual function costs (or maybe it did). As
> > it stands though, PostGIS is currently stuck choosing between having costs
> > on our functions or having our inlined wrappers, because we cannot have
> > both at the same time.
> >
> 
> I personally find it hard to make the case for not inlining all the time,
> but that's probably a lack of imagination. The functions in question pass
> all the other tests of "inlinability" the function cost one seems arbitrary.

Well, it's not a question of cost of the function now? Imagine   SELECT inlineable(something());
if you have 10 references for the parameter inside inlineable(). Then
currently something() would be evaluated 10 times. Which'd quite
possibly be bad.

So I'm not convinced we can just throw out the cost checks and be done
with it.

But what I *am* wondering about, is why we're not handling the
parameters in a different way. Instead of replacing the all parameter
references with the parameter, it shouldn't be too hard to instead
replace them with a new PARAM_EXEC like Param.

Besides reducing the cost barrier for inlining, that should also quite
drastically expand the set of inlinable cases because volatile functions
etc aren't a problem anymore.

Presumably we'd want either to continue replacing single references to
parameters like we currently are. But maybe we could store enough
information and make that just part of expression simplification?

Greetings,

Andres Freund


Re: Inlining functions with "expensive" parameters

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> Well, it's not a question of cost of the function now? Imagine
>     SELECT inlineable(something());
> if you have 10 references for the parameter inside inlineable(). Then
> currently something() would be evaluated 10 times. Which'd quite
> possibly be bad.

Right.  I kind of thought we only worried about that if the parameter
was referenced more than once, but I might be wrong.

> But what I *am* wondering about, is why we're not handling the
> parameters in a different way. Instead of replacing the all parameter
> references with the parameter, it shouldn't be too hard to instead
> replace them with a new PARAM_EXEC like Param.

Yeah, there's no mechanism like that now, but there could be.  I wonder
if we could connect that to the work that was being done for caching
nonvolatile subexpressions --- it feels like much the same problem.
        regards, tom lane


Re: Inlining functions with "expensive" parameters

От
Andres Freund
Дата:

On November 16, 2017 11:44:52 AM PST, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>Andres Freund <andres@anarazel.de> writes:
>> Well, it's not a question of cost of the function now? Imagine
>>     SELECT inlineable(something());
>> if you have 10 references for the parameter inside inlineable(). Then
>> currently something() would be evaluated 10 times. Which'd quite
>> possibly be bad.
>
>Right.  I kind of thought we only worried about that if the parameter
>was referenced more than once, but I might be wrong.

You're not.


>> But what I *am* wondering about, is why we're not handling the
>> parameters in a different way. Instead of replacing the all parameter
>> references with the parameter, it shouldn't be too hard to instead
>> replace them with a new PARAM_EXEC like Param.
>
>Yeah, there's no mechanism like that now, but there could be.

Right, but it doesn't sound that hard to introduce. Basically there'd need to be a WithParamValue node,  that first
evaluatesparameters and then executes the child expression. I'm thinking of doing this hierarchically so there's less
issueswith the setting of the param value being moved away from the child expression using it. 

> I wonder
>if we could connect that to the work that was being done for caching
>nonvolatile subexpressions --- it feels like much the same problem.

Yes, that sounds like it could be related. Looks like the expression representation and the execution side of things
shouldbe the same. 

Andres

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.


Re: Inlining functions with "expensive" parameters

От
Andres Freund
Дата:

On November 16, 2017 11:44:52 AM PST, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>Andres Freund <andres@anarazel.de> writes:
>> Well, it's not a question of cost of the function now? Imagine
>>     SELECT inlineable(something());
>> if you have 10 references for the parameter inside inlineable(). Then
>> currently something() would be evaluated 10 times. Which'd quite
>> possibly be bad.
>
>Right.  I kind of thought we only worried about that if the parameter
>was referenced more than once, but I might be wrong.

You're not.


>> But what I *am* wondering about, is why we're not handling the
>> parameters in a different way. Instead of replacing the all parameter
>> references with the parameter, it shouldn't be too hard to instead
>> replace them with a new PARAM_EXEC like Param.
>
>Yeah, there's no mechanism like that now, but there could be.

Right, but it doesn't sound that hard to introduce. Basically there'd need to be a WithParamValue node,  that first
evaluatesparameters and then executes the child expression. I'm thinking of doing this hierarchically so there's less
issueswith the setting of the param value being moved away from the child expression using it. 

> I wonder
>if we could connect that to the work that was being done for caching
>nonvolatile subexpressions --- it feels like much the same problem.

Yes, that sounds like it could be related. Looks like the expression representation and the execution side of things
shouldbe the same. 

Andres

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.


Re: Inlining functions with "expensive" parameters

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> On November 16, 2017 11:44:52 AM PST, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Yeah, there's no mechanism like that now, but there could be.

> Right, but it doesn't sound that hard to introduce. Basically there'd need to be a WithParamValue node,  that first
evaluatesparameters and then executes the child expression. I'm thinking of doing this hierarchically so there's less
issueswith the setting of the param value being moved away from the child expression using it. 

Yeah.  If you also gave it the ability to optionally enforce strictness
(ie, skip child expr and return NULL if any param is NULL) then we could
do away with all of the parameter-based restrictions on inlining, since
the semantics of parameter eval wouldn't change by inlining.

I might be showing my grad school background here, but I'd be inclined to
call it a LambdaExpr.  A name based on "with" would be fine in a green
field, but IMO we've got too much baggage from nodes related to SQL WITH.
        regards, tom lane


Re: Inlining functions with "expensive" parameters

От
Andres Freund
Дата:
Hi,

On 2017-11-16 15:27:59 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On November 16, 2017 11:44:52 AM PST, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Yeah, there's no mechanism like that now, but there could be.
> 
> > Right, but it doesn't sound that hard to introduce. Basically there'd need to be a WithParamValue node,  that first
evaluatesparameters and then executes the child expression. I'm thinking of doing this hierarchically so there's less
issueswith the setting of the param value being moved away from the child expression using it.
 
> 
> Yeah.  If you also gave it the ability to optionally enforce strictness
> (ie, skip child expr and return NULL if any param is NULL) then we could
> do away with all of the parameter-based restrictions on inlining, since
> the semantics of parameter eval wouldn't change by inlining.

Yep. And we quite easily can have a fastpath execution step that skips
these checks if no needed.

I suspect there might still be some cases where it's worth either using
the current parameter replacement, or rely on eval_const_expressions
based infrastructure to directly "inline" reference parameters if
safe. The latter seems a bit nicer / more extensible.


> I might be showing my grad school background here, but I'd be inclined to
> call it a LambdaExpr.  A name based on "with" would be fine in a green
> field, but IMO we've got too much baggage from nodes related to SQL WITH.

That'd work as well for me.

Greetings,

Andres Freund


Re: Inlining functions with "expensive" parameters

От
Paul Ramsey
Дата:


On Thu, Nov 16, 2017 at 12:37 PM, Andres Freund <andres@anarazel.de> wrote:
Hi,

On 2017-11-16 15:27:59 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On November 16, 2017 11:44:52 AM PST, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Yeah, there's no mechanism like that now, but there could be.
>
> > Right, but it doesn't sound that hard to introduce. Basically there'd need to be a WithParamValue node,  that first evaluates parameters and then executes the child expression. I'm thinking of doing this hierarchically so there's less issues with the setting of the param value being moved away from the child expression using it.
>
> Yeah.  If you also gave it the ability to optionally enforce strictness
> (ie, skip child expr and return NULL if any param is NULL) then we could
> do away with all of the parameter-based restrictions on inlining, since
> the semantics of parameter eval wouldn't change by inlining.

Yep. And we quite easily can have a fastpath execution step that skips
these checks if no needed.

I suspect there might still be some cases where it's worth either using
the current parameter replacement, or rely on eval_const_expressions
based infrastructure to directly "inline" reference parameters if
safe. The latter seems a bit nicer / more extensible.


> I might be showing my grad school background here, but I'd be inclined to
> call it a LambdaExpr.  A name based on "with" would be fine in a green
> field, but IMO we've got too much baggage from nodes related to SQL WITH.

That'd work as well for me.

Is there a github branch or an active patch set where this work is going on that I could watch and learn from? 
Thanks!
P
 
Greetings,

Andres Freund

Re: Inlining functions with "expensive" parameters

От
Andres Freund
Дата:

On November 17, 2017 5:15:57 AM PST, Paul Ramsey <pramsey@cleverelephant.ca> wrote:
>On Thu, Nov 16, 2017 at 12:37 PM, Andres Freund <andres@anarazel.de>
>wrote:
>
>> Hi,
>>
>> On 2017-11-16 15:27:59 -0500, Tom Lane wrote:
>> > Andres Freund <andres@anarazel.de> writes:
>> > > On November 16, 2017 11:44:52 AM PST, Tom Lane
><tgl@sss.pgh.pa.us>
>> wrote:
>> > >> Yeah, there's no mechanism like that now, but there could be.
>> >
>> > > Right, but it doesn't sound that hard to introduce. Basically
>there'd
>> need to be a WithParamValue node,  that first evaluates parameters
>and then
>> executes the child expression. I'm thinking of doing this
>hierarchically so
>> there's less issues with the setting of the param value being moved
>away
>> from the child expression using it.
>> >
>> > Yeah.  If you also gave it the ability to optionally enforce
>strictness
>> > (ie, skip child expr and return NULL if any param is NULL) then we
>could
>> > do away with all of the parameter-based restrictions on inlining,
>since
>> > the semantics of parameter eval wouldn't change by inlining.
>>
>> Yep. And we quite easily can have a fastpath execution step that
>skips
>> these checks if no needed.
>>
>> I suspect there might still be some cases where it's worth either
>using
>> the current parameter replacement, or rely on eval_const_expressions
>> based infrastructure to directly "inline" reference parameters if
>> safe. The latter seems a bit nicer / more extensible.
>>
>>
>> > I might be showing my grad school background here, but I'd be
>inclined to
>> > call it a LambdaExpr.  A name based on "with" would be fine in a
>green
>> > field, but IMO we've got too much baggage from nodes related to SQL
>WITH.
>>
>> That'd work as well for me.
>>
>
>Is there a github branch or an active patch set where this work is
>going on
>that I could watch and learn from?
>Thanks!

Oh, I wasn't planning to work in this anytime soon. From my end this was just spitballing how a solution might look
like.

If you want to take a stab...

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.


Re: Inlining functions with "expensive" parameters

От
Robert Haas
Дата:
On Thu, Nov 16, 2017 at 2:51 PM, Andres Freund <andres@anarazel.de> wrote:
> Right, but it doesn't sound that hard to introduce. Basically there'd need to be a WithParamValue node,  that first
evaluatesparameters and then executes the child expression. I'm thinking of doing this hierarchically so there's less
issueswith the setting of the param value being moved away from the child expression using it. 

I don't quite follow the need for this.  I mean, if we just stick a
Param reference in there and create a corresponding InitPlan, the
Param will be evaluated on demand, right?  Is the point of the new
node to make sure that the Param gets re-evaluated when needed?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Inlining functions with "expensive" parameters

От
Robert Haas
Дата:
On Thu, Nov 16, 2017 at 2:51 PM, Andres Freund <andres@anarazel.de> wrote:
> Right, but it doesn't sound that hard to introduce. Basically there'd need to be a WithParamValue node,  that first
evaluatesparameters and then executes the child expression. I'm thinking of doing this hierarchically so there's less
issueswith the setting of the param value being moved away from the child expression using it. 

I don't quite follow the need for this.  I mean, if we just stick a
Param reference in there and create a corresponding InitPlan, the
Param will be evaluated on demand, right?  Is the point of the new
node to make sure that the Param gets re-evaluated when needed?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Inlining functions with "expensive" parameters

От
Tom Lane
Дата:
Robert Haas <robertmhaas@gmail.com> writes:
> On Thu, Nov 16, 2017 at 2:51 PM, Andres Freund <andres@anarazel.de> wrote:
>> Right, but it doesn't sound that hard to introduce. Basically there'd
>> need to be a WithParamValue node,  that first evaluates parameters and
>> then executes the child expression.

> I don't quite follow the need for this.  I mean, if we just stick a
> Param reference in there and create a corresponding InitPlan, the
> Param will be evaluated on demand, right?  Is the point of the new
> node to make sure that the Param gets re-evaluated when needed?

Yeah.  By default, an initplan is only going to be evaluated once per
query, not once per row.  It's possible that you could set up Param
dependencies to force the correct semantics, but that would be more
complicated, and probably less performant, than introducing a new
expression node type.  Also, I'm dubious that it could be made to work
at all for standalone expression evaluation; Param-driven re-evaluation
is only considered when running a plan tree.
        regards, tom lane


Re: Inlining functions with "expensive" parameters

От
Tom Lane
Дата:
I wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> I don't quite follow the need for this.  I mean, if we just stick a
>> Param reference in there and create a corresponding InitPlan, the
>> Param will be evaluated on demand, right?  Is the point of the new
>> node to make sure that the Param gets re-evaluated when needed?

> Yeah.  By default, an initplan is only going to be evaluated once per
> query, not once per row.  It's possible that you could set up Param
> dependencies to force the correct semantics, but that would be more
> complicated, and probably less performant, than introducing a new
> expression node type.  Also, I'm dubious that it could be made to work
> at all for standalone expression evaluation; Param-driven re-evaluation
> is only considered when running a plan tree.

BTW, thinking about that a bit more, standalone expressions (eg CHECK
constraints) are possibly going to need some work anyway to make this go
through.  I do not think there's any mechanism in place at the moment
to set up a ParamListInfo array for such an expression.  While it's
conceivable that the expression eval machinery could be set up to not
need one, I'm not real sure, if we try to make these things be
PARAM_EXEC Params.

This suggests that maybe the Params used for LambdaExprs should be a
distinct kind of Param, not the same PARAM_EXEC type that's used for
Params supplied externally to eval of a particular expression.  That
would reflect the fact that expression eval handles them completely
internally.

I'm also wondering about folding CaseTestExpr and CoerceToDomainValue
into the same mechanism.  It's not very hard to see those cases as
being the same as a function-based lambda.
        regards, tom lane


Re: Inlining functions with "expensive" parameters

От
Andres Freund
Дата:
On 2017-11-21 09:59:00 -0500, Robert Haas wrote:
> On Thu, Nov 16, 2017 at 2:51 PM, Andres Freund <andres@anarazel.de> wrote:
> > Right, but it doesn't sound that hard to introduce. Basically there'd need to be a WithParamValue node,  that first
evaluatesparameters and then executes the child expression. I'm thinking of doing this hierarchically so there's less
issueswith the setting of the param value being moved away from the child expression using it.
 
> 
> I don't quite follow the need for this.  I mean, if we just stick a
> Param reference in there and create a corresponding InitPlan, the
> Param will be evaluated on demand, right?  Is the point of the new
> node to make sure that the Param gets re-evaluated when needed?

It'll work in some of those cases. But you e.g. can't use an InitPlan to
reference to a table's columns without some major contortions, no?
Inlining stuff like SELECT * FROM foo WHERE sql_func(foo.blarg, anotherfunc());
is pretty important for some usecases. The re-evaluation concern's also
there, this should also work with e.g. volatile parameters like
nextval(), without establishing a second mechanism to deal with them.

Greetings,

Andres Freund


Re: Inlining functions with "expensive" parameters

От
Andres Freund
Дата:
On 2017-11-21 09:59:00 -0500, Robert Haas wrote:
> On Thu, Nov 16, 2017 at 2:51 PM, Andres Freund <andres@anarazel.de> wrote:
> > Right, but it doesn't sound that hard to introduce. Basically there'd need to be a WithParamValue node,  that first
evaluatesparameters and then executes the child expression. I'm thinking of doing this hierarchically so there's less
issueswith the setting of the param value being moved away from the child expression using it.
 
> 
> I don't quite follow the need for this.  I mean, if we just stick a
> Param reference in there and create a corresponding InitPlan, the
> Param will be evaluated on demand, right?  Is the point of the new
> node to make sure that the Param gets re-evaluated when needed?

It'll work in some of those cases. But you e.g. can't use an InitPlan to
reference to a table's columns without some major contortions, no?
Inlining stuff like SELECT * FROM foo WHERE sql_func(foo.blarg, anotherfunc());
is pretty important for some usecases. The re-evaluation concern's also
there, this should also work with e.g. volatile parameters like
nextval(), without establishing a second mechanism to deal with them.

Greetings,

Andres Freund


Re: Inlining functions with "expensive" parameters

От
Andres Freund
Дата:
Hi,

On 2017-11-21 10:16:58 -0500, Tom Lane wrote:
> I'm also wondering about folding CaseTestExpr and CoerceToDomainValue
> into the same mechanism.  It's not very hard to see those cases as
> being the same as a function-based lambda.

Yea, that'd be good. The current mechanisms is uh, historically grown.

- Andres


Re: Inlining functions with "expensive" parameters

От
Andres Freund
Дата:
Hi,

On 2017-11-21 10:16:58 -0500, Tom Lane wrote:
> I'm also wondering about folding CaseTestExpr and CoerceToDomainValue
> into the same mechanism.  It's not very hard to see those cases as
> being the same as a function-based lambda.

Yea, that'd be good. The current mechanisms is uh, historically grown.

- Andres