Обсуждение: query - laziness of lateral join with function

Поиск
Список
Период
Сортировка

query - laziness of lateral join with function

От
paulcc
Дата:
Hi

I'm using cross join lateral with a non-trivial function in
an attempt to limit calculation of that function, and am
wondering about some aspects of how lateral is currently
implemented.

NB these queries are generated by a certain ORM, and are
usually embedded in much more complex queries...


Case one: counting

   select count(alpha.id)
   from alpha
   cross join lateral some_function(alpha.id) as some_val
   where alpha.test

   Here the function is strict, and moreover its argument will never
   be null - hence there should always be a non-null value returned.

   I would expect that since the function doesn't impact on the
   number of rows (always one value returned for each row in alpha),
   then I'd hope the function is never called. EXPLAIN shows it being
   called for each row in the main table.


Case two: pagination

   select alpha.*, some_val
   from alpha
   cross join lateral some_function(alpha.id) as some_val
   where alpha.test
   order by alpha.name asc
   limit 100 offset 100

   Same setup as above, and I'd expect that the ordering and
   selection of rows can be done first and the function only
   called on the rows that get selected. Again, EXPLAIN shows
   otherwise.




So: am I expecting too much for LATERAL, or have I missed a
trick somewhere?

Many thanks in advance!

Paul




--
View this message in context: http://postgresql.nabble.com/query-laziness-of-lateral-join-with-function-tp5837706.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.


Re: query - laziness of lateral join with function

От
Tom Lane
Дата:
paulcc <paulcc.two@gmail.com> writes:
>    select count(alpha.id)
>    from alpha
>    cross join lateral some_function(alpha.id) as some_val
>    where alpha.test

>    Here the function is strict, and moreover its argument will never
>    be null - hence there should always be a non-null value returned.

>    I would expect that since the function doesn't impact on the
>    number of rows (always one value returned for each row in alpha),
>    then I'd hope the function is never called. EXPLAIN shows it being
>    called for each row in the main table.

You're out of luck on that one at the moment, although testing it on
HEAD suggests that commit 55d5b3c08279b487cfa44d4b6e6eea67a0af89e4
might have fixed it for you in future releases.

>    select alpha.*, some_val
>    from alpha
>    cross join lateral some_function(alpha.id) as some_val
>    where alpha.test
>    order by alpha.name asc
>    limit 100 offset 100

>    Same setup as above, and I'd expect that the ordering and
>    selection of rows can be done first and the function only
>    called on the rows that get selected.

The planner might produce such a result if there's an opportunity
to perform the sorting via an index on "alpha" (ie, the ORDER BY
matches some index).  If it has to do an explicit sort it's gonna
do the join first.

(If you have such an index, and it's not going for the plan you want,
you might need to crank up the COST property of some_function to
persuade the planner that it should try to minimize the number of calls
even if that means a slower scan choice.)

In both cases though, I rather wonder why you're using LATERAL at all, as
opposed to just calling the function in the main query when you want its
result.  The query planner can't be expected to make up for arbitrary
amounts of stupidity in the formulation of the submitted query.

            regards, tom lane


Re: query - laziness of lateral join with function

От
David G Johnston
Дата:
Tom Lane-2 wrote
> paulcc <

> paulcc.two@

> > writes:
>>    select count(alpha.id)
>>    from alpha
>>    cross join lateral some_function(alpha.id) as some_val
>>    where alpha.test
>
>>    Here the function is strict, and moreover its argument will never
>>    be null - hence there should always be a non-null value returned.
>
> In both cases though, I rather wonder why you're using LATERAL at all, as
> opposed to just calling the function in the main query when you want its
> result.  The query planner can't be expected to make up for arbitrary
> amounts of stupidity in the formulation of the submitted query.

I'm trying to answer this with a bit more detail but cannot because the OP
provided too little information which is then causing Tom to make
assumptions.  I'm not sure to what degree the ORM is being stupid here since
I do not know why it thinks LATERAL is more appropriate than a select-list
function call for a non-SRF function (which I have to presume this is, but
it is not stated).

With respect to "the function will never return NULL": this is not the
issue.  The issue is that the function could return nothing (i.e., zero
records) in which case the CROSS JOIN would suppress the corresponding
correlated row from the result.

Non-SRF functions are more easily used within the select-list of the query
instead of attached to a LATERAL clause; the only issue there is when the
function returns a composite and you try to immediately explode it into its
constituent parts - the function will be evaluated multiple times.  I'm not
sure if that is what Tom is saying above but the combination of that
limitation and limited optimizations if the function is in LATERAL seems to
be in conflict here.

There has been a recent uptick in interest in making PostgreSQL more ORM
friendly (i.e., more able to simply ignore stuff that is added to the query
even though a particular call doesn't actually need it) but I haven't seen
anyone looking into LATERAL.  More detailed reports may at least bring
exposure to what is being used in the wild and garner interest from other
parties in improving things.  Unfortunately this report is too limited to
really make a dent; lacking even the name of the ORM that is being used and
the entire queries that are being generated - and why.

David J.




--
View this message in context:
http://postgresql.nabble.com/query-laziness-of-lateral-join-with-function-tp5837706p5837735.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.


Re: query - laziness of lateral join with function

От
Paul Callaghan
Дата:


On Feb 12, 2015 9:17 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

> The planner might produce such a result if there's an opportunity
> to perform the sorting via an index on "alpha" (ie, the ORDER BY
> matches some index).  If it has to do an explicit sort it's gonna
> do the join first.
>
> (If you have such an index, and it's not going for the plan you want,
> you might need to crank up the COST property of some_function to
> persuade the planner that it should try to minimize the number of calls
> even if that means a slower scan choice.)
>
> In both cases though, I rather wonder why you're using LATERAL at all, as
> opposed to just calling the function in the main query when you want its
> result.  The query planner can't be expected to make up for arbitrary
> amounts of stupidity in the formulation of the submitted query.

Useful, many thanks. I'll try playing with cost changes and a more targeted index.

In my real code, the function actually returns a json hash from which several fields are extracted in the main select (why? Unpacking in the query saves some hassle in the app code...) So my plan was to use lateral to limit function calls to max once per row. Is there a better way, other than using a nested query?