Обсуждение: Aggregate functions with FROM clause and ROW_COUNT diagnostics

Поиск
Список
Период
Сортировка

Aggregate functions with FROM clause and ROW_COUNT diagnostics

От
Alexey Dokuchaev
Дата:
Hi,

I'm seeing somewhat confusing results here with 9.6.8, and cannot find
the answer in the docs or google.

I'm returning JSON array (or any array, it does not make a difference)
from my plpgsql function like this:

    OUT retcode int,
    OUT result json)
    . . .
    result := json_agg(_) FROM (
      SELECT foo, bar, baz ...
      FROM t1, t2, t3 WHERE ...) AS _;        -- this works fine

    GET DIAGNOSTICS retcode = ROW_COUNT;    -- always returns 1

I'd expected `retcode' to contain the number of SELECT'ed rows, but it
is something else (always 1).  Apparently, aggregation functions like
json_agg()/array_agg() mangle the ROW_COUNT from the inner SELECT (the
one I'm interested in).

Is this expected and correct behavior?  Is it possible to obtain the
first ROW_COUNT (after SELECT) without performing it twice?  Thanks,

./danfe


Re: Aggregate functions with FROM clause and ROW_COUNT diagnostics

От
"David G. Johnston"
Дата:
On Mon, May 21, 2018 at 5:54 AM, Alexey Dokuchaev <danfe@nsu.ru> wrote:
    result := json_agg(_) FROM (
      SELECT foo, bar, baz ...
      FROM t1, t2, t3 WHERE ...) AS _;          -- this works fine

    GET DIAGNOSTICS retcode = ROW_COUNT;        -- always returns 1

I'd expected `retcode' to contain the number of SELECT'ed rows, but it
is something else (always 1).  Apparently, aggregation functions like
json_agg()/array_agg() mangle the ROW_COUNT from the inner SELECT (the
one I'm interested in).

Is this expected and correct behavior?

​Yes, the query returned only one row, with a single json column.  You wrote the equivalent of:

SELECT json_agg(...) FROM ... INTO result;

And you are getting the count of the top-most select (which is implied in the syntax that you used).
 
  Is it possible to obtain the
first ROW_COUNT (after SELECT) without performing it twice?

​Not directly, no.  You should execute the inner query to a temporary table than perform your counting and json_agg from that.

David J.

Re: Aggregate functions with FROM clause and ROW_COUNT diagnostics

От
Adrian Klaver
Дата:
On 05/21/2018 05:54 AM, Alexey Dokuchaev wrote:
> Hi,
> 
> I'm seeing somewhat confusing results here with 9.6.8, and cannot find
> the answer in the docs or google.
> 
> I'm returning JSON array (or any array, it does not make a difference)
> from my plpgsql function like this:
> 
>      OUT retcode int,
>      OUT result json)
>      . . .
>      result := json_agg(_) FROM (
>        SELECT foo, bar, baz ...
>        FROM t1, t2, t3 WHERE ...) AS _;        -- this works fine
> 
>      GET DIAGNOSTICS retcode = ROW_COUNT;    -- always returns 1
> 
> I'd expected `retcode' to contain the number of SELECT'ed rows, but it
> is something else (always 1).  Apparently, aggregation functions like
> json_agg()/array_agg() mangle the ROW_COUNT from the inner SELECT (the
> one I'm interested in).
> 
> Is this expected and correct behavior?  Is it possible to obtain the
> first ROW_COUNT (after SELECT) without performing it twice?  Thanks,

Off the top of my head:

  SELECT count(*) as ct, foo, bar, baz ...

retcode = result ->'ct'

> 
> ./danfe
> 
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com


Re: Aggregate functions with FROM clause and ROW_COUNT diagnostics

От
Tom Lane
Дата:
"David G. Johnston" <david.g.johnston@gmail.com> writes:
>> Is it possible to obtain the
>> first ROW_COUNT (after SELECT) without performing it twice?

> ​Not directly, no.  You should execute the inner query to a temporary table
> than perform your counting and json_agg from that.

OP could do something like

        declare r record;
        ...
        select json_agg(_) as j, count(*) as c INTO r FROM (
          SELECT foo, bar, baz ...
          FROM t1, t2, t3 WHERE ...) AS _;

This would be slightly more expensive than doing only the one aggregate,
but it should beat anything involving a temp table.

            regards, tom lane


Re: Aggregate functions with FROM clause and ROW_COUNT diagnostics

От
Tom Lane
Дата:
Alexey Dokuchaev <danfe@nsu.ru> writes:
> On Mon, May 21, 2018 at 10:45:27AM -0400, Tom Lane wrote:
>> OP could do something like
>>     select json_agg(_) as j, count(*) as c INTO r FROM (

> Thanks, I've arrived at the same solution (using local RECORD) eventually.
> It works as intended, but I still need to assign OUT parameters by hand:

>     retcode := tmp.c;
>     result := tmp.j;

Actually, it should work to do

    select json_agg(_), count(*) INTO result, retcode FROM ...

            regards, tom lane


Re: Aggregate functions with FROM clause and ROW_COUNT diagnostics

От
Alexey Dokuchaev
Дата:
On Mon, May 21, 2018 at 10:45:27AM -0400, Tom Lane wrote:
> OP could do something like
> 
>     declare r record;
>     ...
>     select json_agg(_) as j, count(*) as c INTO r FROM (
>       SELECT foo, bar, baz ...
>       FROM t1, t2, t3 WHERE ...) AS _;
> 
> This would be slightly more expensive than doing only the one aggregate,
> but it should beat anything involving a temp table.

Thanks, I've arrived at the same solution (using local RECORD) eventually.
It works as intended, but I still need to assign OUT parameters by hand:

    retcode := tmp.c;
    result := tmp.j;

I'd love to get rid of OUT parameters and return RECORD directly (it also
would make the code more "functional" in style), but then I'd be getting
this annoying ``a column definition list is required for functions
returning "record"'' error which I don't know how to avoid.  Using OUT's
simplifies things for callers (they don't have to annotate the types by
hand) while still preserving type safety.

On Mon, May 21, 2018 at 07:30:44AM -0700, David G. Johnston wrote:
> > ...
> > Is this expected and correct behavior?
>
> Yes, the query returned only one row, with a single json column.  You
> wrote the equivalent of:
>
>     SELECT json_agg(...) FROM ... INTO result;
>
> And you are getting the count of the top-most select (which is implied
> in the syntax that you used).

I see, thanks; I've neglected the fact that there's still another implicit
SELECT being involved.

./danfe


Re: Aggregate functions with FROM clause and ROW_COUNT diagnostics

От
Tom Lane
Дата:
Alexey Dokuchaev <danfe@nsu.ru> writes:
> Quick reality check
> question: are count(*) vs. count(_) equivalent above?

Only if _ is guaranteed non-null ... which, as a rowtype result, it
probably is.  But I'd use count(*) if you're concerned about speed.

            regards, tom lane


Re: Aggregate functions with FROM clause and ROW_COUNT diagnostics

От
Alexey Dokuchaev
Дата:
On Mon, May 21, 2018 at 12:20:52PM -0400, Tom Lane wrote:
> Alexey Dokuchaev <danfe@nsu.ru> writes:
> > On Mon, May 21, 2018 at 10:45:27AM -0400, Tom Lane wrote:
> >> OP could do something like
> >>     select json_agg(_) as j, count(*) as c INTO r FROM (
> 
> > Thanks, I've arrived at the same solution (using local RECORD) eventually.
> > It works as intended, but I still need to assign OUT parameters by hand:
> 
> >     retcode := tmp.c;
> >     result := tmp.j;
> 
> Actually, it should work to do
> 
>     select json_agg(_), count(*) INTO result, retcode FROM ...

Bingo!  Thanks Tom, you're the best (as always).  Quick reality check
question: are count(*) vs. count(_) equivalent above?  I vaguely recall
that count(1) got somewhat slower than count(*) after 8.2, but cannot
back it up right now.

./danfe


Re: Aggregate functions with FROM clause and ROW_COUNT diagnostics

От
Alexey Dokuchaev
Дата:
On Mon, May 21, 2018 at 12:54:56PM -0400, Tom Lane wrote:
> Alexey Dokuchaev <danfe@nsu.ru> writes:
> > Quick reality check question: are count(*) vs. count(_) equivalent
> > above?
> 
> Only if _ is guaranteed non-null ... which, as a rowtype result, it
> probably is.  But I'd use count(*) if you're concerned about speed.

Understood.  Yes, my primary concern (after correctness) is speed.  Thank
you all again for very helpful and prompt replies guys.

./danfe