Обсуждение: Aggregate functions with FROM clause and ROW_COUNT diagnostics
Hi, I'm seeing somewhat confusing results here with 9.6.8, and cannot find the answer in the docs or google. I'm returning JSON array (or any array, it does not make a difference) from my plpgsql function like this: OUT retcode int, OUT result json) . . . result := json_agg(_) FROM ( SELECT foo, bar, baz ... FROM t1, t2, t3 WHERE ...) AS _; -- this works fine GET DIAGNOSTICS retcode = ROW_COUNT; -- always returns 1 I'd expected `retcode' to contain the number of SELECT'ed rows, but it is something else (always 1). Apparently, aggregation functions like json_agg()/array_agg() mangle the ROW_COUNT from the inner SELECT (the one I'm interested in). Is this expected and correct behavior? Is it possible to obtain the first ROW_COUNT (after SELECT) without performing it twice? Thanks, ./danfe
result := json_agg(_) FROM (
SELECT foo, bar, baz ...
FROM t1, t2, t3 WHERE ...) AS _; -- this works fine
GET DIAGNOSTICS retcode = ROW_COUNT; -- always returns 1
I'd expected `retcode' to contain the number of SELECT'ed rows, but it
is something else (always 1). Apparently, aggregation functions like
json_agg()/array_agg() mangle the ROW_COUNT from the inner SELECT (the
one I'm interested in).
Is this expected and correct behavior?
Yes, the query returned only one row, with a single json column. You wrote the equivalent of:
SELECT json_agg(...) FROM ... INTO result;
And you are getting the count of the top-most select (which is implied in the syntax that you used).
Is it possible to obtain the
first ROW_COUNT (after SELECT) without performing it twice?
Not directly, no. You should execute the inner query to a temporary table than perform your counting and json_agg from that.
David J.
On 05/21/2018 05:54 AM, Alexey Dokuchaev wrote: > Hi, > > I'm seeing somewhat confusing results here with 9.6.8, and cannot find > the answer in the docs or google. > > I'm returning JSON array (or any array, it does not make a difference) > from my plpgsql function like this: > > OUT retcode int, > OUT result json) > . . . > result := json_agg(_) FROM ( > SELECT foo, bar, baz ... > FROM t1, t2, t3 WHERE ...) AS _; -- this works fine > > GET DIAGNOSTICS retcode = ROW_COUNT; -- always returns 1 > > I'd expected `retcode' to contain the number of SELECT'ed rows, but it > is something else (always 1). Apparently, aggregation functions like > json_agg()/array_agg() mangle the ROW_COUNT from the inner SELECT (the > one I'm interested in). > > Is this expected and correct behavior? Is it possible to obtain the > first ROW_COUNT (after SELECT) without performing it twice? Thanks, Off the top of my head: SELECT count(*) as ct, foo, bar, baz ... retcode = result ->'ct' > > ./danfe > > -- Adrian Klaver adrian.klaver@aklaver.com
"David G. Johnston" <david.g.johnston@gmail.com> writes: >> Is it possible to obtain the >> first ROW_COUNT (after SELECT) without performing it twice? > Not directly, no. You should execute the inner query to a temporary table > than perform your counting and json_agg from that. OP could do something like declare r record; ... select json_agg(_) as j, count(*) as c INTO r FROM ( SELECT foo, bar, baz ... FROM t1, t2, t3 WHERE ...) AS _; This would be slightly more expensive than doing only the one aggregate, but it should beat anything involving a temp table. regards, tom lane
Alexey Dokuchaev <danfe@nsu.ru> writes: > On Mon, May 21, 2018 at 10:45:27AM -0400, Tom Lane wrote: >> OP could do something like >> select json_agg(_) as j, count(*) as c INTO r FROM ( > Thanks, I've arrived at the same solution (using local RECORD) eventually. > It works as intended, but I still need to assign OUT parameters by hand: > retcode := tmp.c; > result := tmp.j; Actually, it should work to do select json_agg(_), count(*) INTO result, retcode FROM ... regards, tom lane
On Mon, May 21, 2018 at 10:45:27AM -0400, Tom Lane wrote: > OP could do something like > > declare r record; > ... > select json_agg(_) as j, count(*) as c INTO r FROM ( > SELECT foo, bar, baz ... > FROM t1, t2, t3 WHERE ...) AS _; > > This would be slightly more expensive than doing only the one aggregate, > but it should beat anything involving a temp table. Thanks, I've arrived at the same solution (using local RECORD) eventually. It works as intended, but I still need to assign OUT parameters by hand: retcode := tmp.c; result := tmp.j; I'd love to get rid of OUT parameters and return RECORD directly (it also would make the code more "functional" in style), but then I'd be getting this annoying ``a column definition list is required for functions returning "record"'' error which I don't know how to avoid. Using OUT's simplifies things for callers (they don't have to annotate the types by hand) while still preserving type safety. On Mon, May 21, 2018 at 07:30:44AM -0700, David G. Johnston wrote: > > ... > > Is this expected and correct behavior? > > Yes, the query returned only one row, with a single json column. You > wrote the equivalent of: > > SELECT json_agg(...) FROM ... INTO result; > > And you are getting the count of the top-most select (which is implied > in the syntax that you used). I see, thanks; I've neglected the fact that there's still another implicit SELECT being involved. ./danfe
Alexey Dokuchaev <danfe@nsu.ru> writes: > Quick reality check > question: are count(*) vs. count(_) equivalent above? Only if _ is guaranteed non-null ... which, as a rowtype result, it probably is. But I'd use count(*) if you're concerned about speed. regards, tom lane
On Mon, May 21, 2018 at 12:20:52PM -0400, Tom Lane wrote: > Alexey Dokuchaev <danfe@nsu.ru> writes: > > On Mon, May 21, 2018 at 10:45:27AM -0400, Tom Lane wrote: > >> OP could do something like > >> select json_agg(_) as j, count(*) as c INTO r FROM ( > > > Thanks, I've arrived at the same solution (using local RECORD) eventually. > > It works as intended, but I still need to assign OUT parameters by hand: > > > retcode := tmp.c; > > result := tmp.j; > > Actually, it should work to do > > select json_agg(_), count(*) INTO result, retcode FROM ... Bingo! Thanks Tom, you're the best (as always). Quick reality check question: are count(*) vs. count(_) equivalent above? I vaguely recall that count(1) got somewhat slower than count(*) after 8.2, but cannot back it up right now. ./danfe
On Mon, May 21, 2018 at 12:54:56PM -0400, Tom Lane wrote: > Alexey Dokuchaev <danfe@nsu.ru> writes: > > Quick reality check question: are count(*) vs. count(_) equivalent > > above? > > Only if _ is guaranteed non-null ... which, as a rowtype result, it > probably is. But I'd use count(*) if you're concerned about speed. Understood. Yes, my primary concern (after correctness) is speed. Thank you all again for very helpful and prompt replies guys. ./danfe