Re: explain HashAggregate to report bucket and memory stats

Поиск
Список
Период
Сортировка
От Justin Pryzby
Тема Re: explain HashAggregate to report bucket and memory stats
Дата
Msg-id 20200216000220.GF31889@telsasoft.com
обсуждение исходный текст
Ответ на Re: explain HashAggregate to report bucket and memory stats  (Andres Freund <andres@anarazel.de>)
Ответы Re: explain HashAggregate to report bucket and memory stats  (Justin Pryzby <pryzby@telsasoft.com>)
Список pgsql-hackers
On Mon, Feb 03, 2020 at 06:53:01AM -0800, Andres Freund wrote:
> On 2020-01-03 10:19:26 -0600, Justin Pryzby wrote:
> > On Sun, Feb 17, 2019 at 11:29:56AM -0500, Jeff Janes wrote:
> > https://www.postgresql.org/message-id/CAMkU%3D1zBJNVo2DGYBgLJqpu8fyjCE_ys%2Bmsr6pOEoiwA7y5jrA%40mail.gmail.com
> > > What would I find very useful is [...] if the HashAggregate node under
> > > "explain analyze" would report memory and bucket stats; and if the Aggregate
> > > node would report...anything.
> 
> Yea, that'd be amazing. It probably should be something every
> execGrouping.c using node can opt into.

Do you think it should be implemented in execGrouping/TupleHashTableData (as I
did) ?  I also did an experiment moving into the higher level nodes, but I
guess that's not actually desirable.  There's currently different output from
tests between the implementation using execGrouping.c and the one outside it,
so there's at least an issue with grouping sets.

> > +    hashtable->hinstrument.nbuckets_original = nbuckets;
> > +    hashtable->hinstrument.nbuckets = nbuckets;
> > +    hashtable->hinstrument.space_peak = entrysize * hashtable->hashtab->size;
> 
> That's not actually an accurate accounting of memory, because for filled
> entries a lot of memory is used to store actual tuples:

Thanks - I think I finally understood this.

I updated some existing tests to show the new output.  I imagine that's a
throwaway commit, and should eventually add new tests for each of these node
types under explain analyze.

I've been testing the various nodes like:

--heapscan:
DROP TABLE t; CREATE TABLE t (i int unique) WITH(autovacuum_enabled=off); INSERT INTO t SELECT
generate_series(1,99999);SET enable_seqscan=off; SET parallel_tuple_cost=0; SET parallel_setup_cost=0; SET
enable_indexonlyscan=off;explain analyze verbose SELECT * FROM t WHERE i BETWEEN 999 and 99999999;
 

--setop:
explain( analyze,verbose) SELECT * FROM generate_series(1,999) EXCEPT (SELECT NULL UNION ALL SELECT * FROM
generate_series(1,99999));
   Buckets: 2048 (originally 256)  Memory Usage: hashtable: 48kB, tuples: 8Kb

--recursive union:
explain analyze verbose WITH RECURSIVE t(n) AS ( SELECT 'foo' UNION SELECT n || ' bar' FROM t WHERE length(n) < 9999)
SELECTn, n IS OF (text) AS is_text FROM t;
 

--subplan
explain analyze verbose SELECT i FROM generate_series(1,999)i WHERE (i,i) NOT IN (SELECT 1,1 UNION ALL SELECT j,j FROM
generate_series(1,99999)j);
   Buckets: 262144 (originally 131072)  Memory Usage: hashtable: 6144kB, tuples: 782Kb
explain analyze verbose select i FROM generate_series(1,999)i WHERE(1,i) NOT in (select i,null::int from t) ;

--Agg:
explain (analyze,verbose) SELECT A,COUNT(1) FROM generate_series(1,99999)a GROUP BY 1;
   Buckets: 262144 (originally 256)  Memory Usage: hashtable: 6144kB, tuples: 782Kb

explain (analyze, verbose) select i FROM generate_series(1,999)i WHERE(1,1) not in (select a,null from (SELECT
generate_series(1,99999)a)x) ;
 

explain analyze verbose select * from (SELECT a FROM generate_series(1,99)a)v left join lateral (select v.a, four, ten,
count(*)from (SELECT b four, 2 ten, b FROM generate_series(1,999)b)x group by cube(four,ten)) s on true order by
v.a,four,ten;

--Grouping sets:
explain analyze verbose   select unique1,
         count(two), count(four), count(ten),
         count(hundred), count(thousand), count(twothousand),
         count(*)
    from tenk1 group by grouping sets (unique1,twothousand,thousand,hundred,ten,four,two);

-- 
Justin

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andrew Dunstan
Дата:
Сообщение: Re: Just for fun: Postgres 20?
Следующее
От: Dave Cramer
Дата:
Сообщение: Re: New messages from Priscilla Ip