Re: explain HashAggregate to report bucket and memory stats

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: explain HashAggregate to report bucket and memory stats
Дата
Msg-id 20200309173355.bdnj33ffm7xuwquy@alap3.anarazel.de
обсуждение исходный текст
Ответ на Re: explain HashAggregate to report bucket and memory stats  (Justin Pryzby <pryzby@telsasoft.com>)
Список pgsql-hackers
Hi,

On 2020-03-06 15:33:10 -0600, Justin Pryzby wrote:
> On Fri, Mar 06, 2020 at 09:58:59AM -0800, Andres Freund wrote:
> > > +            ExplainIndentText(es);
> > > +            appendStringInfo(es->str,
> > > +                        "Buckets: %ld (originally %ld)",
> > > +                        inst->nbuckets,
> > > +                        inst->nbuckets_original);
> > 
> > I'm not sure I like the alternative output formats here. All the other
> > fields are separated with a comma, but the original size is in
> > parens. I'd probably just format it as "Buckets: %lld " and then add
> > ", Original Buckets: %lld" when differing.
> 
> It's done that way for consistency with hashJoin in show_hash_info().

Fair. I don't like it, but it's not this patch's fault.


> > > diff --git a/src/test/regress/expected/aggregates.out b/src/test/regress/expected/aggregates.out
> > > index f457b5b150..b173b32cab 100644
> > > --- a/src/test/regress/expected/aggregates.out
> > > +++ b/src/test/regress/expected/aggregates.out
> > > @@ -517,10 +517,11 @@ order by 1, 2;
> > >           ->  HashAggregate
> > >                 Output: s2.s2, sum((s1.s1 + s2.s2))
> > >                 Group Key: s2.s2
> > > +               Buckets: 4
> > >                 ->  Function Scan on pg_catalog.generate_series s2
> > >                       Output: s2.s2
> > >                       Function Call: generate_series(1, 3)
> > > -(14 rows)
> > > +(15 rows)
> > 
> > These tests probably won't be portable. The number of hash buckets
> > calculated will e.g. depend onthe size of the contained elements. And
> > that'll e.g. will depend on whether pointers are 4 or 8 bytes.
> 
> I was aware and afraid of that.  Previously, I added this output only to
> "explain analyze", and (as an quick, interim implementation) changed various
> tests to use analyze, and memory only shown in "verbose" mode.  But as Tomas
> pointed out, that's consistent with what's done elsewhere.
> 
> So is the solution to show stats only during explain ANALYZE ?
> 
> Or ... I have a patch to create a new explain(MACHINE) option to allow more
> stable output, by avoiding Memory/Disk.  That doesn't attempt to make all
> "explain analyze" output stable - there's other issues, I think mostly related
> to parallel workers (see 4ea03f3f, 13e8b2ee).  But does allow retiring
> explain_sq_limit and explain_parallel_sort_stats.  I'm including my patch to
> show what I mean, but I didn't enable it for hashtable "Buckets:".  I guess in
> either case, the tests shouldn't be included.

Yea, there's been recent discussion about an argument like that. See
e.g.
https://www.postgresql.org/message-id/18494.1580079189%40sss.pgh.pa.us

Greetings,

Andres Freund



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Robert Haas
Дата:
Сообщение: Re: DROP and ddl_command_end.
Следующее
От: Andres Freund
Дата:
Сообщение: Re: Fastpath while arranging the changes in LSN order in logicaldecoding