Thread: pg13dev: explain partial, parallel hashagg, and memory use

pg13dev: explain partial, parallel hashagg, and memory use

From:
Justin Pryzby
Date:
I'm testing with a customer's data on pg13dev and got output for which Peak
Memory doesn't look right/useful.  I reproduced it on 565f16902.

CREATE TABLE p(i int) PARTITION BY RANGE(i);
CREATE TABLE p1 PARTITION OF p FOR VALUES FROM (0)TO(1000);
CREATE TABLE p2 PARTITION OF p FOR VALUES FROM (1000)TO(2000);
CREATE TABLE p3 PARTITION OF p FOR VALUES FROM (2000)TO(3000);
INSERT INTO p SELECT i%3000 FROM generate_series(1,999999)i;
VACUUM ANALYZE p;

postgres=# explain(analyze,settings) SELECT i, COUNT(1) FROM p GROUP BY 1;
                                                              QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=7469.00..14214.45 rows=2502 width=12) (actual time=489.409..514.209 rows=3000 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Parallel Append  (cost=6469.00..12964.25 rows=1251 width=12) (actual time=476.291..477.179 rows=1000 loops=3)
         ->  HashAggregate  (cost=6487.99..6497.99 rows=1000 width=12) (actual time=474.454..475.203 rows=1000 loops=1)
               Group Key: p.i
               Batches: 1  Memory Usage: 0kB
               Worker 0:  Batches: 1  Memory Usage: 193kB
               Worker 1:  Batches: 1  Memory Usage: 0kB
               ->  Seq Scan on p1 p  (cost=0.00..4817.99 rows=333999 width=4) (actual time=0.084..100.677 rows=333999 loops=1)
         ->  HashAggregate  (cost=6469.00..6479.00 rows=1000 width=12) (actual time=468.517..469.272 rows=1000 loops=1)
               Group Key: p_1.i
               Batches: 1  Memory Usage: 0kB
               Worker 0:  Batches: 1  Memory Usage: 0kB
               Worker 1:  Batches: 1  Memory Usage: 193kB
               ->  Seq Scan on p2 p_1  (cost=0.00..4804.00 rows=333000 width=4) (actual time=0.082..102.154 rows=333000 loops=1)
         ->  HashAggregate  (cost=6469.00..6479.00 rows=1000 width=12) (actual time=485.887..486.509 rows=1000 loops=1)
               Group Key: p_2.i
               Batches: 1  Memory Usage: 193kB
               Worker 0:  Batches: 1  Memory Usage: 0kB
               Worker 1:  Batches: 1  Memory Usage: 0kB
               ->  Seq Scan on p3 p_2  (cost=0.00..4804.00 rows=333000 width=4) (actual time=0.043..104.631 rows=333000 loops=1)
 Settings: effective_io_concurrency = '0', enable_partitionwise_aggregate = 'on', enable_partitionwise_join = 'on', work_mem = '127MB'

-- 
Justin



Re: pg13dev: explain partial, parallel hashagg, and memory use

From:
David Rowley
Date:
On Wed, 5 Aug 2020 at 13:21, Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> I'm testing with a customer's data on pg13dev and got output for which Peak
> Memory doesn't look right/useful.  I reproduced it on 565f16902.

Likely the sanity of those results depends on whether you think that
the Memory Usage reported outside of the workers is meant to be the
sum of all processes or the memory usage for the leader backend.

All that's going on here is that the Parallel Append is using some
parallel safe paths and giving one to each worker. The 2 workers take
the first 2 subpaths and the leader takes the third.  The memory usage
reported helps confirm that's the case.

Can you explain what you'd want to see changed about this?   Or do you
want to see the non-parallel worker memory be the sum of all workers?
Sort does not seem to do that, so I'm not sure if we should consider
hash agg as an exception to that.

One thing I did notice from playing with this table is that Sort does
not show the memory used by the leader process when it didn't do any
of the work itself.

postgres=# set parallel_leader_participation =off;
SET
postgres=# explain analyze select i from p group by i;

               ->  Sort  (cost=59436.92..60686.92 rows=500000 width=4) (actual time=246.836..280.985 rows=500000 loops=2)
                     Sort Key: p.i
                     Worker 0:  Sort Method: quicksort  Memory: 27898kB
                     Worker 1:  Sort Method: quicksort  Memory: 55842kB

Whereas with the leader helping out we get:

->  Sort  (cost=51284.39..52326.05 rows=416666 width=4) (actual time=191.814..213.418 rows=333333 loops=3)
      Sort Key: p.i
      Sort Method: quicksort  Memory: 33009kB
      Worker 0:  Sort Method: quicksort  Memory: 25287kB
      Worker 1:  Sort Method: quicksort  Memory: 25445kB

Maybe we should do the same for hash agg when the leader didn't assist?
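
If I'm reading show_sort_info() right, the leader's line is only
printed when the leader actually finished a sort, i.e. roughly:

if (es->analyze && sortstate->sort_Done &&
    sortstate->tuplesortstate != NULL)
{
    /* ... print the leader's "Sort Method ... Memory ..." line ... */
}

so hash agg would need an equivalent "did the leader do any work" check.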

David



Re: pg13dev: explain partial, parallel hashagg, and memory use

From:
James Coleman
Date:
On Tue, Aug 4, 2020 at 9:44 PM David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Wed, 5 Aug 2020 at 13:21, Justin Pryzby <pryzby@telsasoft.com> wrote:
> >
> > I'm testing with a customer's data on pg13dev and got output for which Peak
> > Memory doesn't look right/useful.  I reproduced it on 565f16902.
>
> Likely the sanity of those results depends on whether you think that
> the Memory Usage reported outside of the workers is meant to be the
> sum of all processes or the memory usage for the leader backend.
>
> All that's going on here is that the Parallel Append is using some
> parallel safe paths and giving one to each worker. The 2 workers take
> the first 2 subpaths and the leader takes the third.  The memory usage
> reported helps confirm that's the case.
>
> Can you explain what you'd want to see changed about this?   Or do you
> want to see the non-parallel worker memory be the sum of all workers?
> Sort does not seem to do that, so I'm not sure if we should consider
> hash agg as an exception to that.

I've always found the way we report parallel workers in EXPLAIN quite
confusing. I realize it matches the actual implementation model (the
leader often is also "another worker"), but I think the natural
expectation from a user perspective would be that you'd show as
workers all backends (including the leader) that did work, and then
aggregate them into a summary line (where the leader is displayed now).

In the current output there's nothing really to hint to the user that
the model is leader + workers and that the "summary" line is really
the leader. If I were to design this from scratch, I'd want to propose
doing what I said above (summary aggregate line + treat leader as a
worker line, likely with a "leader" tag), but that seems like a big
change to make now. On the other hand, perhaps designating what looks
like a summary line as the "leader" or some such would help clear up
the confusion? Perhaps it could also say "Participating" or
"Non-participating"?

James



Re: pg13dev: explain partial, parallel hashagg, and memory use

From:
Justin Pryzby
Date:
On Wed, Aug 05, 2020 at 01:44:17PM +1200, David Rowley wrote:
> On Wed, 5 Aug 2020 at 13:21, Justin Pryzby <pryzby@telsasoft.com> wrote:
> >
> > I'm testing with a customer's data on pg13dev and got output for which Peak
> > Memory doesn't look right/useful.  I reproduced it on 565f16902.
> 
> Likely the sanity of those results depends on whether you think that
> the Memory Usage reported outside of the workers is meant to be the
> sum of all processes or the memory usage for the leader backend.
> 
> All that's going on here is that the Parallel Append is using some
> parallel safe paths and giving one to each worker. The 2 workers take
> the first 2 subpaths and the leader takes the third.  The memory usage
> reported helps confirm that's the case.

I'm not sure there's a problem, but all the 0kB entries were suspicious to me.

I think you're saying that one worker alone handled each HashAgg, and the other
worker (and leader) show 0kB.  I guess in my naive thinking it's odd to show a
worker which wasn't active for that subpath (at least in text output).  But I
don't know the expected behavior of parallel hashagg, so that explains most of
my confusion. 

On Tue, Aug 04, 2020 at 10:01:18PM -0400, James Coleman wrote:
> Perhaps it could also say "Participating" or "Non-participating"?

Yes, that'd help me a lot :)

Also odd (to me).  If I encourage more workers, there are "slots" for each
"planned" worker, even though fewer were launched:

postgres=# ALTER TABLE p3 SET (parallel_workers=11);
postgres=# SET max_parallel_workers_per_gather=11;
 Finalize HashAggregate  (cost=10299.64..10329.64 rows=3000 width=12) (actual time=297.793..299.933 rows=3000 loops=1)
   Group Key: p.i
   Batches: 1  Memory Usage: 625kB
   ->  Gather  (cost=2928.09..10134.64 rows=33000 width=12) (actual time=233.398..282.429 rows=13000 loops=1)
         Workers Planned: 11
         Workers Launched: 7
         ->  Parallel Append  (cost=1928.09..5834.64 rows=3000 width=12) (actual time=214.358..232.980 rows=1625 loops=8)
               ->  Partial HashAggregate  (cost=1933.46..1943.46 rows=1000 width=12) (actual time=167.936..171.345 rows=1000 loops=4)
                     Group Key: p.i
                     Batches: 1  Memory Usage: 0kB
                     Worker 0:  Batches: 1  Memory Usage: 193kB
                     Worker 1:  Batches: 1  Memory Usage: 193kB
                     Worker 2:  Batches: 1  Memory Usage: 0kB
                     Worker 3:  Batches: 1  Memory Usage: 0kB
                     Worker 4:  Batches: 1  Memory Usage: 193kB
                     Worker 5:  Batches: 1  Memory Usage: 0kB
                     Worker 6:  Batches: 1  Memory Usage: 193kB
                     Worker 7:  Batches: 0  Memory Usage: 0kB
                     Worker 8:  Batches: 0  Memory Usage: 0kB
                     Worker 9:  Batches: 0  Memory Usage: 0kB
                     Worker 10:  Batches: 0  Memory Usage: 0kB


Thanks,
-- 
Justin



Re: pg13dev: explain partial, parallel hashagg, and memory use

From:
David Rowley
Date:
On Wed, 5 Aug 2020 at 14:13, Justin Pryzby <pryzby@telsasoft.com> wrote:
> Also odd (to me).  If I encourage more workers, there are "slots" for each
> "planned" worker, even though fewer were launched:

Searching explain.c for "num_workers; " (including the trailing space)
and looking at each for loop that iterates over the workers, quite a
number of those locations have a condition that skips the worker.

For example, show_sort_info() does

if (sinstrument->sortMethod == SORT_TYPE_STILL_IN_PROGRESS)
    continue; /* ignore any unfilled slots */

So maybe Hash Agg should be doing something similar.  Additionally,
maybe it should not show the leader details if the leader didn't help.
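
For hash agg, the equivalent might be something roughly like this in
the per-worker loop of show_hashagg_info() (untested sketch; assumes a
worker that never built a hash table leaves hash_mem_peak at zero):

sinstrument = &aggstate->shared_info->sinstrument[n];

/* skip workers that did no hash aggregation at all */
if (sinstrument->hash_mem_peak == 0)
    continue;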

David



Re: pg13dev: explain partial, parallel hashagg, and memory use

From:
David Rowley
Date:
On Wed, 5 Aug 2020 at 14:27, David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Wed, 5 Aug 2020 at 14:13, Justin Pryzby <pryzby@telsasoft.com> wrote:
> > Also odd (to me).  If I encourage more workers, there are "slots" for each
> > "planned" worker, even though fewer were launched:
>
> Searching explain.c for "num_workers; " (including the trailing space)
> and looking at each for loop that iterates over the workers, quite a
> number of those locations have a condition that skips the worker.
>
> For example, show_sort_info() does
>
> if (sinstrument->sortMethod == SORT_TYPE_STILL_IN_PROGRESS)
>     continue; /* ignore any unfilled slots */
>
> So maybe Hash Agg should be doing something similar.  Additionally,
> maybe it should not show the leader details if the leader didn't help.

Here's what I had in mind.

The unpatched format got even more broken with EXPLAIN (ANALYZE,
VERBOSE), so this is certainly a bug fix.

David

Attachments

Re: pg13dev: explain partial, parallel hashagg, and memory use

From:
David Rowley
Date:
On Wed, 5 Aug 2020 at 17:25, David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Wed, 5 Aug 2020 at 14:27, David Rowley <dgrowleyml@gmail.com> wrote:
> > So maybe Hash Agg should be doing something similar.  Additionally,
> > maybe it should not show the leader details if the leader didn't help.
>
> Here's what I had in mind.

Just coming back to this. I'd like to push it soon, but it's currently
late here. I'll look at pushing it in my morning in about 8 hours
time.

If anyone has any comments please let me know before then.

David



Re: pg13dev: explain partial, parallel hashagg, and memory use

From:
David Rowley
Date:
On Fri, 7 Aug 2020 at 00:44, David Rowley <dgrowleyml@gmail.com> wrote:
> Just coming back to this. I'd like to push it soon, but it's currently
> late here. I'll look at pushing it in my morning in about 8 hours
> time.

Pushed.

David