Re: EXPLAIN: Non-parallel ancestor plan nodes exclude parallel worker instrumentation

From Amit Kapila
Subject Re: EXPLAIN: Non-parallel ancestor plan nodes exclude parallel worker instrumentation
Date
Msg-id CAA4eK1J7129a5rDeZx3iBr5rv1YM468yT9Wqvu65z5L8wdT6OA@mail.gmail.com
In reply to EXPLAIN: Non-parallel ancestor plan nodes exclude parallel worker instrumentation  (Maciek Sakrejda <m.sakrejda@gmail.com>)
Responses Re: EXPLAIN: Non-parallel ancestor plan nodes exclude parallel worker instrumentation  (Maciek Sakrejda <m.sakrejda@gmail.com>)
List pgsql-hackers
On Tue, Jun 23, 2020 at 12:55 AM Maciek Sakrejda <m.sakrejda@gmail.com> wrote:
>
> Hello,
>
> I had some questions about the behavior of some accounting in parallel
> EXPLAIN plans. Take the following plan:
>
> ```
> Gather  (cost=1000.43..750173.74 rows=2 width=235) (actual time=1665.122..1665.122 rows=0 loops=1)
>   Workers Planned: 2
>   Workers Launched: 2
>   Buffers: shared hit=27683 read=239573
>   I/O Timings: read=687.358
>   ->  Nested Loop  (cost=0.43..749173.54 rows=1 width=235) (actual time=1660.095..1660.095 rows=0 loops=3)
>         Inner Unique: true
>         Buffers: shared hit=77579 read=657847
>         I/O Timings: read=2090.189
..
> ```
>
> The Nested Loop here aggregates data for metrics like `buffers read`
> from its workers, and to calculate a metric like `buffers read` for
> the parallel leader, we can subtract the values recorded in each
> individual worker. This happens in the Seq Scan and Index Scan
> children, as well. However, the Gather node appears to only include
> values from its direct parallel leader child (excluding that child's
> workers).
>
> This leads to the odd situation that the Gather has lower values for
> some of these metrics than its child (because the child node reporting
> includes the worker metrics) even though the values are supposed to be
> cumulative.
>

I don't think this is an odd situation, because in this case child
nodes like "Nested Loop" and "Parallel Seq Scan" have a 'loops' value
of 3.  So, to get the correct stats at those nodes, you need to divide
by 3, whereas at the Gather node the value of 'loops' is 1.  If you
want to verify this, try a plan where 'loops' is 1 for the child nodes
as well; you should get the same value at both the Gather and Parallel
Seq Scan nodes.
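
As a rough sketch of that division, using only the buffer counters from
the plan quoted above (the `per_loop` helper and the dictionary layout
are hypothetical, and this assumes the counters shown at a node are
totals over all of its loops):

```python
# Buffer counters copied from the quoted plan.
gather = {"loops": 1, "shared_hit": 27683, "shared_read": 239573}
nested_loop = {"loops": 3, "shared_hit": 77579, "shared_read": 657847}


def per_loop(node, key):
    """Divide a node's counter by its 'loops' value (hypothetical helper)."""
    return node[key] / node["loops"]


for key in ("shared_hit", "shared_read"):
    # Print the per-loop figure at each node side by side.
    print(key, per_loop(gather, key), round(per_loop(nested_loop, key), 2))
```

The per-loop figure at the Nested Loop is an average over its three
loops, so it need not match the Gather total exactly.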

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


