Thread: Number of parallel workers chosen by the optimizer for parallel append

Number of parallel workers chosen by the optimizer for parallel append

From
Laurenz Albe
Date:
I have a partitioned table, each partition has "parallel_workers = 10" set.

  SET max_parallel_workers_per_gather = 8;

  SET enable_partitionwise_aggregate = on;

  EXPLAIN (COSTS OFF)
  SELECT applicant_name, count(ipc_4)
  FROM laurenz.z_flat
  GROUP BY applicant_name;

                      QUERY PLAN                    
  --------------------------------------------------
   Gather
     Workers Planned: 4
     ->  Parallel Append
           ->  HashAggregate
                 Group Key: z_flat_3.applicant_name
                 ->  Seq Scan on xyz_4 z_flat_3
           ->  HashAggregate
                 Group Key: z_flat.applicant_name
                 ->  Seq Scan on xyz_1 z_flat
           [8 more such partition scans]
  (33 rows)

How does the optimizer decide to use 4 parallel workers?

No matter what I try, I cannot influence that number.

Yours,
Laurenz Albe




Re: Number of parallel workers chosen by the optimizer for parallel append

From
Michael Lewis
Date:
What have you tried? Changing the relevant cost parameters I assume? Nothing else going on that may be taking up those workers, right?

Re: Number of parallel workers chosen by the optimizer for parallel append

From
Laurenz Albe
Date:
On Wed, 2020-11-25 at 17:36 +0100, Laurenz Albe wrote:
> I have a partitioned table, each partition has "parallel_workers = 10" set.
> 
>   SET max_parallel_workers_per_gather = 8;
> 
>   SET enable_partitionwise_aggregate = on;
> 
>   EXPLAIN (COSTS OFF)
>   SELECT applicant_name, count(ipc_4)
>   FROM laurenz.z_flat
>   GROUP BY applicant_name;
> 
>                       QUERY PLAN                    
>   --------------------------------------------------
>    Gather
>      Workers Planned: 4
>      ->  Parallel Append
>            ->  HashAggregate
>                  Group Key: z_flat_3.applicant_name
>                  ->  Seq Scan on xyz_4 z_flat_3
>            ->  HashAggregate
>                  Group Key: z_flat.applicant_name
>                  ->  Seq Scan on xyz_1 z_flat
>            [8 more such partition scans]
>   (33 rows)
> 
> How does the optimizer decide to use 4 parallel workers?
> 
> No matter what I try, I cannot influence that number.

I figured it out.

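The planner calculates this automatically from the number of partitions;
the number of parallel workers is

  floor(ld(#partitions)) + 1

where "ld" is the base-2 logarithm. The source computes the whole
expression with the function "fls", which returns the 1-based position
of the highest set bit.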
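
The arithmetic can be checked with a small sketch. This is not the PostgreSQL
source, only an illustration of the formula floor(log2(n)) + 1. It relies on
Python's int.bit_length(), which for positive integers gives the position of
the highest set bit, the same value "fls" returns. The helper name
workers_for_partitions is hypothetical, and the sketch ignores any other
limit the planner may apply.

```python
def fls(n: int) -> int:
    """Return the 1-based position of the highest set bit of n (0 if n == 0)."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return n.bit_length()


def workers_for_partitions(partitions: int) -> int:
    # Hypothetical helper: reproduces floor(log2(partitions)) + 1
    # from the message; it models nothing else in the planner.
    return fls(partitions)


if __name__ == "__main__":
    for partitions in (1, 2, 4, 10, 16, 100):
        print(partitions, workers_for_partitions(partitions))
```

For the 10 partitions in the plan above (two shown plus eight more), this
gives 4, which matches "Workers Planned: 4".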

It might be nice to make this configurable, but since we don't have
storage parameters on partitioned tables, I wonder how.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com