Re: Instability of partition_prune regression test results

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: Instability of partition_prune regression test results
Дата
Msg-id 11381.1569599943@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: Instability of partition_prune regression test results  (Amit Langote <amitlangote09@gmail.com>)
Ответы Re: Instability of partition_prune regression test results  (Amit Langote <amitlangote09@gmail.com>)
Список pgsql-hackers
Amit Langote <amitlangote09@gmail.com> writes:
> On Fri, Sep 27, 2019 at 7:25 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I experimented with adjusting explain_parallel_append() to filter
>> more fields, but soon realized that we'd have to filter out basically
>> everything that makes it useful to run EXPLAIN ANALYZE at all.
>> Therefore, I think it's time to give up this testing methodology
>> as a bad idea, and fall back to the time-honored way of running a
>> plain EXPLAIN and then the actual query, as per the attached patch.

> Isn't the point of using ANALYZE here to show that the exec-param
> based run-time pruning is working (those "never executed" strings)?

Hm.  Well, if you want to see those, we could do it as attached.

            regards, tom lane

diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out
index 82b68e7..12d6dfc 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -1990,9 +1990,14 @@ explain (analyze, costs off, summary off, timing off) select * from list_part wh
 rollback;
 drop table list_part;
 -- Parallel append
--- Suppress the number of loops each parallel node runs for.  This is because
--- more than one worker may run the same parallel node if timing conditions
--- are just right, which destabilizes the test.
+-- Parallel queries won't necessarily get as many workers as the planner
+-- asked for.  This affects not only the "Workers Launched:" field of EXPLAIN
+-- results, but also row counts and loop counts for parallel scans, Gathers,
+-- and everything in between.  This function filters out the values we can't
+-- rely on to be stable.
+-- This removes enough info that you might wonder why bother with EXPLAIN
+-- ANALYZE at all.  The answer is that we need to see '(never executed)'
+-- notations because that's the only way to verify runtime pruning.
 create function explain_parallel_append(text) returns setof text
 language plpgsql as
 $$
@@ -2003,9 +2008,8 @@ begin
         execute format('explain (analyze, costs off, summary off, timing off) %s',
             $1)
     loop
-        if ln like '%Parallel%' then
-            ln := regexp_replace(ln, 'loops=\d*',  'loops=N');
-        end if;
+        ln := regexp_replace(ln, 'Workers Launched: \d+', 'Workers Launched: N');
+        ln := regexp_replace(ln, 'actual rows=\d+ loops=\d+', 'actual rows=N loops=N');
         return next ln;
     end loop;
 end;
@@ -2052,18 +2056,18 @@ execute ab_q4 (1, 8);
 select explain_parallel_append('execute ab_q4 (2, 2)');
                             explain_parallel_append
 -------------------------------------------------------------------------------
- Finalize Aggregate (actual rows=1 loops=1)
-   ->  Gather (actual rows=3 loops=1)
+ Finalize Aggregate (actual rows=N loops=N)
+   ->  Gather (actual rows=N loops=N)
          Workers Planned: 2
-         Workers Launched: 2
-         ->  Partial Aggregate (actual rows=1 loops=3)
-               ->  Parallel Append (actual rows=0 loops=N)
+         Workers Launched: N
+         ->  Partial Aggregate (actual rows=N loops=N)
+               ->  Parallel Append (actual rows=N loops=N)
                      Subplans Removed: 6
-                     ->  Parallel Seq Scan on ab_a2_b1 (actual rows=0 loops=N)
+                     ->  Parallel Seq Scan on ab_a2_b1 (actual rows=N loops=N)
                            Filter: ((a >= $1) AND (a <= $2) AND (b < 4))
-                     ->  Parallel Seq Scan on ab_a2_b2 (actual rows=0 loops=N)
+                     ->  Parallel Seq Scan on ab_a2_b2 (actual rows=N loops=N)
                            Filter: ((a >= $1) AND (a <= $2) AND (b < 4))
-                     ->  Parallel Seq Scan on ab_a2_b3 (actual rows=0 loops=N)
+                     ->  Parallel Seq Scan on ab_a2_b3 (actual rows=N loops=N)
                            Filter: ((a >= $1) AND (a <= $2) AND (b < 4))
 (13 rows)

@@ -2105,42 +2109,42 @@ execute ab_q5 (1, 2, 3);
 select explain_parallel_append('execute ab_q5 (1, 1, 1)');
                             explain_parallel_append
 -------------------------------------------------------------------------------
- Finalize Aggregate (actual rows=1 loops=1)
-   ->  Gather (actual rows=3 loops=1)
+ Finalize Aggregate (actual rows=N loops=N)
+   ->  Gather (actual rows=N loops=N)
          Workers Planned: 2
-         Workers Launched: 2
-         ->  Partial Aggregate (actual rows=1 loops=3)
-               ->  Parallel Append (actual rows=0 loops=N)
+         Workers Launched: N
+         ->  Partial Aggregate (actual rows=N loops=N)
+               ->  Parallel Append (actual rows=N loops=N)
                      Subplans Removed: 6
-                     ->  Parallel Seq Scan on ab_a1_b1 (actual rows=0 loops=N)
+                     ->  Parallel Seq Scan on ab_a1_b1 (actual rows=N loops=N)
                            Filter: ((b < 4) AND (a = ANY (ARRAY[$1, $2, $3])))
-                     ->  Parallel Seq Scan on ab_a1_b2 (actual rows=0 loops=N)
+                     ->  Parallel Seq Scan on ab_a1_b2 (actual rows=N loops=N)
                            Filter: ((b < 4) AND (a = ANY (ARRAY[$1, $2, $3])))
-                     ->  Parallel Seq Scan on ab_a1_b3 (actual rows=0 loops=N)
+                     ->  Parallel Seq Scan on ab_a1_b3 (actual rows=N loops=N)
                            Filter: ((b < 4) AND (a = ANY (ARRAY[$1, $2, $3])))
 (13 rows)

 select explain_parallel_append('execute ab_q5 (2, 3, 3)');
                             explain_parallel_append
 -------------------------------------------------------------------------------
- Finalize Aggregate (actual rows=1 loops=1)
-   ->  Gather (actual rows=3 loops=1)
+ Finalize Aggregate (actual rows=N loops=N)
+   ->  Gather (actual rows=N loops=N)
          Workers Planned: 2
-         Workers Launched: 2
-         ->  Partial Aggregate (actual rows=1 loops=3)
-               ->  Parallel Append (actual rows=0 loops=N)
+         Workers Launched: N
+         ->  Partial Aggregate (actual rows=N loops=N)
+               ->  Parallel Append (actual rows=N loops=N)
                      Subplans Removed: 3
-                     ->  Parallel Seq Scan on ab_a2_b1 (actual rows=0 loops=N)
+                     ->  Parallel Seq Scan on ab_a2_b1 (actual rows=N loops=N)
                            Filter: ((b < 4) AND (a = ANY (ARRAY[$1, $2, $3])))
-                     ->  Parallel Seq Scan on ab_a2_b2 (actual rows=0 loops=N)
+                     ->  Parallel Seq Scan on ab_a2_b2 (actual rows=N loops=N)
                            Filter: ((b < 4) AND (a = ANY (ARRAY[$1, $2, $3])))
-                     ->  Parallel Seq Scan on ab_a2_b3 (actual rows=0 loops=N)
+                     ->  Parallel Seq Scan on ab_a2_b3 (actual rows=N loops=N)
                            Filter: ((b < 4) AND (a = ANY (ARRAY[$1, $2, $3])))
-                     ->  Parallel Seq Scan on ab_a3_b1 (actual rows=0 loops=N)
+                     ->  Parallel Seq Scan on ab_a3_b1 (actual rows=N loops=N)
                            Filter: ((b < 4) AND (a = ANY (ARRAY[$1, $2, $3])))
-                     ->  Parallel Seq Scan on ab_a3_b2 (actual rows=0 loops=N)
+                     ->  Parallel Seq Scan on ab_a3_b2 (actual rows=N loops=N)
                            Filter: ((b < 4) AND (a = ANY (ARRAY[$1, $2, $3])))
-                     ->  Parallel Seq Scan on ab_a3_b3 (actual rows=0 loops=N)
+                     ->  Parallel Seq Scan on ab_a3_b3 (actual rows=N loops=N)
                            Filter: ((b < 4) AND (a = ANY (ARRAY[$1, $2, $3])))
 (19 rows)

@@ -2149,12 +2153,12 @@ select explain_parallel_append('execute ab_q5 (2, 3, 3)');
 select explain_parallel_append('execute ab_q5 (33, 44, 55)');
                             explain_parallel_append
 -------------------------------------------------------------------------------
- Finalize Aggregate (actual rows=1 loops=1)
-   ->  Gather (actual rows=3 loops=1)
+ Finalize Aggregate (actual rows=N loops=N)
+   ->  Gather (actual rows=N loops=N)
          Workers Planned: 2
-         Workers Launched: 2
-         ->  Partial Aggregate (actual rows=1 loops=3)
-               ->  Parallel Append (actual rows=0 loops=N)
+         Workers Launched: N
+         ->  Partial Aggregate (actual rows=N loops=N)
+               ->  Parallel Append (actual rows=N loops=N)
                      Subplans Removed: 8
                      ->  Parallel Seq Scan on ab_a1_b1 (never executed)
                            Filter: ((b < 4) AND (a = ANY (ARRAY[$1, $2, $3])))
@@ -2164,21 +2168,21 @@ select explain_parallel_append('execute ab_q5 (33, 44, 55)');
 select explain_parallel_append('select count(*) from ab where (a = (select 1) or a = (select 3)) and b = 2');
                          explain_parallel_append
 -------------------------------------------------------------------------
- Aggregate (actual rows=1 loops=1)
+ Aggregate (actual rows=N loops=N)
    InitPlan 1 (returns $0)
-     ->  Result (actual rows=1 loops=1)
+     ->  Result (actual rows=N loops=N)
    InitPlan 2 (returns $1)
-     ->  Result (actual rows=1 loops=1)
-   ->  Gather (actual rows=0 loops=1)
+     ->  Result (actual rows=N loops=N)
+   ->  Gather (actual rows=N loops=N)
          Workers Planned: 2
          Params Evaluated: $0, $1
-         Workers Launched: 2
-         ->  Parallel Append (actual rows=0 loops=N)
-               ->  Parallel Seq Scan on ab_a1_b2 (actual rows=0 loops=N)
+         Workers Launched: N
+         ->  Parallel Append (actual rows=N loops=N)
+               ->  Parallel Seq Scan on ab_a1_b2 (actual rows=N loops=N)
                      Filter: ((b = 2) AND ((a = $0) OR (a = $1)))
                ->  Parallel Seq Scan on ab_a2_b2 (never executed)
                      Filter: ((b = 2) AND ((a = $0) OR (a = $1)))
-               ->  Parallel Seq Scan on ab_a3_b2 (actual rows=0 loops=N)
+               ->  Parallel Seq Scan on ab_a3_b2 (actual rows=N loops=N)
                      Filter: ((b = 2) AND ((a = $0) OR (a = $1)))
 (16 rows)

@@ -2203,20 +2207,20 @@ set enable_mergejoin = 0;
 select explain_parallel_append('select avg(ab.a) from ab inner join lprt_a a on ab.a = a.a where a.a in(0, 0, 1)');
                                       explain_parallel_append
 ---------------------------------------------------------------------------------------------------
- Finalize Aggregate (actual rows=1 loops=1)
-   ->  Gather (actual rows=2 loops=1)
+ Finalize Aggregate (actual rows=N loops=N)
+   ->  Gather (actual rows=N loops=N)
          Workers Planned: 1
-         Workers Launched: 1
-         ->  Partial Aggregate (actual rows=1 loops=2)
-               ->  Nested Loop (actual rows=0 loops=2)
-                     ->  Parallel Seq Scan on lprt_a a (actual rows=51 loops=N)
+         Workers Launched: N
+         ->  Partial Aggregate (actual rows=N loops=N)
+               ->  Nested Loop (actual rows=N loops=N)
+                     ->  Parallel Seq Scan on lprt_a a (actual rows=N loops=N)
                            Filter: (a = ANY ('{0,0,1}'::integer[]))
-                     ->  Append (actual rows=0 loops=102)
-                           ->  Index Scan using ab_a1_b1_a_idx on ab_a1_b1 (actual rows=0 loops=2)
+                     ->  Append (actual rows=N loops=N)
+                           ->  Index Scan using ab_a1_b1_a_idx on ab_a1_b1 (actual rows=N loops=N)
                                  Index Cond: (a = a.a)
-                           ->  Index Scan using ab_a1_b2_a_idx on ab_a1_b2 (actual rows=0 loops=2)
+                           ->  Index Scan using ab_a1_b2_a_idx on ab_a1_b2 (actual rows=N loops=N)
                                  Index Cond: (a = a.a)
-                           ->  Index Scan using ab_a1_b3_a_idx on ab_a1_b3 (actual rows=0 loops=2)
+                           ->  Index Scan using ab_a1_b3_a_idx on ab_a1_b3 (actual rows=N loops=N)
                                  Index Cond: (a = a.a)
                            ->  Index Scan using ab_a2_b1_a_idx on ab_a2_b1 (never executed)
                                  Index Cond: (a = a.a)
@@ -2237,20 +2241,20 @@ select explain_parallel_append('select avg(ab.a) from ab inner join lprt_a a on
 select explain_parallel_append('select avg(ab.a) from ab inner join lprt_a a on ab.a = a.a + 0 where a.a in(0, 0,
1)');
                                       explain_parallel_append
 ---------------------------------------------------------------------------------------------------
- Finalize Aggregate (actual rows=1 loops=1)
-   ->  Gather (actual rows=2 loops=1)
+ Finalize Aggregate (actual rows=N loops=N)
+   ->  Gather (actual rows=N loops=N)
          Workers Planned: 1
-         Workers Launched: 1
-         ->  Partial Aggregate (actual rows=1 loops=2)
-               ->  Nested Loop (actual rows=0 loops=2)
-                     ->  Parallel Seq Scan on lprt_a a (actual rows=51 loops=N)
+         Workers Launched: N
+         ->  Partial Aggregate (actual rows=N loops=N)
+               ->  Nested Loop (actual rows=N loops=N)
+                     ->  Parallel Seq Scan on lprt_a a (actual rows=N loops=N)
                            Filter: (a = ANY ('{0,0,1}'::integer[]))
-                     ->  Append (actual rows=0 loops=102)
-                           ->  Index Scan using ab_a1_b1_a_idx on ab_a1_b1 (actual rows=0 loops=2)
+                     ->  Append (actual rows=N loops=N)
+                           ->  Index Scan using ab_a1_b1_a_idx on ab_a1_b1 (actual rows=N loops=N)
                                  Index Cond: (a = (a.a + 0))
-                           ->  Index Scan using ab_a1_b2_a_idx on ab_a1_b2 (actual rows=0 loops=2)
+                           ->  Index Scan using ab_a1_b2_a_idx on ab_a1_b2 (actual rows=N loops=N)
                                  Index Cond: (a = (a.a + 0))
-                           ->  Index Scan using ab_a1_b3_a_idx on ab_a1_b3 (actual rows=0 loops=2)
+                           ->  Index Scan using ab_a1_b3_a_idx on ab_a1_b3 (actual rows=N loops=N)
                                  Index Cond: (a = (a.a + 0))
                            ->  Index Scan using ab_a2_b1_a_idx on ab_a2_b1 (never executed)
                                  Index Cond: (a = (a.a + 0))
@@ -2270,20 +2274,20 @@ insert into lprt_a values(3),(3);
 select explain_parallel_append('select avg(ab.a) from ab inner join lprt_a a on ab.a = a.a where a.a in(1, 0, 3)');
                                       explain_parallel_append
 ---------------------------------------------------------------------------------------------------
- Finalize Aggregate (actual rows=1 loops=1)
-   ->  Gather (actual rows=2 loops=1)
+ Finalize Aggregate (actual rows=N loops=N)
+   ->  Gather (actual rows=N loops=N)
          Workers Planned: 1
-         Workers Launched: 1
-         ->  Partial Aggregate (actual rows=1 loops=2)
-               ->  Nested Loop (actual rows=0 loops=2)
-                     ->  Parallel Seq Scan on lprt_a a (actual rows=52 loops=N)
+         Workers Launched: N
+         ->  Partial Aggregate (actual rows=N loops=N)
+               ->  Nested Loop (actual rows=N loops=N)
+                     ->  Parallel Seq Scan on lprt_a a (actual rows=N loops=N)
                            Filter: (a = ANY ('{1,0,3}'::integer[]))
-                     ->  Append (actual rows=0 loops=104)
-                           ->  Index Scan using ab_a1_b1_a_idx on ab_a1_b1 (actual rows=0 loops=2)
+                     ->  Append (actual rows=N loops=N)
+                           ->  Index Scan using ab_a1_b1_a_idx on ab_a1_b1 (actual rows=N loops=N)
                                  Index Cond: (a = a.a)
-                           ->  Index Scan using ab_a1_b2_a_idx on ab_a1_b2 (actual rows=0 loops=2)
+                           ->  Index Scan using ab_a1_b2_a_idx on ab_a1_b2 (actual rows=N loops=N)
                                  Index Cond: (a = a.a)
-                           ->  Index Scan using ab_a1_b3_a_idx on ab_a1_b3 (actual rows=0 loops=2)
+                           ->  Index Scan using ab_a1_b3_a_idx on ab_a1_b3 (actual rows=N loops=N)
                                  Index Cond: (a = a.a)
                            ->  Index Scan using ab_a2_b1_a_idx on ab_a2_b1 (never executed)
                                  Index Cond: (a = a.a)
@@ -2291,32 +2295,32 @@ select explain_parallel_append('select avg(ab.a) from ab inner join lprt_a a on
                                  Index Cond: (a = a.a)
                            ->  Index Scan using ab_a2_b3_a_idx on ab_a2_b3 (never executed)
                                  Index Cond: (a = a.a)
-                           ->  Index Scan using ab_a3_b1_a_idx on ab_a3_b1 (actual rows=0 loops=2)
+                           ->  Index Scan using ab_a3_b1_a_idx on ab_a3_b1 (actual rows=N loops=N)
                                  Index Cond: (a = a.a)
-                           ->  Index Scan using ab_a3_b2_a_idx on ab_a3_b2 (actual rows=0 loops=2)
+                           ->  Index Scan using ab_a3_b2_a_idx on ab_a3_b2 (actual rows=N loops=N)
                                  Index Cond: (a = a.a)
-                           ->  Index Scan using ab_a3_b3_a_idx on ab_a3_b3 (actual rows=0 loops=2)
+                           ->  Index Scan using ab_a3_b3_a_idx on ab_a3_b3 (actual rows=N loops=N)
                                  Index Cond: (a = a.a)
 (27 rows)

 select explain_parallel_append('select avg(ab.a) from ab inner join lprt_a a on ab.a = a.a where a.a in(1, 0, 0)');
                                       explain_parallel_append
 ---------------------------------------------------------------------------------------------------
- Finalize Aggregate (actual rows=1 loops=1)
-   ->  Gather (actual rows=2 loops=1)
+ Finalize Aggregate (actual rows=N loops=N)
+   ->  Gather (actual rows=N loops=N)
          Workers Planned: 1
-         Workers Launched: 1
-         ->  Partial Aggregate (actual rows=1 loops=2)
-               ->  Nested Loop (actual rows=0 loops=2)
-                     ->  Parallel Seq Scan on lprt_a a (actual rows=51 loops=N)
+         Workers Launched: N
+         ->  Partial Aggregate (actual rows=N loops=N)
+               ->  Nested Loop (actual rows=N loops=N)
+                     ->  Parallel Seq Scan on lprt_a a (actual rows=N loops=N)
                            Filter: (a = ANY ('{1,0,0}'::integer[]))
                            Rows Removed by Filter: 1
-                     ->  Append (actual rows=0 loops=102)
-                           ->  Index Scan using ab_a1_b1_a_idx on ab_a1_b1 (actual rows=0 loops=2)
+                     ->  Append (actual rows=N loops=N)
+                           ->  Index Scan using ab_a1_b1_a_idx on ab_a1_b1 (actual rows=N loops=N)
                                  Index Cond: (a = a.a)
-                           ->  Index Scan using ab_a1_b2_a_idx on ab_a1_b2 (actual rows=0 loops=2)
+                           ->  Index Scan using ab_a1_b2_a_idx on ab_a1_b2 (actual rows=N loops=N)
                                  Index Cond: (a = a.a)
-                           ->  Index Scan using ab_a1_b3_a_idx on ab_a1_b3 (actual rows=0 loops=2)
+                           ->  Index Scan using ab_a1_b3_a_idx on ab_a1_b3 (actual rows=N loops=N)
                                  Index Cond: (a = a.a)
                            ->  Index Scan using ab_a2_b1_a_idx on ab_a2_b1 (never executed)
                                  Index Cond: (a = a.a)
@@ -2336,16 +2340,16 @@ delete from lprt_a where a = 1;
 select explain_parallel_append('select avg(ab.a) from ab inner join lprt_a a on ab.a = a.a where a.a in(1, 0, 0)');
                                   explain_parallel_append
 --------------------------------------------------------------------------------------------
- Finalize Aggregate (actual rows=1 loops=1)
-   ->  Gather (actual rows=2 loops=1)
+ Finalize Aggregate (actual rows=N loops=N)
+   ->  Gather (actual rows=N loops=N)
          Workers Planned: 1
-         Workers Launched: 1
-         ->  Partial Aggregate (actual rows=1 loops=2)
-               ->  Nested Loop (actual rows=0 loops=2)
-                     ->  Parallel Seq Scan on lprt_a a (actual rows=50 loops=N)
+         Workers Launched: N
+         ->  Partial Aggregate (actual rows=N loops=N)
+               ->  Nested Loop (actual rows=N loops=N)
+                     ->  Parallel Seq Scan on lprt_a a (actual rows=N loops=N)
                            Filter: (a = ANY ('{1,0,0}'::integer[]))
                            Rows Removed by Filter: 1
-                     ->  Append (actual rows=0 loops=100)
+                     ->  Append (actual rows=N loops=N)
                            ->  Index Scan using ab_a1_b1_a_idx on ab_a1_b1 (never executed)
                                  Index Cond: (a = a.a)
                            ->  Index Scan using ab_a1_b2_a_idx on ab_a1_b2 (never executed)
diff --git a/src/test/regress/sql/partition_prune.sql b/src/test/regress/sql/partition_prune.sql
index 446af3b..8a0e723 100644
--- a/src/test/regress/sql/partition_prune.sql
+++ b/src/test/regress/sql/partition_prune.sql
@@ -451,9 +451,14 @@ drop table list_part;

 -- Parallel append

--- Suppress the number of loops each parallel node runs for.  This is because
--- more than one worker may run the same parallel node if timing conditions
--- are just right, which destabilizes the test.
+-- Parallel queries won't necessarily get as many workers as the planner
+-- asked for.  This affects not only the "Workers Launched:" field of EXPLAIN
+-- results, but also row counts and loop counts for parallel scans, Gathers,
+-- and everything in between.  This function filters out the values we can't
+-- rely on to be stable.
+-- This removes enough info that you might wonder why bother with EXPLAIN
+-- ANALYZE at all.  The answer is that we need to see '(never executed)'
+-- notations because that's the only way to verify runtime pruning.
 create function explain_parallel_append(text) returns setof text
 language plpgsql as
 $$
@@ -464,9 +469,8 @@ begin
         execute format('explain (analyze, costs off, summary off, timing off) %s',
             $1)
     loop
-        if ln like '%Parallel%' then
-            ln := regexp_replace(ln, 'loops=\d*',  'loops=N');
-        end if;
+        ln := regexp_replace(ln, 'Workers Launched: \d+', 'Workers Launched: N');
+        ln := regexp_replace(ln, 'actual rows=\d+ loops=\d+', 'actual rows=N loops=N');
         return next ln;
     end loop;
 end;

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Fujii Masao
Дата:
Сообщение: Re: Standby accepts recovery_target_timeline setting?
Следующее
От: Asif Rehman
Дата:
Сообщение: Re: WIP/PoC for parallel backup