Re: Parallel Foreign Scans - need advice

From: Thomas Munro
Subject: Re: Parallel Foreign Scans - need advice
Date:
Msg-id: CA+hUKGKiAFMm8rp7oy0OUj8q9BY9AXwEuB8y=s885+ekfXJEMg@mail.gmail.com
In reply to: Re: Parallel Foreign Scans - need advice  (Korry Douglas <korry@me.com>)
List: pgsql-hackers
On Fri, May 17, 2019 at 12:45 AM Korry Douglas <korry@me.com> wrote:
> It sounds like there is no reliable way to get the information that I’m looking for, is that right?

Correct.  And if there were, it could only be used to write bugs.  Let
me see if I can demonstrate...  I'll use the file_fdw patch from the
link I gave before, and I'll add an elog(LOG) message to show when
fileIterateForeignScan() runs.
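(For reference, the instrumentation is nothing more than an elog()
call at the top of the iterate callback; a minimal sketch, since the
exact body of the patched, parallel-aware function isn't reproduced
here:)

static TupleTableSlot *
fileIterateForeignScan(ForeignScanState *node)
{
    /*
     * Added for this demonstration only: the backend PID in the log
     * line prefix tells us which process actually executes the scan.
     */
    elog(LOG, "fileIterateForeignScan");

    /* ... unchanged file_fdw tuple-fetching code follows ... */
}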

$ echo 1 > /tmp/numbers

postgres=# create table t1 as select generate_series(1, 1000000)::int i;
SELECT 1000000
postgres=# create server files foreign data wrapper file_fdw;
CREATE SERVER
postgres=# create foreign table t2 (n int) server files
  options (filename '/tmp/numbers', format 'csv');
CREATE FOREIGN TABLE

The relevant EXPLAIN output is harder to understand if the parallel
leader participates, but it changes nothing important, so I'll turn
that off first, and then see how it is run:

postgres=# set parallel_leader_participation = off;
SET
postgres=# explain (analyze, verbose) select count(*) from (select *
from t1 union all select * from t2) ss;

                                                                   QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=14176.32..14176.33 rows=1 width=8) (actual time=234.023..234.023 rows=1 loops=1)
   Output: count(*)
   ->  Gather  (cost=14176.10..14176.31 rows=2 width=8) (actual time=233.840..235.079 rows=2 loops=1)
         Output: (PARTIAL count(*))
         Workers Planned: 2
         Workers Launched: 2
         ->  Partial Aggregate  (cost=13176.10..13176.11 rows=1 width=8) (actual time=223.550..223.555 rows=1 loops=2)
               Output: PARTIAL count(*)
               Worker 0: actual time=223.432..223.443 rows=1 loops=1
               Worker 1: actual time=223.667..223.668 rows=1 loops=1
               ->  Parallel Append  (cost=0.00..11926.10 rows=500000 width=0) (actual time=0.087..166.669 rows=500000 loops=2)
                     Worker 0: actual time=0.083..166.366 rows=499687 loops=1
                     Worker 1: actual time=0.092..166.972 rows=500314 loops=1
                     ->  Parallel Seq Scan on public.t1  (cost=0.00..9425.00 rows=500000 width=0) (actual time=0.106..103.384 rows=500000 loops=2)
                           Worker 0: actual time=0.123..103.106 rows=499686 loops=1
                           Worker 1: actual time=0.089..103.662 rows=500314 loops=1
                     ->  Parallel Foreign Scan on public.t2  (cost=0.00..1.10 rows=1 width=0) (actual time=0.079..0.096 rows=1 loops=1)
                           Foreign File: /tmp/numbers
                           Foreign File Size: 2 b
                           Worker 0: actual time=0.079..0.096 rows=1 loops=1
 Planning Time: 0.219 ms
 Execution Time: 235.262 ms
(22 rows)

You can see that the Parallel Foreign Scan was actually run by only
one worker.  So if you were somehow expecting both of them to show up
in order to produce the correct results, you have a bug.  The reason
that happened is that Parallel Append sent one worker to chew on t1
and another to chew on t2, but the scan of t2 finished very quickly,
so that worker then went to help out with t1.  For further proof,
here's what I see in my server log (note that fileIterateForeignScan
is only ever called twice, and in the same process):

2019-05-17 10:51:42.248 NZST [52158] LOG:  fileIterateForeignScan
2019-05-17 10:51:42.248 NZST [52158] STATEMENT:  explain analyze select count(*) from (select * from t1 union all select * from t2) ss;
2019-05-17 10:51:42.249 NZST [52158] LOG:  fileIterateForeignScan
2019-05-17 10:51:42.249 NZST [52158] STATEMENT:  explain analyze select count(*) from (select * from t1 union all select * from t2) ss;

Therefore you can't allocate the work up front based on the expected
number of workers, even if that appears to work in simple examples.
Your node isn't necessarily the only node in the plan, and higher-up
nodes get to decide when, if at all, your node runs in each worker.
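
(For completeness, one robust pattern is to treat the set of
participants as unknown: put nothing but shared claim state into the
DSM segment via the parallel-scan FDW callbacks, and let whichever
backends actually execute the scan grab work on demand.  Here's a
rough sketch; the "my*" names, the struct layout and the chunk
granularity are all hypothetical, not taken from file_fdw or any
real FDW:)

#include "postgres.h"

#include "access/parallel.h"
#include "foreign/fdwapi.h"
#include "port/atomics.h"

/* Shared scan state: nothing but a claim counter. */
typedef struct MySharedScanState
{
    pg_atomic_uint64 next_chunk;    /* next unclaimed chunk of remote data */
    uint64      total_chunks;       /* fixed before any participant starts */
} MySharedScanState;

/* Per-backend state, set up in BeginForeignScan (not shown). */
typedef struct MyScanPrivate
{
    MySharedScanState *shared;      /* NULL in a non-parallel scan */
    /* ... per-backend connection/cursor state would live here ... */
} MyScanPrivate;

static Size
myEstimateDSMForeignScan(ForeignScanState *node, ParallelContext *pcxt)
{
    return sizeof(MySharedScanState);
}

static void
myInitializeDSMForeignScan(ForeignScanState *node, ParallelContext *pcxt,
                           void *coordinate)
{
    MySharedScanState *shared = (MySharedScanState *) coordinate;
    MyScanPrivate *priv = (MyScanPrivate *) node->fdw_state;

    pg_atomic_init_u64(&shared->next_chunk, 0);
    shared->total_chunks = 42;      /* however the remote data is divided */
    priv->shared = shared;          /* the leader may or may not participate */
}

static void
myInitializeWorkerForeignScan(ForeignScanState *node, shm_toc *toc,
                              void *coordinate)
{
    MyScanPrivate *priv = (MyScanPrivate *) node->fdw_state;

    /*
     * Workers just attach to the same shared state; nothing is
     * pre-partitioned by worker number.
     */
    priv->shared = (MySharedScanState *) coordinate;
}

/*
 * Called from the iterate callback whenever the current chunk is
 * exhausted; returns false at end of scan.  Any number of actual
 * participants (including just one, as in the example above)
 * produces the same total result.
 */
static bool
myClaimNextChunk(MyScanPrivate *priv, uint64 *chunk)
{
    *chunk = pg_atomic_fetch_add_u64(&priv->shared->next_chunk, 1);
    return *chunk < priv->shared->total_chunks;
}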

--
Thomas Munro
https://enterprisedb.com


