Re: Parallel Seq Scan
From | Thom Brown |
---|---|
Subject | Re: Parallel Seq Scan |
Date | |
Msg-id | CAA-aLv6JMAsDOg7R6DzvcWgLCSukGK_Ap4gRfiC+1NgWaqHAVw@mail.gmail.com |
In response to | Re: Parallel Seq Scan (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
Re: Parallel Seq Scan (Thom Brown <thom@linux.com>)
Re: Parallel Seq Scan (Amit Kapila <amit.kapila16@gmail.com>) |
List | pgsql-hackers |
On 25 March 2015 at 10:27, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Mar 20, 2015 at 5:36 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
>
> So the patches have to be applied in below sequence:
> HEAD Commit-id : 8d1f2390
> parallel-mode-v8.1.patch [2]
> assess-parallel-safety-v4.patch [1]
> parallel-heap-scan.patch [3]
> parallel_seqscan_v11.patch (Attached with this mail)
>
> The reason for not using the latest commit in HEAD is that latest
> version of assess-parallel-safety patch was not getting applied,
> so I generated the patch at commit-id where I could apply that
> patch successfully.
>
> [1] - http://www.postgresql.org/message-id/CA+TgmobJSuefiPOk6+i9WERUgeAB3ggJv7JxLX+r6S5SYydBRQ@mail.gmail.com
> [2] - http://www.postgresql.org/message-id/CA+TgmoZJjzYnpXChL3gr7NwRUzkAzPMPVKAtDt5sHvC5Cd7RKw@mail.gmail.com
> [3] - http://www.postgresql.org/message-id/CA+TgmoYJETgeAXUsZROnA7BdtWzPtqExPJNTV1GKcaVMgSdhug@mail.gmail.com
>
Fixed the reported issue on assess-parallel-safety thread and another
bug caught while testing joins and integrated with latest version of
parallel-mode patch (parallel-mode-v9 patch).

Apart from that I have moved the Initialization of dsm segment from
InitNode phase to ExecFunnel() (on first execution) as per suggestion
from Robert. The main idea is that as it creates large shared memory
segment, so do the work when it is really required.

HEAD Commit-Id: 11226e38
parallel-mode-v9.patch [2]
assess-parallel-safety-v4.patch [1]
parallel-heap-scan.patch [3]
parallel_seqscan_v12.patch (Attached with this mail)

[1] - http://www.postgresql.org/message-id/CA+TgmobJSuefiPOk6+i9WERUgeAB3ggJv7JxLX+r6S5SYydBRQ@mail.gmail.com
[2] - http://www.postgresql.org/message-id/CA+TgmoZfSXZhS6qy4Z0786D7iU_AbhBVPQFwLthpSvGieczqHg@mail.gmail.com
[3] - http://www.postgresql.org/message-id/CA+TgmoYJETgeAXUsZROnA7BdtWzPtqExPJNTV1GKcaVMgSdhug@mail.gmail.com
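
The ordering quoted above could be scripted roughly as follows. This is only a sketch: using `git checkout` and `git apply` is an assumption (the thread does not say how the patches were applied), the `DRY_RUN` switch and the `run` and `apply_patches` helpers are hypothetical names for this example, and the patch files are assumed to sit in the top directory of a PostgreSQL git checkout.

```shell
#!/bin/sh
# Sketch: apply the patch series in the order listed above.
# Assumption: run from the top of a PostgreSQL git checkout with the
# patch files in the current directory. DRY_RUN, run and apply_patches
# are hypothetical names used only in this example.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"          # dry run: only print the command
    else
        "$@" || exit 1     # real run: stop at the first failure
    fi
}

apply_patches() {
    run git checkout 11226e38
    for p in parallel-mode-v9.patch \
             assess-parallel-safety-v4.patch \
             parallel-heap-scan.patch \
             parallel_seqscan_v12.patch
    do
        run git apply "$p"
    done
}

apply_patches
```

By default the script only prints the commands; setting `DRY_RUN=0` in the environment would execute them.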
Okay, with my pgbench_accounts partitioned into 300, I ran:
SELECT DISTINCT bid FROM pgbench_accounts;
The query never returns, and I also get this:
grep -r 'starting background worker process "parallel worker for PID 12165"' postgresql-2015-03-25_112522.log | wc -l
2496
2,496 workers? This is with parallel_seqscan_degree set to 8. If I set it to 2, this number goes down to 626, and with 16 it goes up to 4,320.
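
One possible reading of those counts, not confirmed anywhere in this thread: if every Funnel node starts its own full set of workers, dividing each count by the configured degree gives roughly the number of Funnel nodes that were started. A small sketch of that arithmetic, where `launches` is a hypothetical helper name:

```shell
#!/bin/sh
# launches COUNT DEGREE: integer number of full worker sets started,
# assuming each set contains exactly DEGREE workers.
launches() {
    echo $(( $1 / $2 ))
}

launches 2496 8     # degree 8  -> 312
launches 626 2      # degree 2  -> 313
launches 4320 16    # degree 16 -> 270
```

The first two results are close to the 300 partitions mentioned above, which would fit one set of workers per Funnel node. The degree 16 figure is lower, and nothing in the numbers themselves explains why.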
Here's the query plan:
                                               QUERY PLAN
---------------------------------------------------------------------------------------------------------
 HashAggregate (cost=38856527.50..38856529.50 rows=200 width=4)
   Group Key: pgbench_accounts.bid
   -> Append (cost=0.00..38806370.00 rows=20063001 width=4)
        -> Seq Scan on pgbench_accounts (cost=0.00..0.00 rows=1 width=4)
        -> Funnel on pgbench_accounts_1 (cost=0.00..192333.33 rows=100000 width=4)
             Number of Workers: 8
             -> Partial Seq Scan on pgbench_accounts_1 (cost=0.00..1641000.00 rows=100000 width=4)
        -> Funnel on pgbench_accounts_2 (cost=0.00..192333.33 rows=100000 width=4)
             Number of Workers: 8
             -> Partial Seq Scan on pgbench_accounts_2 (cost=0.00..1641000.00 rows=100000 width=4)
        -> Funnel on pgbench_accounts_3 (cost=0.00..192333.33 rows=100000 width=4)
             Number of Workers: 8
        ...
             -> Partial Seq Scan on pgbench_accounts_498 (cost=0.00..10002.10 rows=210 width=4)
        -> Funnel on pgbench_accounts_499 (cost=0.00..1132.34 rows=210 width=4)
             Number of Workers: 8
             -> Partial Seq Scan on pgbench_accounts_499 (cost=0.00..10002.10 rows=210 width=4)
        -> Funnel on pgbench_accounts_500 (cost=0.00..1132.34 rows=210 width=4)
             Number of Workers: 8
             -> Partial Seq Scan on pgbench_accounts_500 (cost=0.00..10002.10 rows=210 width=4)
Still not sure why 8 workers are needed for each partial scan. I would expect 8 workers to be used for 8 separate scans. Perhaps this is just my misunderstanding of how this feature works.
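
To put a number on what a plan like this asks for, the `Number of Workers` lines in the EXPLAIN output can be summed. A minimal sketch, assuming the plan text is piped in on standard input; `total_workers` is a hypothetical helper name, and the sample input is trimmed from the first three Funnel nodes of the plan above.

```shell
#!/bin/sh
# total_workers: read plan text on stdin and print the sum of all
# "Number of Workers: N" values (prints 0 if there are none).
total_workers() {
    awk '/Number of Workers:/ { sum += $NF } END { print sum + 0 }'
}

# Sample: the first three Funnel nodes from the plan above.
total_workers <<'EOF'
-> Funnel on pgbench_accounts_1 (cost=0.00..192333.33 rows=100000 width=4)
Number of Workers: 8
-> Funnel on pgbench_accounts_2 (cost=0.00..192333.33 rows=100000 width=4)
Number of Workers: 8
-> Funnel on pgbench_accounts_3 (cost=0.00..192333.33 rows=100000 width=4)
Number of Workers: 8
EOF
```

For the three-node sample this prints 24, and the total grows with every additional Funnel node. Under the expectation stated above, the total for the whole query would instead stay at 8 no matter how many Funnel nodes the plan contains.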
--
Thom