Re: parallel joins, and better parallel explain

From Amit Kapila
Subject Re: parallel joins, and better parallel explain
Date
Msg-id CAA4eK1LDnmGbjrkijCCR-w1x20MNGwZ2WyQviK-XSRDuPonceQ@mail.gmail.com
In response to Re: parallel joins, and better parallel explain  (Dilip Kumar <dilipbalaut@gmail.com>)
Responses Re: parallel joins, and better parallel explain  (Dilip Kumar <dilipbalaut@gmail.com>)
Re: parallel joins, and better parallel explain  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Wed, Dec 16, 2015 at 9:55 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Wed, Dec 16, 2015 at 6:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

>On Tue, Dec 15, 2015 at 7:31 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>
>> On Mon, Dec 14, 2015 at 8:38 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

> In any case,
>I have done some more investigation of the patch and found that even
>without changing query planner related parameters, it seems to give
>bad plans (as in example below [1]).  I think here the costing of rework each

I have done some more testing using the TPC-H benchmark (for some of the queries, especially for Parallel Hash Join), and the results summary is below.


Planning Time (ms)
Query      Base    Patch
TPC-H Q2   2.2     2.4
TPC-H Q3   0.67    0.71
TPC-H Q5   3.17    2.3
TPC-H Q7   2.43    2.4




Execution Time (ms)
Query      Base     Patch
TPC-H Q2   2826     766
TPC-H Q3   23473    24271
TPC-H Q5   21357    1432
TPC-H Q7   6779     1138

All test files and the detailed plan output are attached to this mail.
q2.sql, q3.sql, q5.sql and q7.sql are the 2nd, 3rd, 5th and 7th queries of the TPC-H benchmark,
and the results with base and with parallel join are attached in q*_base.out and q*_parallel.out respectively.

Summary: With TPC-H queries, wherever a Hash Join is pushed under a Gather node, a significant improvement is visible;
with Q2, using 3 workers, the time consumed is almost 1/3 of the base.
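
As a rough check of the summary, the ratios can be worked out from the execution-time figures reported above. This is only a sketch: it assumes the run-together digits in the results split as base/patch = 2826/766 (Q2), 23473/24271 (Q3), 21357/1432 (Q5) and 6779/1138 (Q7), and the function name `speedup_report` is made up for illustration.

```python
# Execution times in ms as (base, patch).
# Assumption: this is how the run-together digits in the results split.
execution_ms = {
    "Q2": (2826, 766),
    "Q3": (23473, 24271),
    "Q5": (21357, 1432),
    "Q7": (6779, 1138),
}


def speedup_report(times):
    """Return {query: (patch/base ratio, base/patch speedup)}."""
    report = {}
    for query, (base, patch) in times.items():
        report[query] = (patch / base, base / patch)
    return report


if __name__ == "__main__":
    for query, (ratio, speedup) in speedup_report(execution_ms).items():
        print(f"{query}: patch takes {ratio:.2f} of base time ({speedup:.1f}x)")
```

Under that reading, Q2 with the patch takes a little over a quarter of the base time, Q5 and Q7 improve by more, and Q3 is slightly slower than base.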


I observed one problem: with Q5 and Q7, some relation and snapshot references are leaked and I am getting the warning below. I haven't yet looked into the issue.


While looking at the plans of Q5 and Q7, I observed that a Gather node is
pushed below another Gather node, which we don't have an appropriate
way of handling.  I think that could be the reason why you are seeing
the errors.

Also, I think it would be good if you could check the plan/execution
time with max_parallel_degree=0, as that can give us base reference
data without parallelism.  I am also wondering whether you have changed
any other parallel-cost-related parameters?
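
One way to collect that baseline is to generate a small psql script per query. The following is a minimal sketch, not the procedure actually used in this thread: the helper name `baseline_script` is hypothetical, the two cost parameters listed (`parallel_setup_cost`, `parallel_tuple_cost`) are an assumption about which settings are worth reporting, and it assumes each q*.sql file already runs its query under EXPLAIN ANALYZE.

```python
def baseline_script(query_file,
                    cost_gucs=("parallel_setup_cost", "parallel_tuple_cost")):
    """Build psql input that disables parallelism and reports cost settings.

    Assumptions: the GUC names in cost_gucs are illustrative, and
    query_file contains the EXPLAIN ANALYZE statement to run.
    """
    lines = ["SET max_parallel_degree = 0;"]       # no parallel workers
    lines += [f"SHOW {guc};" for guc in cost_gucs]  # record current cost settings
    lines.append(f"\\i {query_file}")               # psql meta-command: run the file
    return "\n".join(lines)


if __name__ == "__main__":
    print(baseline_script("q2.sql"))
```

The printed text can be fed to psql for each of the test queries, so the non-parallel timings and the cost settings in effect end up in the same output file.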


With Regards,
Amit Kapila.
