Re: [PoC] Asynchronous execution again (which is not parallel)

From Kyotaro HORIGUCHI
Subject Re: [PoC] Asynchronous execution again (which is not parallel)
Date
Msg-id 20151202.111522.61917802.horiguchi.kyotaro@lab.ntt.co.jp
In response to Re: [PoC] Asynchronous execution again (which is not parallel)  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: [PoC] Asynchronous execution again (which is not parallel)  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
Thank you for picking this up.

At Tue, 1 Dec 2015 20:33:02 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
<CAA4eK1LBwj7heY8pxRmMCOLhuMFr81TLHck-+ByBFuUADgeu+A@mail.gmail.com>
> On Mon, Nov 30, 2015 at 6:17 PM, Kyotaro HORIGUCHI <
> horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> > ====== TODO or random thoughts, not restricted on this patch.
> >
> > - This patch doesn't contain the planner part; the planner must be
> >   aware of async execution for this to be effective.
> >
> 
> How will you decide whether sync-execution is cheaper than parallel
> execution?  Do you have some specific cases in mind where async
> execution will be more useful than parallel execution?

Mmm.. Some confusion in wording? Sync vs. async is a distinction
about when to start execution of a node (and its
descendants). Parallel vs. serial (sequential) is about whether
multiple nodes can execute simultaneously. Async execution
presupposes parallel execution in some form, whether bgworker or FDW.

As I wrote in the previous mail, async execution reduces the startup
time of parallel execution. So async execution is not more useful
than parallel execution; rather, it accelerates parallel
execution. It is effective when the startup time of every parallel
execution node is rather long. We have enough numbers to estimate its cost.
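
A toy model of the startup-time argument above, as a sketch only. It assumes that with sync execution each node begins its startup only when the executor first reaches it, so the delays add up, while with async execution all nodes are kicked off up front and their startups overlap. The function names and numbers are hypothetical and are not taken from the patch or from the PostgreSQL planner.

```python
# Toy cost model: illustrative assumptions only, not PostgreSQL code.

def sync_startup_wait(startup_times):
    """Sync: each node starts when first asked for a tuple,
    so the startup delays are paid one after another."""
    return sum(startup_times)


def async_startup_wait(startup_times):
    """Async: every node is started up front, so the delays overlap
    and the wait is bounded by the slowest node."""
    return max(startup_times, default=0)


# Hypothetical startup times for three parallel execution nodes.
startup_times = [100, 120, 90]

sync_wait = sync_startup_wait(startup_times)
async_wait = async_startup_wait(startup_times)
saved = sync_wait - async_wait
```

Under these assumptions the saving grows with the per-node startup time, which matches the remark that async execution pays off when startup is long.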

> > - Some measure to control execution on bgworker would be
> >   needed. At least merge join requires position mark/reset
> >   functions.
> >
> > - Currently, more tuples reduce the effectiveness of parallel
> >   execution; some method to transfer tuples in larger units would
> >   be needed, or would it be good to have shared workmem?
> >
> 
> Yeah, I think here one thing we need to figure out is whether the
> performance bottleneck is due to the amount of data that is transferred
> between worker and master or something else. One idea could be to pass
> TID and maybe keep the buffer pin (which will be released by master
> backend), but on the other hand if we have to perform costly target list
> evaluation by bgworker, then it might be beneficial to pass the projected
> list back.

One possible bottleneck is signalling between backends. Current
parallel execution uses signals to make the producer-consumer world go
round. Conveying TIDs won't make it faster if the bottleneck is
the inter-process communication. I brought up bulk transfer
or shared workmem as example means of reducing IPC frequency.
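
A minimal sketch of why transferring tuples in larger units cuts IPC frequency. It assumes one producer-to-consumer notification per handoff, whatever the handoff carries. The helper names are hypothetical, and a plain in-process loop stands in for the real signalling between backends.

```python
# Illustrative only: an in-process loop stands in for backend signalling.

def make_batches(tuples, batch_size):
    """Split the tuple stream into chunks of at most batch_size."""
    for start in range(0, len(tuples), batch_size):
        yield tuples[start:start + batch_size]


def transfer(tuples, batch_size):
    """Hand tuples from producer to consumer in batches.

    Returns (handoffs, received); each handoff models one notification.
    """
    received = []
    handoffs = 0
    for batch in make_batches(tuples, batch_size):
        handoffs += 1           # one notification per batch, not per tuple
        received.extend(batch)
    return handoffs, received


rows = list(range(1000))
per_tuple_handoffs, _ = transfer(rows, 1)
batched_handoffs, batched_rows = transfer(rows, 100)
```

The consumer sees the same tuples in the same order either way; only the number of notifications changes, which is the quantity that matters if signalling is the bottleneck.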

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




