Re: Parallel Seq Scan

From: Amit Kapila
Subject: Re: Parallel Seq Scan
Date:
Msg-id: CAA4eK1Lx+-SsWzo4hxhjB_n9rrVnt7YXNwWmP+DApJeJcot3vA@mail.gmail.com
In reply to: Re: Parallel Seq Scan  (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: Parallel Seq Scan  (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers
On Wed, May 6, 2015 at 7:10 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Wed, May 6, 2015 at 7:55 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >>> - I believe the separation of concerns between ExecFunnel() and
> >>> ExecEndFunnel() is not quite right.  If the scan is shut down before
> >>> it runs to completion (e.g. because of LIMIT), then I think we'll call
> >>> ExecEndFunnel() before ExecFunnel() hits the TupIsNull(slot) path.  I
> >>> think you probably need to create a static subroutine that is called
> >>> both as soon as TupIsNull(slot) and also from ExecEndFunnel(), in each
> >>> case cleaning up whatever resources remain.
> >
> >> Right, will fix as per suggestion.
> >
> > I observed one issue while working on this review comment.  When we
> > try to destroy the parallel setup via ExecEndNode (as due to Limit
> > Node, it could not destroy after consuming all tuples), it waits for
> > parallel
> > workers to finish (WaitForParallelWorkersToFinish()) and parallel workers
> > are waiting for master backend to signal them as their queue is full.
> > I think in such a case master backend needs to inform workers either when
> > the scan is discontinued due to limit node or while waiting for parallel
> > workers to finish.
>
> Isn't this why TupleQueueFunnelShutdown() calls shm_mq_detach()?
> That's supposed to unstick the workers; any impending or future writes
> will just return SHM_MQ_DETACHED without waiting.
>

Okay, that can work if we call it in ExecEndNode() before
WaitForParallelWorkersToFinish(); however, what if we want to do something
like TupleQueueFunnelShutdown() when the Limit node decides to stop processing
the outer node?  We could traverse the whole plan tree and find the nodes where
parallel workers need to be stopped, but I don't think that's a good way to
handle it.
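
For reference, a rough sketch of the shutdown-before-wait ordering being
discussed (FunnelState and its funnel/pcxt fields are placeholders following
the patch's naming, not settled code):

    /*
     * Detach the worker queues before waiting, so that a worker blocked
     * on a full queue gets SHM_MQ_DETACHED and can finish, instead of
     * deadlocking against WaitForParallelWorkersToFinish().
     */
    static void
    FunnelShutdownWorkers(FunnelState *node)
    {
        if (node->pcxt == NULL)
            return;             /* already shut down */

        /* Unstick workers: pending/future writes return SHM_MQ_DETACHED. */
        TupleQueueFunnelShutdown(node->funnel);

        /* Safe to wait now; no worker can block on a full queue. */
        WaitForParallelWorkersToFinish(node->pcxt);

        DestroyParallelContext(node->pcxt);
        node->pcxt = NULL;
    }
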
If we don't want to stop workers from processing until
ExecutorEnd()--->ExecEndNode(), then workers will keep running until that
point, and it won't be easy to get instrumentation/buffer usage information
from them (workers fill in such information for the master backend after
execution is complete), as that is collected before ExecutorEnd().  For
Explain Analyze, we can ensure that workers are stopped before fetching that
information from the Funnel node, but the same is not easy for the buffer
usage stats required by plugins, as those operate at the ExecutorRun() and
ExecutorFinish() level, where we don't have direct access to node-level
information.  You can refer to pgss_ExecutorEnd(), where it completes the
storage of stats information before calling ExecutorEnd().  Offhand, I could
not think of a good way to do this, but one crude way could be to introduce
a new API (ParallelExecutorEnd()) for such plugins, which would need to be
called before completing the stats accumulation.  This API would call
ExecEndPlan() if the parallelModeNeeded flag is set and allow accumulation
of stats (InstrStartNode()/InstrStopNode()).
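
To make that concrete, here is a rough, hypothetical sketch of how such a
plugin hook might look (ParallelExecutorEnd() does not exist yet; its
behaviour here is only the proposal above, and the pgss details are
simplified):

    /*
     * Hypothetical variant of pg_stat_statements' ExecutorEnd hook.
     * ParallelExecutorEnd() is assumed to stop the parallel workers and
     * fold their instrumentation/buffer usage into the master backend's
     * counters, so the numbers are complete when the stats are stored.
     */
    static void
    pgss_ExecutorEnd(QueryDesc *queryDesc)
    {
        if (queryDesc->totaltime && pgss_enabled())
        {
            /* Proposed new call: shut down workers, accumulate stats. */
            ParallelExecutorEnd(queryDesc);

            /* As today: finish timing before the stats are stored. */
            InstrEndLoop(queryDesc->totaltime);

            /* pgss_store() call as in the existing code, elided here. */
        }

        if (prev_ExecutorEnd)
            prev_ExecutorEnd(queryDesc);
        else
            standard_ExecutorEnd(queryDesc);
    }
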

>
> Well, all the workers restore that state in parallel, so adding it up
> across all workers doesn't really make sense.  But anyway, no, I don't
> think that's a big cost.  I think the big cost is going to the
> operating system overhead of process creation.  The new process will
> incur lots of page faults as it populates its address space and
> dirties pages marked copy-on-write.  That's where I expect most of the
> expense to be.
>

Okay, will remove parallel_startup_cost from the patch in the next version.



With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
