Re: Parallel Seq Scan

Поиск

Список

Период

Сортировка

От	Amit Kapila
Тема	Re: Parallel Seq Scan
Дата	6 мая 2015 г. 14:55:12
Msg-id	CAA4eK1+h7XxGBucabevTUNif49mvaZ=9yLCk-9Vu0q0cUFHNMQ@mail.gmail.com обсуждение исходный текст
Ответ на	Re: Parallel Seq Scan (Robert Haas <robertmhaas@gmail.com>)
Ответы	Re: Parallel Seq Scan (Robert Haas <robertmhaas@gmail.com>)
Список	pgsql-hackers

Дерево обсуждения

On Tue, Apr 28, 2015 at 5:37 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Fri, Apr 24, 2015 at 8:32 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

>> - I believe the separation of concerns between ExecFunnel() and
>> ExecEndFunnel() is not quite right. If the scan is shut down before
>> it runs to completion (e.g. because of LIMIT), then I think we'll call
>> ExecEndFunnel() before ExecFunnel() hits the TupIsNull(slot) path. I
>> think you probably need to create a static subroutine that is called
>> both as soon as TupIsNull(slot) and also from ExecEndFunnel(), in each
>> case cleaning up whatever resources remain.
>>

> Right, will fix as per suggestion.

I observed one issue while working on this review comment. When we

try to destroy the parallel setup via ExecEndNode (as due to Limit

Node, it could not destroy after consuming all tuples), it waits for parallel

workers to finish (WaitForParallelWorkersToFinish()) and parallel workers

are waiting for master backend to signal them as their queue is full.

I think in such a case master backend needs to inform workers either when

the scan is discontinued due to limit node or while waiting for parallel

workers to finish.

> >> - I don't think you need both setup cost and startup cost. Starting
> >> up more workers isn't particularly more expensive than starting up
> >> fewer of them, because most of the overhead is in waiting for them to
> >> actually start, and the number of workers is reasonable, then they're
> >> all be doing that in parallel with each other. I suggest removing
> >> parallel_startup_cost and keeping parallel_setup_cost.
> >
> > There is some work (like creation of shm queues, launching of workers)
> > which is done proportional to number of workers during setup time. I
> > have kept 2 parameters to distinguish such work. I think you have a
> > point that start of some or all workers could be parallel, but I feel
> > that still is a work proportinal to number of workers. For future
> > parallel operations also such a parameter could be useful where we need
> > to setup IPC between workers or some other stuff where work is proportional
> > to workers.
>
> That's technically true, but the incremental work involved in
> supporting a new worker is extremely small compare with worker startup
> times. I'm guessing that the setup cost is going to be on the order
> of hundred-thousands or millions and and the startup cost is going to
> be on the order of tens or ones.

Can we safely estimate the cost of restoring parallel state (GUC's,

combo CID, transaction state, snapshot, etc.) in each worker as a setup

cost? There could be some work like restoration of locks (acquire all or

relevant locks at start of parallel worker, if we follow your proposed design

and even if we don't follow that there could be some similar substantial work)

which could be substantial and we need to do the same for each worker.

If you think restoration of parallel state in each worker is a pretty

small work, then what you say makes sense to me.

> And I actually hope you *can't* present some contrary evidence.
> Because if you can, then that might mean that we need to cost every
> possible path from 0 up to N workers and let the costing machinery
> decide which one is better.

Not necesarally, we can follow a rule that number of workers

that need to be used for any parallel statement are equal to degree of

parallelism (parallel_seqscan_degree) as set by user. I think we

need to do some split up of number workers when there are multiple

parallel operations in single statement (like sort and parallel scan).

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

В списке pgsql-hackers по дате отправления:

Предыдущее

От: Pavel Stehule
Дата: 06 мая 2015 г., 14:43:31
Сообщение: is possible to upgrade from 9.2 to 9.4 with pg_upgrade

Следующее

От: Rui Hai Jiang
Дата: 06 мая 2015 г., 16:00:31
Сообщение: Where are the detoast function called in select * from table_name case?

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Parallel Seq Scan

Предыдущее

Следующее