Re: Parallel Seq Scan

From: David Rowley
Subject: Re: Parallel Seq Scan
Date:
Msg-id: CAApHDvrgoNwFS4yraDLLcKqLzHcKPyEDMK6n=OVTZbMaRNGURQ@mail.gmail.com
In reply to: Re: Parallel Seq Scan  (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: Parallel Seq Scan  (Amit Kapila <amit.kapila16@gmail.com>)
List: pgsql-hackers
On 21 April 2015 at 06:26, Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Apr 8, 2015 at 3:34 AM, David Rowley <dgrowleyml@gmail.com> wrote:
> In summary it sounds like with my idea we get:
>
> Pros
> * Optimal plan if no workers are available at execution time.
> * Parallelism possible if the chosen optimal plan happens to support
> parallelism, e.g not index scan.
> * No planning overhead

The third one isn't really true.  You've just moved some of the
planning to execution time.


Hmm, sorry, I meant no planner overhead during normal planning.
What I was driving at is that low-cost queries shouldn't have to pay the price of the planner considering parallel paths. This "parallelizer" that I keep talking about would only be asked to do anything if the root node's cost was above some GUC like parallel_cost_threshold, and a likely default for this would be a cost that translates into a query taking, say, roughly anything over 1 second. That way super-fast 1 millisecond plans don't suffer any extra time spent considering parallel paths. Once we're processing queries above that threshold, the cost of invoking the parallelizer would be drowned out by the actual execution cost anyway.
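To make the gating idea concrete, here's a rough C sketch. The GUC name parallel_cost_threshold comes from the paragraph above, but the default value and function name are purely illustrative assumptions, not anything from a patch:

```c
#include <stdbool.h>

/* Hypothetical GUC; name taken from the discussion above, default value
 * is just an illustrative stand-in for "roughly a 1 second query". */
static double parallel_cost_threshold = 10000.0;

/* Gate for the proposed parallelizer: only plans whose root node cost
 * exceeds the threshold get looked at, so cheap 1 ms plans pay nothing. */
static bool
should_consider_parallel(double root_total_cost)
{
	return root_total_cost >= parallel_cost_threshold;
}
```

The point being that the check itself is a single comparison, so the common fast-query case is effectively free.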

 
> Cons:
> * The plan "Parallelizer" must make changes to the plan just before
> execution time, which ruins the 1 to 1 ratio of plan/executor nodes by the
> time you inject Funnel nodes.
>
> If we parallelise during planning time:
>
> Pros
> * More chance of getting a parallel friendly plan which could end up being
> very fast if we get enough workers at executor time.

This, to me, is by far the biggest "con" of trying to do something at
execution time.  If planning doesn't take into account the gains that
are possible from parallelism, then you'll only be able to come up
with the best parallel plan when it happens to be a parallelized
version of the best serial plan.  So long as the only parallel
operator is parallel seq scan, that will probably be a common
scenario.  But once we assemble a decent selection of parallel
operators, and a reasonably intelligent parallel query optimizer, I'm
not so sure it'll still be true.


I agree with that. It's a tough one. 
I was hoping that this might be offset by the fact that we wouldn't have to pay the high price of the planner producing a parallel plan when the executor has no spare workers to execute it as intended. We also wouldn't have to be nearly as conservative with the max_parallel_degree GUC: it could just be set to the number of logical CPUs in the machine, and we could use that value minus the number of active backends at execution time.
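The worker-count arithmetic I have in mind is nothing more than this (the helper name is hypothetical; max_parallel_degree here stands in for "number of logical CPUs" as described above):

```c
/* Hypothetical sketch: how many workers could this query grab right now?
 * max_parallel_degree would be set to the machine's logical CPU count,
 * and we subtract the backends already busy, clamping at zero. */
static int
spare_workers(int max_parallel_degree, int active_backends)
{
	int		spare = max_parallel_degree - active_backends;

	return (spare > 0) ? spare : 0;
}
```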
 
> Cons:
> * May produce non optimal plans if no worker processes are available during
> execution time.
> * Planning overhead for considering parallel paths.
> * The parallel plan may blow out buffer caches due to increased I/O of
> parallel plan.
>
> Of course please say if I've missed any pro or con.

I think I generally agree with your list; but we might not agree on
the relative importance of the items on it.


I've also been thinking about how, instead of having a special PartialSeqScan node containing a bunch of code to store tuples in a shared memory queue, we could have a "TupleBuffer" or "ParallelTupleReader" node, one of which would always be the root node of a plan branch that's handed off to a worker process. This node would just try to keep its shared tuple store full; once the store fills, it could sleep a little and be woken when there's a bit more space on the queue. When no more tuples were available from the node below it, the worker could exit (providing no rescan was required).
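In toy C form, the worker loop I'm imagining looks something like the following. Everything here is a sketch of the idea, not real executor code: the queue is a trivial in-process ring buffer standing in for the shared memory store, and next_tuple_from_subplan() stands in for pulling from the plan node below the reader:

```c
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_SIZE 8

/* Toy stand-in for the shared tuple store; names are illustrative only. */
typedef struct TupleQueue
{
	int		buf[QUEUE_SIZE];
	size_t	head;
	size_t	tail;
} TupleQueue;

static bool queue_full(const TupleQueue *q) { return q->tail - q->head == QUEUE_SIZE; }
static void queue_push(TupleQueue *q, int tup) { q->buf[q->tail++ % QUEUE_SIZE] = tup; }

/* Stand-in for the plan node below the reader: produces five dummy
 * tuples, then reports that the subplan is exhausted. */
static int subplan_counter = 0;
static bool
next_tuple_from_subplan(int *tup)
{
	if (subplan_counter >= 5)
		return false;
	*tup = subplan_counter++;
	return true;
}

/* The ParallelTupleReader loop: keep the shared store full, wait when
 * there's no room, and exit once the subplan below runs dry (assuming
 * no rescan is required). */
static void
parallel_tuple_reader(TupleQueue *q)
{
	int		tup;

	while (next_tuple_from_subplan(&tup))
	{
		while (queue_full(q))
			;	/* real code would sleep and be woken when space frees up */
		queue_push(q, tup);
	}
}
```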

I think between the Funnel node and a ParallelTupleReader we could actually parallelise plans that don't even have parallel-safe nodes... Let me explain:

Let's say we have a 4-way join where the join order must be {a,b}, {c,d} => {a,b,c,d}. Assuming the costs of joining a to b and c to d are around the same, the parallelizer may notice this and decide to inject a Funnel and then a ParallelTupleReader just below the node for c join d, so that c is joined to d in parallel. Meanwhile the main process could be executing the root node as normal. This way the main process wouldn't have to go to the trouble of joining c to d itself, as the worker would have done all that hard work.

I know the current patch is still very early in the evolution of PostgreSQL's parallel query, but how would that work with the current method of selecting which parts of the plan to parallelise? I really think the plan needs to be a complete plan before it can best be analysed for how to divide the workload between workers. It would also be quite useful to know how many workers are going to be able to lend a hand, in order to know how best to divide the plan up as evenly as possible.

Apologies if this seems like complete rubbish, or like parallel query mark 3 when we're not yet done with mark 1. I just can't see how, with the current approach, we could parallelise normal plans like the 4-way join I describe above, and I think it would be a shame if we developed down a path that made this impossible.

Regards

David Rowley
 
