Re: max_parallel_degree context level

From: Robert Haas
Subject: Re: max_parallel_degree context level
Msg-id: CA+TgmoYzmhaMPnqB0YCzVST3xfgg2UBJV3fWi2aN58HQh04e4Q@mail.gmail.com
In reply to: Re: max_parallel_degree context level  (Simon Riggs <simon@2ndQuadrant.com>)
Responses: Re: max_parallel_degree context level  (Joe Conway <mail@joeconway.com>)
           Re: max_parallel_degree context level  (David Rowley <david.rowley@2ndquadrant.com>)
List: pgsql-hackers
On Thu, Feb 11, 2016 at 10:32 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> A few questions and thoughts to help decide...
>
> Does it take into account the parallel degree during planning?
> Does it take into account the actual parallel degree during planning?

max_parallel_degree is a query planner GUC, just like work_mem.  Just
as we can't know at planning time how much memory will be available at
execution time, we can't know how many worker processes will be
available at execution time.  In each case, we have a GUC that tells
the system what to assume.  In each case also, some better model might
be possible, but today we don't have it.
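
To make that concrete, here is an illustrative sketch using the GUC
names from the 9.6 development cycle under discussion (in later
releases max_parallel_degree was renamed to
max_parallel_workers_per_gather, and pgbench_accounts is only a
stand-in table):

    -- Tell the planner how many workers to assume, per session,
    -- much as you would tune work_mem for a large sort or hash.
    SET max_parallel_degree = 4;
    SET work_mem = '64MB';

    -- The plan reflects the assumption; whether that many workers
    -- actually start is decided at execution time.
    EXPLAIN (VERBOSE) SELECT count(*) FROM pgbench_accounts;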

> If you make max_worker_processes USERSET won't everybody just set it to
> max_worker_processes?

I think that you meant for the first instance of max_worker_processes
in that sentence to be max_parallel_degree.  I'll respond as if that's
what you meant.  Basically, I think this is like asking whether everybody
won't just set work_mem to the entire amount of free memory on the
machine and try to use it all themselves.  We really have never tried
very hard to prevent that sort of thing in PostgreSQL.  Maybe we
should, but we'd have to fix an awful lot of stuff.  There are many
ways for malicious users to do things that interfere with the ability
of other users to use the system.  I admit that the same problem
exists here, but I don't think it's any more severe than any of the
cases that already exist.  In some ways I think it's a whole lot LESS
serious than what a bad work_mem setting can do to your system.
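
For what it's worth, an administrator can at least set per-role
defaults for these planner GUCs, though a USERSET GUC can still be
overridden by the user in their own session; a minimal sketch
(reporting_user is a hypothetical role name):

    -- Per-role defaults; these are starting values, not hard caps,
    -- because a USERSET GUC can be overridden within the session.
    ALTER ROLE reporting_user SET work_mem = '32MB';
    ALTER ROLE reporting_user SET max_parallel_degree = 2;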

> How does the server behave when less servers are available than
> max_parallel_degree?

The same query plan is executed with fewer workers, even with 0
workers.  If we chose a parallel plan that is a mirror of the
non-parallel plan we would have chosen, this doesn't cost much.  If
there's some other non-parallel plan that would be much faster and we
only picked this parallel plan because we thought we would have
several workers available, and then we get fewer or none, that might
be expensive.  One can imagine a system that always computes both a
parallel plan and a non-parallel plan and chooses between them at
runtime, or even multiple plans for varying numbers of workers, but we
don't have that today.  I am not actually sure it would be worth it.
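
The gap between the assumption and reality is visible in the Gather
node of EXPLAIN (ANALYZE, VERBOSE) output, which reports both the
planned and the launched worker counts; a sketch (big_table is just a
placeholder name):

    -- With ANALYZE, the Gather node shows both figures, e.g.:
    --   Workers Planned: 4
    --   Workers Launched: 1
    EXPLAIN (ANALYZE, VERBOSE) SELECT count(*) FROM big_table;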

Basically, I think this comes back to the analogy between
max_parallel_degree and work_mem.  If you set work_mem too high and
the system starts swapping and becomes very slow, that's your fault
(we say) for setting an unreasonable value of work_mem.  Similarly, if
you set max_parallel_degree to an unreasonable value such that the
system is unlikely to be able to obtain that number of workers at
execution time, you have configured your query planner settings
poorly.  This is no different than setting random_page_cost lower than
seq_page_cost or any number of other dumb things you could do.
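
In the same spirit, a quick sanity check is to keep the per-query
assumption in line with the cluster-wide worker pool; a hedged sketch
(the numbers are only examples):

    -- max_worker_processes is the cluster-wide pool of background
    -- workers (changing it requires a restart); max_parallel_degree
    -- should assume only a realistic share of that pool.
    SHOW max_worker_processes;      -- say this reports 8
    SET max_parallel_degree = 4;    -- a plausible per-query assumption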

> Is it slower if you request N workers, yet only 1 is available?

I sure hope so.  There may be some cases where more workers are slower
than fewer workers, but those cases are defects that we should try to
fix.

> Does pg_stat_activity show the number of parallel workers active for a
> controlling process?
> Do parallel workers also show in pg_stat_activity at all?
> If so, does it show who currently has them?
> Does pg_stat_statements record how many workers were available during
> execution?

Background workers show up in pg_stat_activity, but the number of
workers used by a parallel query isn't reported anywhere.  It's
usually pretty easy to figure out from the EXPLAIN (ANALYZE, VERBOSE)
output, but clearly there might be some benefit in reporting it to
other monitoring facilities.  I hadn't really thought about that idea
before, but it's a good thought.
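
For reference, parallel workers can be spotted in pg_stat_activity
while a query runs; on releases that have the backend_type column
(PostgreSQL 10 and later) they are labelled explicitly, while on 9.6
you have to infer them from the query text.  A sketch:

    -- On PostgreSQL 10 and later, parallel workers are labelled:
    SELECT pid, backend_type, query
    FROM pg_stat_activity
    WHERE backend_type = 'parallel worker';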

> Is there a way to prevent execution if too few parallel workers are
> available?

No. That might be a useful feature, but I don't have any plans to
implement it myself.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


