Re: group locking: incomplete patch, just for discussion


From	Robert Haas
Subject	Re: group locking: incomplete patch, just for discussion
Date	October 31, 2014, 12:55:06
Msg-id	CA+TgmoZzdGh-OkCQHW_P5ViqaEjqi3j=vS3p1=4OWpcbWC6t6g@mail.gmail.com
In response to	Re: group locking: incomplete patch, just for discussion (Simon Riggs <simon@2ndQuadrant.com>)
Responses	Re: group locking: incomplete patch, just for discussion
	Re: group locking: incomplete patch, just for discussion
List	pgsql-hackers


On Fri, Oct 31, 2014 at 6:41 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Is it genuinely required for most parallel operations? I think it's
> clear that none of us knows the answer. Sure, the general case needs
> it, but is the general case the same thing as the reasonably common
> case?

Well, I think that the answer is pretty clear.  Most of the time,
perhaps in 99.9% of cases, group locking will make no difference as to
whether a parallel operation succeeds or fails.  Occasionally,
however, the lack of it will cause an undetected deadlock.  I don't
hear anyone arguing that that's OK.  Does anyone wish to make that
argument?
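
To make "undetected deadlock" concrete, here is a minimal Python sketch of
one way it could arise.  The scenario is an assumption for illustration and
is not taken from the patch: the user backend (called "leader" here) holds a
lock and waits for its worker to finish, while the worker waits for a lock
that conflicts with the leader's.  The names has_cycle, lock_edges and
hidden_edges are hypothetical.  A detector that sees only lock waits finds
no cycle, even though the complete wait-for graph contains one.

```python
from typing import Dict, Set


def has_cycle(wait_for: Dict[str, Set[str]]) -> bool:
    """Return True if the wait-for graph contains a cycle (a deadlock)."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    nodes = set(wait_for) | {n for targets in wait_for.values() for n in targets}
    colour = {node: WHITE for node in nodes}

    def visit(node: str) -> bool:
        colour[node] = GREY
        for nxt in wait_for.get(node, ()):
            if colour[nxt] == GREY:  # back edge: we returned to the current path
                return True
            if colour[nxt] == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour[n] == WHITE and visit(n) for n in nodes)


# Wait visible to a lock-based detector (assumed scenario):
# the worker is blocked on a lock that the leader holds.
lock_edges = {"worker": {"leader"}}

# Wait that is not a lock wait: the leader is blocked until the worker finishes.
hidden_edges = {"leader": {"worker"}}

# The complete picture is the union of both kinds of waits.
full_graph = {
    proc: lock_edges.get(proc, set()) | hidden_edges.get(proc, set())
    for proc in lock_edges.keys() | hidden_edges.keys()
}

detector_sees_deadlock = has_cycle(lock_edges)
deadlock_really_exists = has_cycle(full_graph)
```

In this toy model the detector's view has no cycle while the full graph
does, which is the sense in which the deadlock goes undetected.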

If not, then we must prevent it.  The only design, other than what
I've proposed here, that seems like it will do that consistently in
all cases is to have the user backend lock every table that the child
backend might possibly want to lock and retain those locks throughout
the entire duration of the computation whether the child would
actually need those locks or not.  I think that could be made to work,
but there are two problems:

1. Turing's theorem being what it is, predicting what catalog tables
the child might lock is not necessarily simple.

2. It might end up taking many more locks than necessary and holding
them for much longer than necessary.  Right now, for example, a
syscache lookup locks the table only if we actually need to read from
it and releases the lock as soon as the actual read is complete.
Under this design, every syscache that the parallel worker might
conceivably consult needs to be locked for the entire duration of the
parallel computation.  I would expect this to provoke a violent
negative reaction from at least one prominent community member (and I
bet users wouldn't like it much, either).
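
The cost difference in point 2 can be sketched with a toy model.  This is
only an illustration under made-up assumptions: the catalog names, timings
and the "lock-seconds" metric are hypothetical and are not measurements of
PostgreSQL.  It compares locking a catalog only while a lookup reads it
against locking every catalog the worker might touch for the whole
computation.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Lookup:
    catalog: str     # catalog table read by this lookup (hypothetical name)
    duration: float  # seconds the read actually takes (made-up figure)


def on_demand_lock_seconds(lookups: Iterable[Lookup]) -> float:
    """Lock only while reading: total held time is the sum of read times."""
    return sum(lookup.duration for lookup in lookups)


def preemptive_lock_seconds(possible_catalogs: List[str],
                            computation_seconds: float) -> float:
    """Lock everything up front: each catalog is held for the whole run."""
    return len(possible_catalogs) * computation_seconds


# Made-up workload: two short catalog reads during a 10-second computation.
actual_lookups = [Lookup("pg_operator", 0.25), Lookup("pg_type", 0.5)]

# Catalogs the worker *might* consult, whether or not it ends up doing so.
might_touch = ["pg_operator", "pg_type", "pg_proc", "pg_amop"]

on_demand = on_demand_lock_seconds(actual_lookups)
preemptive = preemptive_lock_seconds(might_touch, 10.0)
```

With these invented numbers the preemptive strategy holds locks for far
longer in total, and the gap grows with both the number of catalogs that
might be consulted and the length of the computation.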

So, I am still of the opinion that group locking makes sense.   While
I appreciate the urge to avoid solving difficult problems where it's
reasonably avoidable, I think we're in danger of spending more effort
trying to avoid solving this particular problem than it would take to
actually solve it.  Based on what I've done so far, I'm guessing that
a complete group locking patch will be between 1000 and 1500 lines of
code and that nearly all of the new logic will be skipped when it's
not in use (i.e. no parallelism).  That sounds to me like a hell of a
deal compared to trying to predict what locks the child might
conceivably take and preemptively acquire them all, which sounds
annoyingly tedious even for simple things (like equality operators)
and nearly impossible for anything more complicated.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
