Re: Re: [COMMITTERS] pgsql: Add pg_terminate_backend() to allow terminating only a single

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: Re: [COMMITTERS] pgsql: Add pg_terminate_backend() to allow terminating only a single
Дата
Msg-id 802.1208283751@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: Re: [COMMITTERS] pgsql: Add pg_terminate_backend() to allow terminating only a single  (Bruce Momjian <bruce@momjian.us>)
Ответы Re: Re: [COMMITTERS] pgsql: Add pg_terminate_backend() to allow terminating only a single  (Bruce Momjian <bruce@momjian.us>)
Re: Re: [COMMITTERS] pgsql: Add pg_terminate_backend() to allow terminating only a single  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
Bruce Momjian <bruce@momjian.us> writes:
> Tom Lane wrote:
>> A more interesting question is what makes you think that taking
>> ProcArrayLock here has any value whatsoever.

> I took the lock so I would be sure the PGPROC array was the matching pid
> and not some other pid that was created between my check and the setting
> of the variable.

Unfortunately, that doesn't work at all, even disregarding the fact that
you forgot to make proc.c clear the flag when setting up a new process's
PGPROC.

The race condition would go away if ProcArrayRemove zeroed the pid
field, but I'm afraid that that might break something else; in
particular I think we still need to be able to manipulate LWLocks after
ProcArrayRemove, so I suspect this is not safe either.

I think the only really safe way to do it is to make a new procarray.c
function that searches like BackendPidGetProc, but hands the proc back
with the ProcArrayLock still held.  Or maybe we should just redefine
BackendPidGetProc that way.

There are other race conditions in this patch too; notably that lots of
places feel free to clear QueryCancelPending on-the-fly, thus possibly
letting them disregard a terminate signal.

Even assuming that we can fix the garden-variety bugs like these,
there's still a fundamental problem that an uncooperative user function
(particularly one in plperl/pltcl/plpython) can indefinitely delay
response to pg_terminate_backend.  Maybe that's okay, seeing that it can
similarly hold off or disregard QueryCancel, but I'm not sure the people
asking for this are really gonna be satisfied.  (One thing we should
consider is making ERRCODE_ADMIN_SHUTDOWN unconditionally untrappable
by plpgsql exception blocks, which'd at least fix the issue for plpgsql
functions.)

I'm also thinking that this doesn't fix the problem that we're relying
on die() to not leave shared memory in a bad state.  All it does is
restrict things so that we only try that from the main loop rather than
at any random CHECK_FOR_INTERRUPTS.  That certainly reduces the odds of
hitting a bug of this type, but I don't see that it can be said to make
them zero.

All in all, this patch wasn't ready to apply without review.  I'm
currently looking to see whether it's salvageable or not.
        regards, tom lane


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andrew Chernow
Дата:
Сообщение: Re: pulling libpqtypes from queue
Следующее
От: Tom Lane
Дата:
Сообщение: Re: pulling libpqtypes from queue