Re: Expose lock group leader pid in pg_stat_activity

Поиск
Список
Период
Сортировка
От Julien Rouhaud
Тема Re: Expose lock group leader pid in pg_stat_activity
Дата
Msg-id CAOBaU_ZPHkriFVYF=OrFyLsrXzcdzPizr0SPh+QdNBsHjT2GmQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Expose lock group leader pid in pg_stat_activity  (Michael Paquier <michael@paquier.xyz>)
Ответы Re: Expose lock group leader pid in pg_stat_activity
Список pgsql-hackers
On Thu, Jan 16, 2020 at 8:49 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Thu, Jan 16, 2020 at 04:27:27PM +0900, Michael Paquier wrote:
> > While looking at the code, I think that we could refactor things a bit
> > for raw_wait_event, wait_event_type and wait_event which has some
> > duplicated code for backend and auxiliary processes.  What about
> > filling in the wait event data after fetching the PGPROC entry, and
> > also fill in leader_pid for auxiliary processes.  This does not matter
> > now, perhaps it will never matter (or not), but that would make the
> > code much more consistent.
>
> And actually, the way you are looking at the leader's PID is visibly
> incorrect and inconsistent because the patch takes no shared LWLock on
> the leader using LockHashPartitionLockByProc() followed by
> LWLockAcquire(), no?  That's incorrect because it could be perfectly
> possible to crash with this code between the moment you check if
> lockGroupLeader is NULL and the moment you look at
> lockGroupLeader->pid if a process is being stopped in-between and
> removes itself from a lock group in ProcKill().  That's also
> inconsistent because it could be perfectly possible to finish with an
> incorrect view of the data while scanning for all the backend entries,
> like a leader set to NULL with workers pointing to the leader for
> example, or even workers marked incorrectly as NULL.  The second one
> may not be a problem, but the first one could be confusing.

Oh indeed.  But unless we hold some LWLock during the whole function
execution, we cannot guarantee a consistent view right?  And isn't it
already possible to e.g. see a parallel worker in pg_stat_activity
while all other queries are shown are idle, if you're unlucky enough?

Also, LockHashPartitionLockByProc requires the leader PGPROC, and
there's no guarantee that we'll see the leader before any of the
workers, so I'm unsure how to implement what you said.  Wouldn't it be
better to simply fetch the leader PGPROC after acquiring a shared
ProcArrayLock, and using that copy to display the pid, after checking
that we retrieved a non-null PGPROC?



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Julien Rouhaud
Дата:
Сообщение: Re: Expose lock group leader pid in pg_stat_activity
Следующее
От: Rui DeSousa
Дата:
Сообщение: Re: [HACKERS] kqueue