Re: RFC: replace pg_stat_activity.waiting with something more descriptive

Поиск
Список
Период
Сортировка
От Amit Kapila
Тема Re: RFC: replace pg_stat_activity.waiting with something more descriptive
Дата
Msg-id CAA4eK1+DSd2Kv2dbGfKVDbzW_ap11=sY9=Xca7z4nbSM4eB1vA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: RFC: replace pg_stat_activity.waiting with something more descriptive  (Robert Haas <robertmhaas@gmail.com>)
Ответы Re: RFC: replace pg_stat_activity.waiting with something more descriptive  (Robert Haas <robertmhaas@gmail.com>)
Список pgsql-hackers
On Wed, Feb 24, 2016 at 7:14 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Feb 22, 2016 at 10:05 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> I wouldn't bother tinkering with it at this point.  The value isn't
>> going to be recorded on disk anywhere, so it will be easy to change
>> the way it's computed in the future if we ever need to do that.
>>
>
> Okay. Find the rebased patch attached with this mail.  I have moved
> this patch to upcoming CF.

I would call the functions pgstat_report_wait_start() and
pgstat_report_wait_end() instead of pgstat_report_start_waiting() and
pgstat_report_end_waiting().

I think pgstat_get_wait_event_type should not return HWLock, a term
that appears nowhere in our source tree at present.  How about just
"Lock"?

I think the wait event types should be documented - and the wait
events too, perhaps.

Maybe it's worth having separate wait event type names for lwlocks and
lwlock tranches.  We could report LWLockNamed and LWLockTranche and
document the difference: "LWLockNamed indicates that the backend is
waiting for a specific, named LWLock.  The event name is the name of
that lock.  LWLockTranche indicates that the backend is waiting for
any one of a group of locks with similar function.  The event name
identifies the general purpose of locks in that group."


Agreed with all the above points and will change the patch accordingly.
 
There's no requirement that every session have every tranche
registered.  I think we should consider displaying "extension" for any
tranche that's not built-in, or at least for tranches that are not
registered (rather than "unknown wait event").


I think it is better to display as an "extension" for unregistered tranches,
but do you see any case where we will have wait event information
for any unregistered tranche?

Another point to consider is that if it is not possible to have wait event
for unregisteredtranche, then should we have Assert or elog(ERROR)
instead of "unknown wait event"?

 
+       if (lock->tranche == 0 && lockId < NUM_INDIVIDUAL_LWLOCKS)

Isn't the second part automatically true at this point?


Yes, that point is automatically true and I think we should change
the same check in PRINT_LWDEBUG and LOG_LWDEBUG, although
as separate patches.

 

The changes to LockBufferForCleanup() don't look right to me.  Waiting
for a buffer pin might be a reasonable thing to define as a wait
event, but it shouldn't reported as if we were waiting on the LWLock
itself.

makes sense, how about having a new wait class, something like
WAIT_BUFFER and then have wait event type as Buffer and
wait event as BufferPin.  At this moment, I think there will be
only one event in this class, but it seems to me waiting on buffer
has merit to be considered as a separate class.  
 

What happens if an error is thrown while we're in a wait?


For LWLocks, only FATAL error is possible which will anyway lead
to initialization of all backend states.  For lock.c, if an error is
thrown, then state is reset in Catch block. In
LockBufferForCleanup(), after we set the wait event and before we
reset it, there is only a chance of FATAL error, if any system call
fails.  We do have one error in enable_timeouts which is called from
ResolveRecoveryConflictWithBufferPin(), but that doesn't seem to
be possible.  Now one question to answer is that what if tomorrow
some one adds new error after we set the wait state, so may be
it is better to clear wait event in AbortTransaction()?

 
Does this patch hurt performance?

 
This patch adds additional code in the path where we are going
to sleep/wait, but we have changed some shared memory structure, so
it is good to verify performance tests.  I am planning to run read-only
and read-write pgbench tests (when data fits in shared), is that
sufficient or do you expect anything more?


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amit Kapila
Дата:
Сообщение: Re: Prepared Statement support for Parallel query
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: The plan for FDW-based sharding