Re: RFC: Logging plan of the running query

Поиск
Список
Период
Сортировка
От torikoshia
Тема Re: RFC: Logging plan of the running query
Дата
Msg-id e954285e4f6907fd479ebddf682acf52@oss.nttdata.com
обсуждение исходный текст
Ответ на Re: RFC: Logging plan of the running query  (Fujii Masao <masao.fujii@oss.nttdata.com>)
Ответы Re: RFC: Logging plan of the running query  (Fujii Masao <masao.fujii@oss.nttdata.com>)
Список pgsql-hackers
On 2022-01-27 20:18, Fujii Masao wrote:
> Here are another review comments:

Thanks for reviewing!

> +LOG:  plan of the query running on backend with PID 17793 is:
> 
> This seems not the same as what actually logged.

Modified.

> +        ereport(WARNING,
> +                (errmsg("PID %d is not a PostgreSQL server process", pid)));
> 
> Like commit 7fa945b857 changed, this warning message should be "PID %d
> is not a PostgreSQL backend process"?

Modified.

> +    if (SendProcSignal(pid, PROCSIG_LOG_CURRENT_PLAN, InvalidBackendId) < 
> 0)
> 
> proc->backendId should be specified instead of InvalidBackendId, to
> speed up the processing in SendProcSignal()?

Agreed. Modified.

> +    PROCSIG_LOG_CURRENT_PLAN, /* ask backend to log plan of the current 
> query */
> +volatile sig_atomic_t LogCurrentPlanPending = false;
> +extern void HandleLogCurrentPlanInterrupt(void);
> +extern void ProcessLogCurrentPlanInterrupt(void);
> 
> Isn't it better to use the names that are more consistent with the
> function name, i.e., pg_log_query_plan? For example,
> PROCSIG_LOG_QUERY_PLAN instead of PROCSIG_LOG_CURRENT_PLAN?

Agreed.
I removed 'current' from the variable and function names and used 
'query' instead.

> +        ereport(LOG_SERVER_ONLY,
> +                errmsg("backend with PID %d is not running a query",
> +                    MyProcPid));
> 
> errhidestmt(true) and errhidecontext(true) need to be added, don't
> they? Otherwise, for example, if pg_log_query_plan() is executed after
> debug_query_string is set but before ActiveQueryDesc is set, STATEMENT
> message would be output even though the message saying "not running a
> query" is output. Which seems confusing.

Agreed. Added errhidestmt(true) and errhidecontext(true).

> +    hash_seq_init(&status, GetLockMethodLocalHash());
> +    while ((locallock = (LOCALLOCK *) hash_seq_search(&status)) != NULL)
> +    {
> +        if (LOCALLOCK_LOCKTAG(*locallock) == LOCKTAG_PAGE)
> 
> Why did you use the search for local lock hash instead of
> IsPageLockHeld flag variable, to check whether a page lock is held or
> not? Because there is the corner case where the interrupt is processed
> after the local lock is registered into the hash but before
> IsPageLockHeld is enabled?

As far as I read CheckAndSetLockHeld(), IsPageLockHeld can be used only 
when USE_ASSERT_CHECKING is enabled.
Since removing USE_ASSERT_CHECKING from CheckAndSetLockHeld() would give 
performance impact on every granting/removing local lock, I used the 
search for local local hash.

> There is the case where the request to log a query plan is skipped
> even while the target backend is running a query. If this happens,
> users can just retry pg_log_query_plan(). These things should be
> documented?

Agreed.
Added following:

   +        Note that there is the case where the request to log a query
   +        plan is skipped even while the target backend is running a
   +        query due to lock conflict avoidance.
   +        If this happens, users can just retry pg_log_query_plan().

|

> +            ereport(LOG_SERVER_ONLY,
> +                errmsg("backend with PID %d is holding a page lock. Try again",
> +                    MyProcPid));
> 
> It seems more proper to output this message in DETAIL or HINT,
> instead. So how about something like the following messages?
> 
> LOG: could not log the query plan
> DETAIL: query plan cannot be logged while page level lock is being held
> HINT: Try pg_log_query_plan() after a few ....

Agreed.
I felt the HINT message 'after a few ...' is difficult to describe, and 
wrote as below.

| HINT: Retrying pg_log_query_plan() might succeed since the lock 
duration of page level locks are usually short

How do you think?

-- 
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION
Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Peter Smith
Дата:
Сообщение: Re: Column Filtering in Logical Replication
Следующее
От: Bharath Rupireddy
Дата:
Сообщение: pg_log_backend_memory_contexts - remove unnecessary test case