Re: [WIP] Patches to enable extraction state of query execution from external session

Поиск
Список
Период
Сортировка
От Maksim Milyutin
Тема Re: [WIP] Patches to enable extraction state of query execution from external session
Дата
Msg-id 21a197c5-7026-0915-d29f-bed598d7239c@postgrespro.ru
обсуждение исходный текст
Ответ на Re: [WIP] Patches to enable extraction state of query execution from external session  (Oleksandr Shulgin <oleksandr.shulgin@zalando.de>)
Ответы Re: [WIP] Patches to enable extraction state of query execution from external session  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
> On Tue, Aug 30, 2016 at 9:34 AM, Maksim Milyutin
> <m.milyutin@postgrespro.ru <mailto:m.milyutin@postgrespro.ru>> wrote:
>
>         On Mon, Aug 29, 2016 at 5:22 PM, maksim
>         <m.milyutin@postgrespro.ru <mailto:m.milyutin@postgrespro.ru>
>         <mailto:m.milyutin@postgrespro.ru
>         <mailto:m.milyutin@postgrespro.ru>>> wrote:
>
>             Hi, hackers!
>
>             Now I complete extension that provides facility to see the
>         current
>             state of query execution working on external session in form of
>             EXPLAIN ANALYZE output. This extension works on 9.5 version,
>         for 9.6
>             and later it doesn't support detailed statistics for
>         parallel nodes yet.
>
>             I want to present patches to the latest version of
>         PostgreSQL core
>             to enable this extension.
>
>         Hello,
>
>         Did you publish the extension itself yet?
>
>
>     Hello, extension for version 9.5 is available in repository
>     https://github.com/postgrespro/pg_query_state/tree/master
>     <https://github.com/postgrespro/pg_query_state/tree/master>.
>
>         Last year (actually, exactly one year ago) I was trying to do
>         something
>         very similar, and it quickly turned out that signals are not the
>         best
>         way to make this sort of inspection.  You can find the discussion
>         here:
>         https://www.postgresql.org/message-id/CACACo5Sz7G0MFauC082iM=XX_hQ7qQ5ndR4JPo+H-O5vp6iCcQ@mail.gmail.com
>         <https://www.postgresql.org/message-id/CACACo5Sz7G0MFauC082iM=XX_hQ7qQ5ndR4JPo+H-O5vp6iCcQ@mail.gmail.com>
>
>
>     Thanks for link!
>
>     My patch *custom_signal.patch* resolves the problem of «heavy»
>     signal handlers. In essence, I follow the course offered in
>     *procsignal.c* file. They define *ProcSignalReason* values on which
>     the SUGUSR1 is multiplexed. Signal recent causes setting flags for
>     *ProcessInterrupt* actuating, i.e. procsignal_sigusr1_handler() only
>     sets specific flags. When CHECK_FOR_INTERRUPTS appears later on
>     query execution *ProcessInterrupt* is called. Then triggered user
>     defined signal handler is executed.
>     As a result, we have a deferred signal handling.
>
>
> Yes, but the problem is that nothing gives you the guarantee that at the
> moment you decide to handle the interrupt, the QueryDesc structures
> you're looking at are in a good shape for Explain* functions to run on
> them.  Even if that appears to be the case in your testing now, there's
> no way to tell if that will be the case at any future point in time.

CHECK_FOR_INTERRUPTS are located in places where query state (QueryDesc 
structure) is more or less consistent. In these macro calls I pass 
QueryDesc to Explain* functions. I exactly know that elementary 
statistics updating functions (e.g. InstrStartNode, InstrStopNode, etc) 
don't contain CHECK_FOR_INTERRUPTS within itself therefore statistics at 
least on node level is consistent when backend will be ready to transfer 
its state.
The problem may be in interpretation of collected statistics in Explain* 
functions. In my practice there was the case when wrong number of 
inserted rows is shown under INSERT ON CONFLICT request. That request 
consisted of two parts: SELECT from table and INSERT with check on 
predicate. And I was interrupted between these parts. Formula for 
inserted rows was the number of extracting rows from SELECT minus 
rejected rows from INSERT. And I got imaginary inserted row. I removed 
the printing number of inserted rows under explain of running query 
because I don't know whether INSERT node has processed that last row. 
But the remaining number of rejected rows was deterministic and I showed it.

> Another problem is use if shm_mq facility, because it manipulates the
> state of process latch.  This is not supposed to happen to a backend
> happily performing its tasks, at random due to external factors, and
> this is what the proposed approach introduces

In Postgres source code the most WaitLatch() call on process latch is 
surrounded by loop and forms the pattern like this:
  for (;;)  {     if (desired_state_has_occured)       break;
     WaitLatch(MyLatch);     CHECK_FOR_INTERRUPTS();     ResetLatch(MyLatch)  }

The motivation of this decision is pretty clear illustrated by the 
extract from comment in Postgres core:

usage of "the generic process latch has to be robust against unrelated 
wakeups: Always check that the desired state has occurred, and wait 
again if not"[1].

I mean that random setting of process latch at the time of query 
executing don't affect on another usage of that latch later in code.


1. src/backend/storage/lmgr/proc.c:1724 for ProcWaitForSignal function

-- 
Maksim Milyutin
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Missing checks when malloc returns NULL...
Следующее
От: Fabrízio de Royes Mello
Дата:
Сообщение: Re: PostgreSQL 10 kick-off