Re: On-demand running query plans using auto_explain and signals

Поиск
Список
Период
Сортировка
От Shulgin, Oleksandr
Тема Re: On-demand running query plans using auto_explain and signals
Дата
Msg-id CACACo5SseKsRBM0i__Ypg62kwGT-WF3a=ncE43yvhPbBkzkxKQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: On-demand running query plans using auto_explain and signals  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Ответы Re: On-demand running query plans using auto_explain and signals  (Pavel Stehule <pavel.stehule@gmail.com>)
Re: On-demand running query plans using auto_explain and signals  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Список pgsql-hackers
On Sat, Sep 12, 2015 at 11:50 AM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
Hi,

I did a quick initial review of this patch today, so here are my comments so far:

Hi Tomas,

First of all, thanks for the review!

- ipcs.c should include utils/cmdstatus.h (the compiler complains
  about implicit declaration of two functions)

Correct, though the file name is ipci.c.

- Attempts to get plan for simple insert queries like this

      INSERT INTO x SELECT * FROM x;

  end with a segfault, because ActivePortal->queryDesc is 0x0 for this
  query. Needs investigation.

Yes, I've hit the same problem after submitting the latest version of the patch.  For now I've just added a check for queryDesc being not NULL, but I guess the top of the current_query_stack might contains something useful.  Something I need to try.

- The lockless approach seems fine to me, although I think the fear
  of performance issues is a bit moot (I don't think we expect large
  number of processes calling pg_cmdstatus at the same time). But
  it's not significantly more complex, so why not.

I believe the main benefit of the less-locking approach is that if something goes wrong when two backends tried to communicate it doesn't prevent the rest of them from operating, because there is no shared (and thus locked) communication channel.

- The patch contains pretty much no documentation, both comments
  at the code level and user docs. The lack of user docs is not that
  a big deal at this point (although the patch seems to be mature
  enough, although the user-level API will likely change).

  The lack of code comments is more serious, as it makes the review
  somewhat more difficult. For example it'd be very nice to document
  the contract for the lock-less interface.

I will add the code comments.  The user docs could wait before we decide on the interface, I think.

- I agree that pg_cmdstatus() is not the best API. Having something
  like EXPLAIN PID would be nice, but it does not really work for
  all the request types (just CMD_STATUS_REQUEST_EXPLAIN). Maybe
  there's not a single API for all cases, i.e. we should use EXPLAIN
  PID for one case and invent something different for the other?

I can think of something like:

EXPLAIN [ ( option [, ...] ) ] PROCESS <PID>;

where option is extended with:

  QUERY
  PROGRESS
  BACKTRACE

in addition to the usual ANALYZE, VERBOSE, FORMAT, etc.

- Is there a particular reason why we allocate slots for auxiliary
  processes and not just for backends (NumBackends)? Do we expect those
  auxiliary processes to ever use this API?

If we extend the interface to a more general one, there still might be some space for querying status of checkpointer of bgwriter.

- CleanupCmdStatusSlot seems needlessly complicated. I don't quite see
  the need for the second argument, or the internal slot variable. Why
  not to simply use the MyCmdStatusSlot directly?

Good point.

- I also don't quite understand why we need to track css_pid for the
  slot? In what scenario will this actually matter?

I think it's being only used for error reporting and could help in debugging, but for now that's it.

- While being able to get EXPLAIN from the running process is nice,
  I'm especially interested in getting EXPLAIN ANALYZE to get insight
  into the progress of the execution. The are other ways to get the
  EXPLAIN, e.g. by opening a different connection and actually running
  it (sure, the plan might have changed since then), but currently
  there's no way to get insight into the progress.

  From the thread I get the impression that Oleksandr also finds this
  useful - correct? What are the plans in this direction?

  ISTM we need at least two things for that to work:

  (a) Ability to enable instrumentation on all queries (effectively
      what auto_explain allows), otherwise we can't get EXPLAIN ANALYZE
      on the queries later. But auto_explain is an extension, so that
      does not seem as a good match if this is supposed to be in core.
      In that case a separate GUC seems appropriate.

  (b) Being able to run the InstrEnd* methods repeatedly - the initial
      message in this thread mentions issues with InstrEndLoop for
      example. So perhaps this is non-trivial.

I was able to make this work with a simple change to InstrEndLoop and the callers.  Basically, adding a bool parameter in_explain and passing an appropriate value.  I guess that's not the best approach, but it appears to work.

Adding a GUC to enable instrumentation sounds reasonable.

Do you believe it makes sense to add instrumentation support in this same patch or better focus on making the simplest thing work first?

- And finally, I think we should really support all existing EXPLAIN
  formats, not just text. We need to support the other formats (yaml,
  json, xml) if we want to use the EXPLAIN PID approach, and it also
  makes the plans easier to process by additional tools.

Sure, that was in my plans (and see above for possible syntax).  What would be really neat is retrieving the complete backtrace.  Not sure what the good interface would look like, but using JSON format for the output sounds promising.

--
Alex

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Dean Rasheed
Дата:
Сообщение: Re: RLS open items are vague and unactionable
Следующее
От: Kyotaro HORIGUCHI
Дата:
Сообщение: Re: extend pgbench expressions with functions