Re: RFC: Logging plan of the running query

Поиск
Список
Период
Сортировка
От James Coleman
Тема Re: RFC: Logging plan of the running query
Дата
Msg-id CAAaqYe8LXVXQhYy3yT0QOHUymdM=uha0dJ0=BEPzVAx2nG1gsw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: RFC: Logging plan of the running query  (torikoshia <torikoshia@oss.nttdata.com>)
Ответы Re: RFC: Logging plan of the running query  (torikoshia <torikoshia@oss.nttdata.com>)
Список pgsql-hackers
On Mon, Jun 5, 2023 at 4:30 AM torikoshia <torikoshia@oss.nttdata.com> wrote:
>
> On 2023-06-03 02:51, James Coleman wrote:
> > Hello,
> >
> > Thanks for working on this patch!

Sure thing! I'm *very interested* in seeing this available, and I
think it paves the way for some additional features later on...

> > On Thu, Dec 8, 2022 at 12:10 AM torikoshia <torikoshia@oss.nttdata.com>
> ...
> > To put it positively: we believe that, for example, catalog accesses
> > inside CHECK_FOR_INTERRUPTS() -- assuming that the CFI call is inside
> > an existing valid transaction/query state, as it would be for this
> > patch -- are safe. If there were problems, then those problems are
> > likely bugs we already have in other CFI cases.
>
> Thanks a lot for the discussion and sharing it!
> I really appreciate it.
>
> BTW I'm not sure whether all the CFI are called in valid transaction,
> do you think we should check each of them?

I kicked off the regressions tests with a call to
ProcessLogQueryPlanInterrupt() in every single CHECK_FOR_INTERRUPTS()
call. Several hours and 52 GB of logs later I have confirmed that
(with the attached revision) at the very least the regression test
suite can't trigger any kind of failures regardless of when we trigger
this. The existing code in the patch for only running the explain when
there's an active query handling that.

> > Another concern Robert raised may apply here: what if a person tries
> > to cancel a query when we're explaining? I believe that's a reasonable
> > question to ask, but I believe it's a trade-off that's worth making
> > for the (significant) introspection benefits this patch would provide.
>
> Is the concern here limited to the case where explain code goes crazy
> as Robert pointed out?
> If so, this may be a trade-off worth doing.
> I am a little concerned about whether there will be cases where the
> explain code is not problematic but just takes long time.

The explain code could take a long time in some rare cases (e.g., we
discovered a bug a few years back with the planning code that actually
descends an index to find the max value), but I think the trade-off is
worth it.

> > On to the patch itself: overall I think it looks like it's in pretty
> > good shape. I also noticed we don't have any tests (I assume it'd have
> > to be TAP tests) of the actual output happening, and I think it would
> > be worth adding that.
>
> Checking the log output may be better, but I didn't add it since there
> is only a little content that can be checked, and similar function
> pg_log_backend_memory_contexts() does not do such type of tests.

Fair enough. I still think that would be an improvement here, but
others could also weigh in.

> > Are you interested in re-opening this patch? I'd be happy to provide
> > further review and help to try to push this along.
> Sure, I'm going to re-open this.

I've attached v27. The important change here in 0001 is that it
guarantees the interrupt handler is re-entrant, since that was a bug
exposed by my testing. I've also included 0002 which is only meant for
testing (it attempts to log in the plan in every
CHECK_FOR_INTERRUPTS() call).

Regards,
James

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Alexander Pyhalov
Дата:
Сообщение: Re: Partial aggregates pushdown
Следующее
От: Heikki Linnakangas
Дата:
Сообщение: Re: Let's make PostgreSQL multi-threaded