Re: POC: Extension for adding distributed tracing - pg_tracing

From Anthonin Bonnefoy
Subject Re: POC: Extension for adding distributed tracing - pg_tracing
Msg-id CAO6_Xqp83Rz7CZcb=OqrMxC07VUw08=NrCnVvjWN7arQYc4xNg@mail.gmail.com
In response to Re: POC: Extension for adding distributed tracing - pg_tracing  (Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>)
Responses Re: POC: Extension for adding distributed tracing - pg_tracing
List pgsql-hackers
Hi, 

Here's a new patch with changes from the previous discussion: 
- I'm now storing the duration directly in nanoseconds in the span instead of as an instr_time. Using the instr_time macros was a bit awkward, as the durations I generate don't necessarily have a starting and ending instr_time. Moving to plain nanoseconds made things clearer from my point of view.
- I've added an additional sample rate, pg_tracing.sample_rate (on top of pg_tracing.caller_sample_rate). It allows queries to be sampled even without trace information propagated from the caller. Setting this sample rate to 1 will basically trace everything. For now, this only works when going through the post-parse hook. I will add support for prepared statements and cached plans in the next patch.
- I've improved how parse spans are created. It's a bit challenging to get the start of a parse, as there's no pre-parse hook or instrumentation around parsing, so it's only an estimate.
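
To illustrate how the two sample rates above might combine, here is a minimal sketch. It is not taken from the patch: the SampleConfig struct, the should_sample function and the choice to reuse a single random draw for both checks are assumptions made for illustration. Only the two parameter names come from this mail.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the two settings named in the mail. */
typedef struct SampleConfig
{
	double		sample_rate;		/* pg_tracing.sample_rate */
	double		caller_sample_rate; /* pg_tracing.caller_sample_rate */
} SampleConfig;

/*
 * Decide whether a query should be traced (illustrative only).
 *
 * caller_sampled: true when the caller propagated trace information
 *                 with the sampled flag set.
 * rnd:            a uniform random value in [0, 1), passed in by the
 *                 caller so that the decision itself is deterministic.
 */
bool
should_sample(const SampleConfig *cfg, bool caller_sampled, double rnd)
{
	assert(rnd >= 0.0 && rnd < 1.0);

	/* The caller asked for a trace: apply the caller sample rate. */
	if (caller_sampled && rnd < cfg->caller_sample_rate)
		return true;

	/* Otherwise, a standalone trace can still start from sample_rate alone. */
	return rnd < cfg->sample_rate;
}
```

With sample_rate set to 1, the last comparison is true for every value in [0, 1), which matches the "trace everything" behaviour described above.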

Regards,
Anthonin

On Fri, Jul 28, 2023 at 4:06 PM Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com> wrote:
> What do you think about using INSTR_TIME_SET_CURRENT, INSTR_TIME_SUBTRACT and INSTR_TIME_GET_MILLISEC
> macros for timing calculations?
If you're talking about the two instances where I'm modifying the instr_time's ticks, it's because I can't use the macros there.
The first case is the parse span. I only have the start timestamp from GetCurrentStatementStartTimestamp and don't
have access to the start instr_time, so I need to build the duration from two timestamps.
The second case is when building node spans from the planstate. There, I get the duration directly from Instrumentation.

I guess one fix would be to use an int64 for the span duration and store nanoseconds directly instead of an instr_time,
but I do use the instrumentation macros outside of those two cases to get the duration of other spans.
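
As a rough sketch of the int64 approach for the two cases above: the Span layout and both helper names are hypothetical, and the TimestampTz typedef is a standalone stand-in that assumes microsecond resolution. It also assumes the duration taken from Instrumentation is a double expressed in seconds.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: stand-in for PostgreSQL's TimestampTz, counted in microseconds. */
typedef int64_t TimestampTz;

#define NS_PER_US INT64_C(1000)
#define NS_PER_S  INT64_C(1000000000)

/* Hypothetical span layout: the duration is a plain nanosecond count. */
typedef struct Span
{
	TimestampTz start;			/* span start, microseconds */
	int64_t		duration_ns;	/* span duration, nanoseconds */
} Span;

/* Case 1 (parse span): only two timestamps are available. */
int64_t
duration_from_timestamps(TimestampTz start, TimestampTz end)
{
	assert(end >= start);
	return (end - start) * NS_PER_US;
}

/* Case 2 (node span): elapsed time is already known, in seconds. */
int64_t
duration_from_seconds(double seconds)
{
	/* Round to the nearest nanosecond rather than truncating. */
	return (int64_t) (seconds * (double) NS_PER_S + 0.5);
}
```

Neither helper needs a starting and ending instr_time, which is the awkwardness the macros introduce in these two cases.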

> Also, have you thought about a way to trace existing (running) queries without directly instrumenting them?
That's a good point. I was focusing on leaving the sampling decision to the caller through the sampled flag and 
only recently added the pg_tracing_sample_rate parameter to give more control. It should be straightforward to 
add an option to create standalone traces based on sample rate alone. This way, setting the sample rate to 1 
would force the queries running in the session to be traced.


On Fri, Jul 28, 2023 at 3:02 PM Nikita Malakhov <hukutoc@gmail.com> wrote:
Hi!

What do you think about using INSTR_TIME_SET_CURRENT, INSTR_TIME_SUBTRACT and INSTR_TIME_GET_MILLISEC
macros for timing calculations?

Also, have you thought about a way to trace existing (running) queries without directly instrumenting them? It would
be much more usable for maintenance and support personnel, because in production environments you can rarely
change the query text directly. In its current state, the simplest solution is to switch tracing on and off by calling an SQL
function, and possibly to switch tracing for a prepared statement the same way, but not for any random query.

I'll check the patch for the race conditions.

--
Regards,
Nikita Malakhov
Postgres Professional
The Russian Postgres Company