Re: Logging parallel worker draught

From	Imseih (AWS), Sami
Subject	Re: Logging parallel worker draught
Date	11 October 2023, 18:26:49
Msg-id	9E9E69BD-BABB-49BB-8B69-61939179F20D@amazon.com
In response to	Re: Logging parallel worker draught (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Responses	Re: Logging parallel worker draught (Benoit Lobréau <benoit.lobreau@dalibo.com>)
List	pgsql-hackers

>> Currently explain ( analyze ) will give you the "Workers Planned"
>> and "Workers launched". Logging this via auto_explain is possible, so I am
>> not sure we need additional GUCs or debug levels for this info.
>>
>> -> Gather (cost=10430.00..10430.01 rows=2 width=8) (actual time=131.826..134.325 rows=3 loops=1)
>> Workers Planned: 2
>> Workers Launched: 2

> I don't think autoexplain is a good substitute for the originally
> proposed log line. The possibility for log bloat is enormous. Some
> explain plans are gigantic, and I doubt people can afford that kind of
> log traffic just in case these numbers don't match.

Correct, that is a downside of auto_explain in general. 

The logging traffic can be controlled by 
auto_explain.log_min_duration/auto_explain.sample_rate/etc.
of course. 
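
As a rough illustration of those knobs, a postgresql.conf fragment might look like the one below. The values are arbitrary placeholders chosen for the example, not recommendations, and loading the module via session_preload_libraries is only one of the possible ways to enable it.

```
# postgresql.conf (illustrative values only)
session_preload_libraries = 'auto_explain'
auto_explain.log_min_duration = '5s'   # only log plans of statements that run at least 5 s
auto_explain.log_analyze = on          # needed to get actual figures such as "Workers Launched"
auto_explain.log_timing = off          # skip per-node timing to reduce log_analyze overhead
auto_explain.sample_rate = 0.05        # explain only about 5% of statements in each session
```

Even with these limits, each logged entry still carries the whole plan, which is the bloat concern raised above.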

> Well, if you read Benoit's earlier proposal at [1] you'll see that he
> does propose to have some cumulative stats; this LOG line he proposes
> here is not a substitute for stats, but rather a complement.  I don't
> see any reason to reject this patch even if we do get stats.

> Also, we do have a patch on stats, by Sotolongo and Bonne here [2].  I

Thanks. I will review the threads in depth and see if the ideas can be combined
in a comprehensive proposal.

Regarding the current patch, the latest version removes the separate GUC,
but the user should be able to control this behavior. 

With log_min_error_statement at its default level of "error", the query text is logged along with this LOG message.

This could be especially problematic when a query runs more than one Parallel
Gather node that is in draught. In those cases each node generates its own
log entry with the statement text, so a single query execution could end up
with multiple log lines carrying the same statement text.

For example:
LOG:  Parallel Worker draught during statement execution: workers spawned 0, requested 2
STATEMENT:  select (select count(*) from large) as a, (select count(*) from large) as b, (select count(*) from large) as c ;

LOG:  Parallel Worker draught during statement execution: workers spawned 0, requested 2
STATEMENT:  select (select count(*) from large) as a, (select count(*) from large) as b, (select count(*) from large) as c ;

LOG:  Parallel Worker draught during statement execution: workers spawned 0, requested 2
STATEMENT:  select (select count(*) from large) as a, (select count(*) from large) as b, (select count(*) from large) as c ;

I wonder if it would be better to accumulate the total number of workers planned and
the total number of workers launched, and log this information once at the end of execution?
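
To make the accumulation idea concrete, here is a minimal Python sketch of the bookkeeping. The names WorkerTotals, record and summary are hypothetical and exist only for illustration. This is not how the patch or PostgreSQL implements anything, and the message wording is simply reused from the example above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WorkerTotals:
    """Hypothetical per-statement accumulator; not part of the patch."""

    planned: int = 0
    launched: int = 0

    def record(self, planned: int, launched: int) -> None:
        # Assumed to be called once for each parallel Gather node that starts.
        self.planned += planned
        self.launched += launched

    def summary(self) -> Optional[str]:
        # Produce one line per statement, and only if workers were missing.
        if self.launched >= self.planned:
            return None
        return (
            "LOG:  Parallel Worker draught during statement execution: "
            f"workers spawned {self.launched}, requested {self.planned}"
        )


# Three Gather nodes, each requesting 2 workers and getting none,
# as in the example query above.
totals = WorkerTotals()
for planned, launched in [(2, 0), (2, 0), (2, 0)]:
    totals.record(planned, launched)

line = totals.summary()
if line is not None:
    print(line)
```

With this shape, the example query would yield a single log entry reporting 0 spawned out of 6 requested, instead of three entries that each repeat the statement text.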

Regards,

Sami
