Re: when the startup process doesn't (logging startup delays)

Поиск
Список
Период
Сортировка
От Nitin Jadhav
Тема Re: when the startup process doesn't (logging startup delays)
Дата
Msg-id CAMm1aWYNQaa+g9c1pQjb_m-syK83t7eT4VjPVJyYSRQuVoBA6A@mail.gmail.com
обсуждение исходный текст
Ответ на Re: when the startup process doesn't (logging startup delays)  (Robert Haas <robertmhaas@gmail.com>)
Ответы Re: when the startup process doesn't (logging startup delays)  (Robert Haas <robertmhaas@gmail.com>)
Список pgsql-hackers
> That's really not what I'm complaining about. I think if we're going
> to give an example at all, 10ms is a better example than 100ms,
> because 10s is a value that people are more likely to find useful. But
> I'm not sure that it's necessary to mention a specific value, and if
> it is, I think it needs to be phrased in a less confusing way.
>
> > >>It also looks pretty silly to say that if you set the value to 10s, something
> > >>will happen every 10s. What else would anyone expect to happen?
> >
> > @Robert: that's consistent with existing documentation, even though it might
> > seem obvious and silly to us.
> >
> > | For example, if you set this to 250ms then all automatic vacuums and analyzes that run 250ms or longer will be
logged
> > | For example, if you set it to 250ms then all SQL statements that run 250ms or longer will be logged
>
> Fair enough, but I still don't like it much. I tried my hand at
> rewriting this and came up with the attached:
>
> +         Sets the amount of time after which the startup process will log
> +         a message about a long-running operation that is still in progress,
> +         as well as the interval between further progress messages for that
> +         operation. This setting is applied separately to each operation.
> +         For example, if syncing the data directory takes 25 seconds and
> +         thereafter resetting unlogged relations takes 8 seconds, and if this
> +         setting has the default value of 10 seconds, then a messages will be
> +         logged for syncing the data directory after it has been in progress
> +         for 10 seconds and again after it has been in progress for 20 seconds,
> +         but nothing will be logged for resetting unlogged operations.
> +         A setting of <literal>0</literal> disables the feature.
>
> I prefer this to Nitin's version because I think it could be unclear
> to someone that the value applies separately to each operation,
> whereas I don't think we need to document that we can't guarantee that
> the messages will be perfectly on time - that's true of every kind of
> scheduled event in pretty much every computer system - or what happens
> if the system clock goes backwards. Those are things we should try to
> get right, as well as we can anyway, but we don't need to tell the
> user that we got them right.

I thought mentioning the unit in milliseconds and the example in
seconds would confuse the user, so I changed the example to
milliseconds.The message behind the above description looks good to me
however I feel some sentences can be explained in less words. The
information related to the units is missing and I feel it should be
mentioned in the document. The example says, if the setting has the
default value of 10 seconds, then it explains the behaviour. I feel it
may not be the default value, it can be any value set by the user. So
mentioning 'default' in the example does not look good to me. I feel
we just have to mention "if this setting is set to the value of 10
seconds". Below is the modified version of the above information.

+         Sets the amount of time after every such interval the startup process
+         will log a message about a long-running operation that is still in
+         progress. This setting is applied separately to each operation.
+         For example, if syncing the data directory takes 25 seconds and
+         thereafter resetting unlogged relations takes 8 seconds, and if this
+         setting is set to the value of 10 seconds, then a messages will be
+         logged for syncing the data directory after it has been in progress
+         for 10 seconds and again after it has been in progress for 20 seconds,
+         but nothing will be logged for resetting unlogged operations.
+         A setting of <literal>0</literal> disables the feature. If this value
+         is specified without units, it is taken as milliseconds.

> Well, I see that -1 is now disallowed, and that's good as far as it
> goes, but 0 still does not actually disable the feature. I don't
> understand why you posted the previous version of the patch without
> testing that it works, and I even less understand why you are posting
> another version without fixing the bug that I pointed out to you in
> the last version.

I had added additional code to check the value of the
'log_startup_progress_interval' variable and  disable the feature in
case of -1 in the earlier versions of the patch (Specifically
v9.patch). There was a review comment for v9 patch and it resulted in
major refactoring of the patch. The comment was

> With these changes you'd have only 1 place in the code that needs to
> care about log_startup_progress_interval, as opposed to 3 as you have
> it currently, and only one place that enables the timeout, as opposed
> to 2 as you have it currently. I think that would be tidier.

Based on the above comment and the idea behind enabling the timer, it
does not log anything if the value is set to -1. So I thought there is
no extra code necessary to disable the feature even though it executes
through the code flow. So I did not worry about adding logic to
disable the feature. I will take care of this in the next patch.

The answer for the question of "I don't understand why you posted the
previous version of the patch without testing that it works" is, for
the value of -1, the above description was my understanding and for
the value of 0, the older versions of the patch was behaving as
documented. But with the later versions the behaviour got changed and
I missed to modify the documentation. So I modified it in the last
version.

Please share your thoughts.

Thanks & Regards,
Nitin Jadhav
On Mon, Sep 27, 2021 at 9:47 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Mon, Sep 27, 2021 at 9:32 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
> > >>It also looks pretty silly to say that if you set the value to 10s, something
> > >>will happen every 10s. What else would anyone expect to happen?
> >
> > @Robert: that's consistent with existing documentation, even though it might
> > seem obvious and silly to us.
> >
> > | For example, if you set this to 250ms then all automatic vacuums and analyzes that run 250ms or longer will be
logged
> > | For example, if you set it to 250ms then all SQL statements that run 250ms or longer will be logged
>
> Fair enough, but I still don't like it much. I tried my hand at
> rewriting this and came up with the attached:
>
> +         Sets the amount of time after which the startup process will log
> +         a message about a long-running operation that is still in progress,
> +         as well as the interval between further progress messages for that
> +         operation. This setting is applied separately to each operation.
> +         For example, if syncing the data directory takes 25 seconds and
> +         thereafter resetting unlogged relations takes 8 seconds, and if this
> +         setting has the default value of 10 seconds, then a messages will be
> +         logged for syncing the data directory after it has been in progress
> +         for 10 seconds and again after it has been in progress for 20 seconds,
> +         but nothing will be logged for resetting unlogged operations.
> +         A setting of <literal>0</literal> disables the feature.
>
> I prefer this to Nitin's version because I think it could be unclear
> to someone that the value applies separately to each operation,
> whereas I don't think we need to document that we can't guarantee that
> the messages will be perfectly on time - that's true of every kind of
> scheduled event in pretty much every computer system - or what happens
> if the system clock goes backwards. Those are things we should try to
> get right, as well as we can anyway, but we don't need to tell the
> user that we got them right.
>
> --
> Robert Haas
> EDB: http://www.enterprisedb.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Dilip Kumar
Дата:
Сообщение: Re: Gather performance analysis
Следующее
От: Daniel Gustafsson
Дата:
Сообщение: Re: [PATCH] Print error when libpq-refs-stamp fails