Re: Design of pg_stat_subscription_workers vs pgstats

Поиск
Список
Период
Сортировка
От Amit Kapila
Тема Re: Design of pg_stat_subscription_workers vs pgstats
Дата
Msg-id CAA4eK1K12FFF0wNV0k6UtLK9BP=S6PawWvepAAVbx7g+hb245Q@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Design of pg_stat_subscription_workers vs pgstats  ("David G. Johnston" <david.g.johnston@gmail.com>)
Ответы Re: Design of pg_stat_subscription_workers vs pgstats  (Masahiko Sawada <sawada.mshk@gmail.com>)
Список pgsql-hackers
On Fri, Jan 28, 2022 at 1:49 AM David G. Johnston
<david.g.johnston@gmail.com> wrote:
>
> On Thu, Jan 27, 2022 at 5:08 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> On Thu, Jan 27, 2022 at 11:16 AM Andres Freund <andres@anarazel.de> wrote:
>> >
>> > On 2022-01-25 20:27:07 +0900, Masahiko Sawada wrote:
>> >
>> > > There will be some challenges in a case where updating pg_subscription_rel
>> > > also failed too (what to report to the user, etc.). And moreover, we don't
>> > > want to consume space for temporary information in the system catalog.
>> >
>> > You're consuming resources in a *WAY* worse way right now. The stats file gets
>> > constantly written out, and quite often read back by backends. In contrast to
>> > parts of pg_subscription_rel or such that data can't be removed from
>> > shared_buffers under pressure.
>> >
>>
>> I don't think pg_subscription_rel is the right place to store error
>> info as the error can happen say while processing some message type
>> like BEGIN where we can't map it to pg_subscription_rel entry. There
>> could be other cases as well where we won't be able to map it to
>> pg_subscription_rel like some error related to some other table while
>> processing trigger function.
>>
>> In general, there doesn't appear to be much advantage in storing all
>> the error info in system catalogs as we don't want it to be persistent
>> (crash-safe). Also, this information is not about any system object
>> that future operations can use, so won't map from that angle as well.
>
>
> Repeating myself here to try and keep complaints regarding pg_stat_subscription_worker in one place.
>
> This is my specific email with respect to the pg_stat_scription_workers design.
>
> https://www.postgresql.org/message-id/CAKFQuwZbFuPSV1WLiNFuODst1sUZon2Qwbj8d9tT%3D38hMhJfvw%40mail.gmail.com
>
> Specifically,
>
> pg_stat_subscription_workers is defined as showing:
> "will contain one row per subscription
> worker on which errors have occurred, for workers applying logical
> replication changes and workers handling the initial data copy of the
> subscribed tables."
>
> The fact that these errors remain (last_error_*) even after they are no longer relevant is my main gripe regarding
thisfeature.  The information itself is generally useful though last_error_count is not.  These fields should
auto-clearand be named current_error_* as they exist for purposes of describing the current state of any
error-encounteringlogical replication worker so that ALTER SUBSCRIPTION SKIP, or someone other manual intervention, can
bedone with that knowledge without having to scan the subscriber's server logs. 
>

We can discuss names of columns but the main reason was that tomorrow
say we want to account for total errors not only the current error
then we have to introduce the field error_count or something like that
which will then conflict with names like current_*. Similar for
transaction abort_count. In the initial versions of the patch, we were
not using last_* for column names but similar arguments led us to
change names to last_ terminology and the same was being used in
pg_stat_archiver. But, feel free to suggest better names. Yes, I agree
with an auto-clear point as well and there seems to be an agreement
for doing it after the next successful apply and or after we skipped
the failed xact.

> This is my email trying to understand reality better in order to figure out what exactly is causing the limitations
thatare negatively impacting the design of this feature. 
>
> https://www.postgresql.org/message-id/CAKFQuwYJ7dsW%2BStsw5%2BZVoY3nwQ9j6pPt-7oYjGddH-h7uVb%2Bg%40mail.gmail.com
>
> In short, it was convenient to use the statistics collector here even if doing so resulted in a non-user friendly
(IMO)design.  Given all of the limitations to the statistics collection infrastructure, and the fact that this data is
notstatistical in the usual usage of the term, I find that to be less than satisfying. 
>

I think the failures/conflicts are also important information for
users to know, so having a view of those doesn't appear to be a bad
idea. All this data is less suitable for system catalogs like
pg_subscription_rel or others for the reasons quoted in my previous
email [1]. You have already noted one view pg_stat_archiver and we do
have failure information like checksum_failures, deadlocks in some
other views. Then, we have some information like conflicts available
via pg_stat_database_conflicts. I think the error/conflict info about
apply failures is on similar lines.

>  To the point that I'd be inclined to revert this feature and hold up the ALTER SUBSCRIPTION SET patch until a more
user-friendlydesign can be done using proper IPC techniques. 
>

If we find a different/better way and that is a conclusion then I will
do it. But, in my humble opinion, let's first discuss and see why this
is incorrect? IIUC, your main argument was to allow auto skip instead
of allowing users to fetch XID (from a view or server logs) and then
use it in the command. We are already trying to brainstorm that and
Sawada-San has already proposed a couple of ideas [2]. Also, the other
idea Andres has shared is to use LSN (of the corresponding failed
transaction) instead of XID which I find better.

[1] - https://www.postgresql.org/message-id/CAA4eK1%2BMDngbOQfMcAMsrf__s2a-MMMHaCR0zwde3GVeEi-bbQ%40mail.gmail.com
[2] - https://www.postgresql.org/message-id/CAD21AoBdEcyXKMCMws7HjcYDbbPyq_KfUbCnTX84rDeP45Hbrw%40mail.gmail.com

--
With Regards,
Amit Kapila.



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Nathan Bossart
Дата:
Сообщение: Re: Add checkpoint and redo LSN to LogCheckpointEnd log message
Следующее
От: "tanghy.fnst@fujitsu.com"
Дата:
Сообщение: RE: Support tab completion for upper character inputs in psql