Re: Failed transaction statistics to measure the logical replication progress
От | Masahiko Sawada |
---|---|
Тема | Re: Failed transaction statistics to measure the logical replication progress |
Дата | |
Msg-id | CAD21AoC_p=gQKhhGwhDLzEU9rGwCWSzTk+a89XGG1JQKn-PiKw@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: Failed transaction statistics to measure the logical replication progress (Amit Kapila <amit.kapila16@gmail.com>) |
Ответы |
Re: Failed transaction statistics to measure the logical replication progress
(Amit Kapila <amit.kapila16@gmail.com>)
Re: Failed transaction statistics to measure the logical replication progress (vignesh C <vignesh21@gmail.com>) Re: Failed transaction statistics to measure the logical replication progress (Greg Nancarrow <gregn4422@gmail.com>) RE: Failed transaction statistics to measure the logical replication progress ("osumi.takamichi@fujitsu.com" <osumi.takamichi@fujitsu.com>) |
Список | pgsql-hackers |
On Thu, Nov 18, 2021 at 12:26 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Nov 17, 2021 at 7:12 PM osumi.takamichi@fujitsu.com > <osumi.takamichi@fujitsu.com> wrote: > > > > On Wednesday, November 17, 2021 10:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > Can you please tell us why you think the names in your proposed patch are > > > better than the existing names proposed in Sawada-San's patch? Is it because > > > those fields always contain the information of the last or latest error that > > > occurred in the corresponding subscription worker? > > This is one reason. > > > > Another big reason comes from the final alignment when we list up all columns of both patches. > > The patches in this thread is trying to introduce a column that indicates > > cumulative count of error to show all error counts that the worker got in the past. > > > > Okay, I see your point and it makes sense to rename columns after > these other stats. I am not able to come up with any better names than > what is being used here. Sawada-San, do you agree with this, or do let > us know if you have any better ideas? > I'm concerned that these new names will introduce confusion; if we have last_error_relid, last_error_command, last_error_message, last_error_time, and last_error_xid, I think users might think that first_error_time is the timestamp at which an error occurred for the first time in the subscription worker. Also, I'm not sure last_error_count is not clear to me (it looks like showing something count but the only "last" one?). An alternative idea would be to add total_error_count by this patch, resulting in having both error_count and total_error_count. Regarding commit_count and abort_count, I personally think xact_commit and xact_rollback would be better since they’re more consistent with pg_stat_database view, although we might already have discussed that. Besides that, I’m not sure how useful commit_bytes, abort_bytes, and error_bytes are. I originally thought these statistics track the size of received data, i.g., how much data is transferred from the publisher and processed on the subscriber. But what the view currently has is how much memory is used in the subscription worker. The subscription worker emulates ReorderBufferChangeSize() on the subscriber side but, as the comment of update_apply_change_size() mentions, the size in the view is not accurate: + * The byte size of transaction on the publisher is calculated by + * ReorderBufferChangeSize() based on the ReorderBufferChange structure. + * But on the subscriber, consumed resources are not same as the + * publisher's decoding processsing and required to be computed in + * different way. Therefore, the exact same byte size is not restored on + * the subscriber usually. Also, it seems to take into account the size of FlushPosition that is not taken into account by ReorderBufferChangeSize(). I guess that the purpose of these values is to compare them to total_bytes, stream_byte, and spill_bytes but if the calculation is not accurate, does it mean that the more stats are updated, the more the stats will be getting inaccurate? Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
В списке pgsql-hackers по дате отправления: