Re: BUG #17524: Increase in WAL size due to logical replication with publication contain a table with low activity.

Поиск

Список

Период

Сортировка

От	Raman Kumar
Тема	Re: BUG #17524: Increase in WAL size due to logical replication with publication contain a table with low activity.
Дата	23 июня 2022 г. 06:33:09
Msg-id	CAG8FmLVDgxkyBPz4d6t9yy63EvzVjcN0Y1nzyhpyV7H3OkkEmg@mail.gmail.com обсуждение исходный текст
Ответ на	Re: BUG #17524: Increase in WAL size due to logical replication with publication contain a table with low activity. (Amit Kapila <amit.kapila16@gmail.com>)
Список	pgsql-bugs

Дерево обсуждения

We are using a debezium postgres kafka connector, it does handle the empty transaction as far as i know.

About the large transactions, no there aren't any as most of the time the disk usage spikes when the activity is really low.

I am using AWS RDS to host the postgres instance, so I don't really have an idea on how to check the data/pg_replslot directory.

In order to mitigate the issue temporarily, We have added a dummy table in the publication on which we are inserting random. This keeps the replication lag under control and avoid filling up the disk.

On Tue, Jun 21, 2022 at 7:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Jun 20, 2022 at 7:07 PM PG Bug reporting form
<noreply@postgresql.org> wrote:
>
> The following bug has been logged on the website:
>
> Bug reference: 17524
> Logged by: Raman Kumar
> Email address: raman.kumar@r-indventures.com
> PostgreSQL version: 13.6
> Operating system: Ubuntu
> Description:
>
> We are using logical replication to source the event from postgres. and
> publication includes a single table with (Insert, update, delete and
> truncate ).
> Everything works great, Until we run into a strange issue i.e. Increase is
> the disk usage, abnormally.
> On looking further we have found that some of the disk space occupy by the
> wal.
> Using this query
> ```
> Select slot_name, pg_current_wal_lsn(), confirmed_flush_lsn,
> (pg_current_wal_lsn() - confirmed_flush_lsn) as lag from
> pg_replication_slots;
> ```
> ```
> "slot_name" "pg_current_wal_lsn" "confirmed_flush_lsn" "lsn_distance"
> "poc_kafka_connect" "109C/AC001020" "109C/94909990" 393180816
> ```
>
> ```
> select usename, client_addr, pg_current_wal_lsn(), flush_lsn,
> (pg_current_wal_lsn() - flush_lsn) as lag from pg_stat_replication;
> ```
>
> ```
> "usename" "client_addr" "pg_current_wal_lsn" "flush_lsn" "lag"
> "postgres" "x.x.x.x" "109C/AC0011A0" "109C/94909990" 393181200
> ```
>
> We have notice the replication lag. But as the publication includes on the
> single table, the last changes are already replicated/published.
> So Technically there is no replication lag as per publication.
>

The lag is computed based on the difference between the WAL location
written in the publisher and the WAL location confirmed by the
subscriber as replayed. Normally for non-published data, the latest
WAL location is confirmed by the subscriber either because we send an
empty transaction (pair of BEGIN ... COMMIT commands) or via
keep-alive messages. Now, it is not clear why none of those is working
in your case? Are there any large transactions happening on the
non-published tables? You can once check data/pg_replslot directory to
confirm if there are any spill files present there?

--
With Regards,
Amit Kapila.

В списке pgsql-bugs по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: BUG #17524: Increase in WAL size due to logical replication with publication contain a table with low activity.