Обсуждение: wal recycling problem

Поиск
Список
Период
Сортировка

wal recycling problem

От
Fabrice Chapuis
Дата:
Hello,

I have a question about the automatic removal of unused WAL files. When loading data with pg_restore (200Gb) we noticed that a lot of WALs files are generated and they are not purged automatically nor recycled despite frequent checkpoints, then pg_wal folder (150Gb) fill and become out of space.
We have a cluster of 2 members (1 primary and 1 standby) with Postgres version 14.9 and 2 barman server, slots are only configured for barman, barman is version 3.7.
The archive command is desactivated (archive_command=':')
I use pg_archivecleanup (with the wal file generated from the last checkpoint in parameter) to remove files manually before the limit of 150Gb so that the restore can terminate.

Why does postgres do not this cleanup automatically, which part of the code is responsible for removing or recycling the wals?

Thanks for your help

Fabrice

Re: wal recycling problem

От
Christoph Moench-Tegeder
Дата:
## Fabrice Chapuis (fabrice636861@gmail.com):

> We have a cluster of 2 members (1 primary and 1 standby) with Postgres
> version 14.9 and 2 barman server, slots are only configured for barman,
> barman is version 3.7.

The obvious question here is: can both of those barmans keep up with
your database, or are you seeing WAL retention due to exactly these
replication slots? (Check pg_replication_slots).

Regards,
Christoph

-- 
Spare Space



Re: wal recycling problem

От
Fabrice Chapuis
Дата:
Yes, barman replication can keep up with primary, wals segments size are under max_wal_size (24Gb in our configuration)

Here is  pg_replication_slots view:

barman_ge      physical  f          t            39409 1EE2/49000000  reserved    f         
barman_be      physical  f          t            39434 1EE2/3D000000 reserved    f         

on the other hand there are 2 slots for logical replication which display status extended. I don't understand why given that the confirmed_flush_lsn field that is up to date. The restart_lsn remains frozen, for what reason?

pgoutput │ logical   │ 2667915 │ db019a00 │ f         │ t      │    1880162 │      │     68512101 │ 1ECA/37C3F1B8 │ 1EE2/4D6BDCF8       │ extended   │               │ f         │
pgoutput │ logical   │ 2668584 │ db038a00 │ f         │ t      │     363230  │      │     68512101 │ 1ECA/37C3F1B8 │ 1EE2/4D6BDCF8       │ extended   │               │ f         │

Regards
Fabrice

On Thu, Sep 28, 2023 at 7:59 PM Christoph Moench-Tegeder <cmt@burggraben.net> wrote:
## Fabrice Chapuis (fabrice636861@gmail.com):

> We have a cluster of 2 members (1 primary and 1 standby) with Postgres
> version 14.9 and 2 barman server, slots are only configured for barman,
> barman is version 3.7.

The obvious question here is: can both of those barmans keep up with
your database, or are you seeing WAL retention due to exactly these
replication slots? (Check pg_replication_slots).

Regards,
Christoph

--
Spare Space

Re: wal recycling problem

От
Christoph Moench-Tegeder
Дата:
Hi,

## Fabrice Chapuis (fabrice636861@gmail.com):

> on the other hand there are 2 slots for logical replication which display
> status extended. I don't understand why given that the confirmed_flush_lsn
> field that is up to date. The restart_lsn remains frozen, for what reason?

There you have it - "extended" means "holding wal". And as long as the
restart_lsn does not advance, checkpointer cannot free any wal beyond
that lsn. My first idea would be some long-running (or huge) transaction
which is in process (active or still being streamed). I'd recommend
looking into what the clients on these slots are doing.

Regards,
Christoph

-- 
Spare Space



Re: wal recycling problem

От
Fabrice Chapuis
Дата:
Thanks Christoph for your message.
Now I understand why the wals are preserved if logical replication is configured and enabled. The problem is that when a large volume of data is loaded into a database, for example during a pg_restore, the wal sender process associated with the logical replication slot will have to decrypt all of the wals generated during this operation which will take a long time and the restart_lsn will not be modified.
From a conceptual point of view I think that specific wals per subscription should be used and stored in the pg_replslot folder in order to avoid working directly on the wals of the instance. 

What do you think about this proposal?

Regards 

Fabrice


On Mon, Oct 2, 2023 at 12:06 PM Christoph Moench-Tegeder <cmt@burggraben.net> wrote:
Hi,

## Fabrice Chapuis (fabrice636861@gmail.com):

> on the other hand there are 2 slots for logical replication which display
> status extended. I don't understand why given that the confirmed_flush_lsn
> field that is up to date. The restart_lsn remains frozen, for what reason?

There you have it - "extended" means "holding wal". And as long as the
restart_lsn does not advance, checkpointer cannot free any wal beyond
that lsn. My first idea would be some long-running (or huge) transaction
which is in process (active or still being streamed). I'd recommend
looking into what the clients on these slots are doing.

Regards,
Christoph

--
Spare Space

Re: wal recycling problem

От
Christoph Moench-Tegeder
Дата:
## Fabrice Chapuis (fabrice636861@gmail.com):

> From a conceptual point of view I think that specific wals per subscription
> should be used and stored in the pg_replslot folder in order to avoid
> working directly on the wals of the instance.
> What do you think about this proposal?

I think that would open a wholly new can of worms.
The most obvious point here is: that WAL is primarily generated for
the operation of the database itself - it's our kind of transaction
log, or "Redo Log" in other systems' lingo. Replication (be it physical
or logical) is a secondary purpose (an obvious and important one, but
still secondary).
How would you know which part of WAL is needed for any specific
replication slot? You'd have to decode and filter it, and already
you're back at square one. How would you handle multiple replications
for the same table (in the same publication, or even over multiple
(overlapping) publications) - do you multiply the WAL?

For now, we have "any replication using replication slots, be it logical
or physical replication, retains WAL up to max_slot_wal_keep_size
(or "unlimited" if not set - and on PostgreSQL 12 and before); and you
need to monitor the state of your replication slots", which is a
totally usabe rule, I think.

Regards,
Christoph

-- 
Spare Space



Re: wal recycling problem

От
Fabrice Chapuis
Дата:
Thanks for your feedback
> How would you know which part of WAL is needed for any specific
replication slot?
change are captured for each published table and written twice,  once in the current wal and once in the slot-specific wal
> How would you handle multiple replications
for the same table 
added information about from which publication a table belongs is entered in the wal slot
> be it logical or physical replication, retains WAL up to max_slot_wal_keep_size
ok but if max_slot_wal_keep_size is exceeded the changes are lost and all of the replicated tables must be resynchronized

Regards

Fabrice

On Sun, Oct 8, 2023 at 3:57 PM Christoph Moench-Tegeder <cmt@burggraben.net> wrote:
## Fabrice Chapuis (fabrice636861@gmail.com):

> From a conceptual point of view I think that specific wals per subscription
> should be used and stored in the pg_replslot folder in order to avoid
> working directly on the wals of the instance.
> What do you think about this proposal?

I think that would open a wholly new can of worms.
The most obvious point here is: that WAL is primarily generated for
the operation of the database itself - it's our kind of transaction
log, or "Redo Log" in other systems' lingo. Replication (be it physical
or logical) is a secondary purpose (an obvious and important one, but
still secondary).
How would you know which part of WAL is needed for any specific
replication slot? You'd have to decode and filter it, and already
you're back at square one. How would you handle multiple replications
for the same table (in the same publication, or even over multiple
(overlapping) publications) - do you multiply the WAL?

For now, we have "any replication using replication slots, be it logical
or physical replication, retains WAL up to max_slot_wal_keep_size
(or "unlimited" if not set - and on PostgreSQL 12 and before); and you
need to monitor the state of your replication slots", which is a
totally usabe rule, I think.

Regards,
Christoph

--
Spare Space