Thread: BUG #17296: replication slot self-removed after created

BUG #17296: replication slot self-removed after created

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      17296
Logged by:          Anna
Email address:      podkina@gmail.com
PostgreSQL version: 12.8
Operating system:   Oracle Linux Server release 8.4
Description:

Hello!
I've run into a problem with a cluster managed by Patroni.
Cluster configuration: 3 nodes, with etcd underneath.
I tried to set up a standby cluster through a replication slot, as I had
done before with the same versions of Patroni/PostgreSQL but on a different
OS and hardware environment
(not sure whether that matters).

Anyway, I create the replication slot with this command (psql, locally on
the host):
SELECT pg_create_physical_replication_slot('patroni12_standby', true);

Immediately afterwards I see my slot in pg_replication_slots.
Then, within a few seconds, the slot is gone.
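
One way to watch the slot between creation and disappearance is to query pg_replication_slots directly. This is only a sketch: the column list is an arbitrary subset of the view, and the slot name is the one used in the command above.

```sql
-- Show the current state of the slot created above.
-- temporary = false means the slot is persistent and is expected to
-- outlive the session that created it.
-- An empty result means the slot no longer exists.
SELECT slot_name, slot_type, temporary, active, active_pid, restart_lsn
FROM pg_replication_slots
WHERE slot_name = 'patroni12_standby';
```

In psql, running \watch 1 right after the query repeats it every second, which helps narrow down the moment the slot goes away.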

On the other side (on the standby PostgreSQL) I see this in the log:
2021-11-22 13:15:35 MSK [96831]:
user=replicator,db=[unknown],app=[unknown],client=*** LOG: replication
connection authorized: user=replicator application_name=***.io
2021-11-22 13:15:35 MSK [96831]:
user=replicator,db=[unknown],app=***.io,client=*** ERROR: replication slot
"patroni12_standby" does not exist
2021-11-22 13:15:35 MSK [96831]:
user=replicator,db=[unknown],app=***.io,client=*** STATEMENT:
START_REPLICATION SLOT "patroni12_standby" 34A9/EA000000 TIMELINE 8
2021-11-22 13:15:35 MSK [96831]:
user=replicator,db=[unknown],app=***.io,client=*** LOG: disconnection:
session time: 0:00:00.049 user=replicator database= host=*** port=35772
### here I create the replication slot on the master node
2021-11-22 13:15:56 MSK [300984]:
user=replicator,db=[unknown],app=[unknown],client=*** LOG: replication
connection authorized: user=replicator application_name=***.io
2021-11-22 13:15:56 MSK [300984]:
user=replicator,db=[unknown],app=***.io,client=*** ERROR: requested WAL
segment 00000008000034A9000000EA has already been removed
2021-11-22 13:15:56 MSK [300984]:
user=replicator,db=[unknown],app=***.io,client=*** STATEMENT:
START_REPLICATION SLOT "patroni12_standby" 34A9/EA000000 TIMELINE 8
2021-11-22 13:15:56 MSK [300984]:
user=replicator,db=[unknown],app=***.io,client=*** LOG: disconnection:
session time: 0:00:00.056 user=replicator database= host=*** port=54038
### The WAL segment error doesn't matter, but it shows that the slot exists.
Then, without my doing anything, the next log entry is the "slot does not
exist" error again:
2021-11-22 13:16:01 MSK [300998]:
user=replicator,db=[unknown],app=[unknown],client=*** LOG: replication
connection authorized: user=replicator application_name=***.io
2021-11-22 13:16:01 MSK [300998]:
user=replicator,db=[unknown],app=***.io,client=*** ERROR: replication slot
"patroni12_standby" does not exist



I also tried creating the slot with no standby node ready, but the
behavior was the same: the slot exists for a short time and then
disappears.

All versions:
OS: Oracle Linux Server release 8.4
PG: PostgreSQL 12.8 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.4.1
20200928 (Red Hat 8.4.1-1), 64-bit
Orchestrator: patroni 2.1.1 / patronictl version 2.1.1 / etcd 3.4.16
PG configuration:
    autovacuum_analyze_scale_factor: 0.01
    autovacuum_vacuum_scale_factor: 0.01
    checkpoint_completion_target: 0.7
    default_statistics_target: 100
    effective_cache_size: 200GB
    effective_io_concurrency: 250
    hot_standby: 'on'
    hot_standby_feedback: 'on'
    log_autovacuum_min_duration: 0
    log_checkpoints: true
    log_connections: true
    log_disconnections: true
    log_line_prefix: '%t [%p]: user=%u,db=%d,app=%a,client=%h '
    log_lock_waits: true
    log_min_duration_statement: 1000
    log_statement: ddl
    log_temp_files: 100000
    maintenance_work_mem: 4GB
    max_connections: 1000
    max_parallel_maintenance_workers: 4
    max_parallel_workers: 12
    max_parallel_workers_per_gather: 4
    max_replication_slots: 10
    max_wal_senders: 10
    max_wal_size: 4GB
    max_worker_processes: 128
    min_wal_size: 1GB
    random_page_cost: 1.0
    seq_page_cost: 1.0
    shared_buffers: 156GB
    wal_buffers: 16MB
    wal_keep_segments: 8
    wal_level: replica
    work_mem: 128MB


I'm ready to provide any additional information.
I hope this is not a bug, but I couldn't find anything about it on the net.

Best regards,
Anna


Re: BUG #17296: replication slot self-removed after created

From
Masahiko Sawada
Date:
On Mon, Nov 22, 2021 at 10:45 PM PG Bug reporting form
<noreply@postgresql.org> wrote:
>
> The following bug has been logged on the website:
>
> Bug reference:      17296
> Logged by:          Anna
> Email address:      podkina@gmail.com
> PostgreSQL version: 12.8
> Operating system:   Oracle Linux Server release 8.4
> Description:
>
> Hello!
> I've run into a problem with a cluster managed by Patroni.
> Cluster configuration: 3 nodes, with etcd underneath.
> I tried to set up a standby cluster through a replication slot, as I had
> done before with the same versions of Patroni/PostgreSQL but on a different
> OS and hardware environment
> (not sure whether that matters).
>
> Anyway, I create the replication slot with this command (psql, locally on
> the host):
> SELECT pg_create_physical_replication_slot('patroni12_standby', true);
>
> Immediately afterwards I see my slot in pg_replication_slots.
> Then, within a few seconds, the slot is gone.

Other than temporary replication slots, as far as I know, there is no
functionality in PostgreSQL that automatically drops replication
slots. Given that you created the persistent replication slot
'patroni12_standby', it's likely that other components in your system
dropped it. It might be a good idea to set log_statement = 'all' and
log_replication_commands = 'on' in order to identify who dropped the
replication slot.
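
The two settings suggested above could be turned on roughly as follows. This is a sketch that assumes a superuser psql session on the primary. On a Patroni-managed cluster the parameters would more usually be changed through Patroni's own configuration; ALTER SYSTEM is used here only to keep the example self-contained.

```sql
-- Log every SQL statement, so that a call such as
-- SELECT pg_drop_replication_slot(...) shows up in the server log.
ALTER SYSTEM SET log_statement = 'all';

-- Log commands sent over replication connections,
-- such as DROP_REPLICATION_SLOT.
ALTER SYSTEM SET log_replication_commands = 'on';

-- Both parameters take effect on a configuration reload; no restart needed.
SELECT pg_reload_conf();

-- After reproducing the problem, revert to the previous settings:
-- ALTER SYSTEM RESET log_statement;
-- ALTER SYSTEM RESET log_replication_commands;
-- SELECT pg_reload_conf();
```

With the reported setting log_statement: ddl, a slot dropped through a SELECT call to pg_drop_replication_slot() would not be logged, hence the suggestion to use 'all'. The reported log_line_prefix already includes the user, application name and client host, so the logged statement should show where the drop came from.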

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/