[HACKERS] Logical replication: stuck spinlock at ReplicationSlotRelease
| От | Albe Laurenz | 
|---|---|
| Тема | [HACKERS] Logical replication: stuck spinlock at ReplicationSlotRelease | 
| Дата | |
| Msg-id | A737B7A37273E048B164557ADEF4A58B53A5C8E8@ntex2010i.host.magwien.gv.at обсуждение исходный текст | 
| Ответы | Re: [HACKERS] Logical replication: stuck spinlock atReplicationSlotRelease | 
| Список | pgsql-hackers | 
While playing with HEAD as of d14c85ed,
I ran into the following:
CREATE DATABASE source;
CREATE DATABASE recipient;
\c source
CREATE TABLE repli(id integer PRIMARY KEY, val text NOT NULL);
INSERT INTO repli VALUES (1, 'one');
CREATE PUBLICATION repsend FOR TABLE repli;
SELECT pg_create_logical_replication_slot('reprec', 'pgoutput');
\c recipient
CREATE TABLE repli(id integer PRIMARY KEY, val text NOT NULL);
CREATE SUBSCRIPTION reprec CONNECTION 'port=5432 dbname=source user=postgres'  PUBLICATION repsend WITH
(create_slot='false',slot_name='reprec');
 
SELECT * FROM repli;
┌────┬─────┐
│ id │ val │
├────┼─────┤
│  1 │ one │
└────┴─────┘
(1 row)
Looks good.
Now I try to produce a replication conflict:
INSERT INTO repli VALUES (2, 'block');
\c source
INSERT INTO repli VALUES (2, 'two');
After a while the server crashes, and the log looks like this:
2017-06-21 14:55:07.002 CEST [23096] ERROR:  duplicate key value violates unique constraint "repli_pkey"
2017-06-21 14:55:07.002 CEST [23096] DETAIL:  Key (id)=(2) already exists.
2017-06-21 14:55:07.003 CEST [23070] LOG:  worker process: logical replication worker for subscription 28681 (PID
23096)exited with exit code 1
 
2017-06-21 14:55:07.006 CEST [23117] LOG:  logical replication apply worker for subscription "reprec" has started
DEBUG:  received replication command: IDENTIFY_SYSTEM
DEBUG:  received replication command: START_REPLICATION SLOT "reprec" LOGICAL 0/0 (proto_version '1', publication_names
'"repsend"')
2017-06-21 14:55:07.008 CEST [23118] LOG:  starting logical decoding for slot "reprec"
2017-06-21 14:55:07.008 CEST [23118] DETAIL:  streaming transactions committing after 0/51A67B8, reading WAL from
0/51A66C8
LOG:  starting logical decoding for slot "reprec"
DETAIL:  streaming transactions committing after 0/51A67B8, reading WAL from 0/51A66C8
2017-06-21 14:55:07.008 CEST [23118] LOG:  logical decoding found consistent point at 0/51A66C8
2017-06-21 14:55:07.008 CEST [23118] DETAIL:  There are no running transactions.
DEBUG:  sending replication keepalive
LOG:  logical decoding found consistent point at 0/51A66C8
DETAIL:  There are no running transactions.
2017-06-21 14:55:07.009 CEST [23117] ERROR:  duplicate key value violates unique constraint "repli_pkey"
2017-06-21 14:55:07.009 CEST [23117] DETAIL:  Key (id)=(2) already exists.
2017-06-21 14:55:07.010 CEST [23070] LOG:  worker process: logical replication worker for subscription 28681 (PID
23117)exited with exit code 1
 
2017-06-21 14:55:12.019 CEST [23122] LOG:  logical replication apply worker for subscription "reprec" has started
DEBUG:  received replication command: IDENTIFY_SYSTEM
2017-06-21 14:55:12.021 CEST [23124] LOG:  starting logical decoding for slot "reprec"
2017-06-21 14:55:12.021 CEST [23124] DETAIL:  streaming transactions committing after 0/51A67B8, reading WAL from
0/51A66C8
DEBUG:  received replication command: START_REPLICATION SLOT "reprec" LOGICAL 0/0 (proto_version '1', publication_names
'"repsend"')
2017-06-21 14:55:12.022 CEST [23124] LOG:  logical decoding found consistent point at 0/51A66C8
2017-06-21 14:55:12.022 CEST [23124] DETAIL:  There are no running transactions.
LOG:  starting logical decoding for slot "reprec"
DETAIL:  streaming transactions committing after 0/51A67B8, reading WAL from 0/51A66C8
DEBUG:  sending replication keepalive
LOG:  logical decoding found consistent point at 0/51A66C8
DETAIL:  There are no running transactions.
2017-06-21 14:55:12.023 CEST [23122] ERROR:  duplicate key value violates unique constraint "repli_pkey"
2017-06-21 14:55:12.023 CEST [23122] DETAIL:  Key (id)=(2) already exists.
2017-06-21 14:55:12.024 CEST [23070] LOG:  worker process: logical replication worker for subscription 28681 (PID
23122)exited with exit code 1
 
2017-06-21 14:55:12.033 CEST [23124] LOG:  could not send data to client: Broken pipe
2017-06-21 14:55:12.033 CEST [23124] FATAL:  connection to client lost
2017-06-21 14:55:17.032 CEST [23133] LOG:  logical replication apply worker for subscription "reprec" has started
DEBUG:  received replication command: IDENTIFY_SYSTEM
DEBUG:  received replication command: START_REPLICATION SLOT "reprec" LOGICAL 0/0 (proto_version '1', publication_names
'"repsend"')
2017-06-21 14:57:24.552 CEST [23124] PANIC:  stuck spinlock detected at ReplicationSlotRelease, slot.c:394
2017-06-21 14:57:24.885 CEST [23070] LOG:  server process (PID 23124) was terminated by signal 6: Aborted
2017-06-21 14:57:24.885 CEST [23070] LOG:  terminating any other active server processes
2017-06-21 14:57:24.887 CEST [23134] LOG:  could not send data to client: Broken pipe
2017-06-21 14:57:24.890 CEST [23070] LOG:  all server processes terminated; reinitializing
Yours,
Laurenz Albe
		
	В списке pgsql-hackers по дате отправления: