Thread: Creating many tables gets logical replication stuck

Creating many tables gets logical replication stuck

From: Laurenz Albe
Date:
Reproducer on 12.4:

On the primary:

CREATE TABLE repli (id bigint PRIMARY KEY, val text NOT NULL);
CREATE PUBLICATION repli_pub FOR TABLE repli;

On the standby:

CREATE TABLE repli (id bigint PRIMARY KEY, val text NOT NULL);
CREATE SUBSCRIPTION repli_sub CONNECTION '...' PUBLICATION repli_pub;

Logical replication works fine.
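
A quick way to confirm that (just a sketch, assuming superuser or pg_monitor access on both sides):

-- on the primary: the WAL sender for the subscription should be "streaming"
SELECT application_name, state, sent_lsn, replay_lsn
FROM pg_stat_replication;

-- on the standby: the apply worker should report recent LSNs
SELECT subname, received_lsn, latest_end_lsn, latest_end_time
FROM pg_stat_subscription;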

Now connect to the primary and run:

BEGIN;
SELECT format('CREATE TABLE table%s (id SERIAL PRIMARY KEY, somename text UNIQUE);', foo)
FROM generate_series(1, 13200) AS foo \gexec

13200 tables are created.
Don't commit the transaction yet and verify that logical replication is still working.

Now commit the transaction; logical replication stops working.
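
One simple way to check whether changes still flow (a sketch; the probe row is arbitrary and must be run from a separate session on the primary, outside the big transaction):

-- on the primary, in a separate session
INSERT INTO repli (id, val) VALUES (1, 'probe');
-- on the standby: while the DDL transaction is still open, the row arrives promptly;
-- a row inserted after the commit never shows up
SELECT * FROM repli WHERE id = 1;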

The log on the primary has lots of:

2020-08-21 12:41:47.899 CEST [25222] LOG:  connection received: host=[local]
2020-08-21 12:41:47.900 CEST [25222] LOG:  replication connection authorized: user=postgres application_name=repli_sub
2020-08-21 12:41:47.901 CEST [25222] ERROR:  replication slot "repli_sub" is active for PID 23911

The standby has:

2020-08-21 12:41:47.897 CEST [25221] LOG:  logical replication apply worker for subscription "repli_sub" has started
2020-08-21 12:41:47.901 CEST [25221] ERROR:  could not start WAL streaming: ERROR:  replication slot "repli_sub" is active for PID 23911
2020-08-21 12:41:47.903 CEST [22901] LOG:  background worker "logical replication worker" (PID 25221) exited with exit code 1

When I try a fast shutdown on the primary server, the WAL sender just doesn't stop;
it can only be killed with SIGKILL.
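
While it is stuck, the slot and the WAL sender can be inspected like this (a sketch, run on the primary):

-- is the sender connected, and in which state?
SELECT pid, application_name, state, backend_start
FROM pg_stat_replication;

-- the slot stays active and pinned to the stuck PID
SELECT slot_name, active, active_pid, restart_lsn, confirmed_flush_lsn
FROM pg_replication_slots;

-- what the WAL sender backend is doing / waiting on
SELECT pid, state, wait_event_type, wait_event
FROM pg_stat_activity
WHERE backend_type = 'walsender';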

I set "wal_sender_timeout = 6000s" and "log_min_messages = debug3" on the primary
and "wal_receiver_timeout = 6000s" on the standby and restarted the servers.

I see messages like this for the WAL sender:

2020-08-21 13:53:34.321 CEST [33594] LOG:  connection received: host=[local]
2020-08-21 13:53:34.321 CEST [33594] DEBUG:  postmaster child[33594]: starting with (
2020-08-21 13:53:34.321 CEST [33594] DEBUG:     postgres
2020-08-21 13:53:34.321 CEST [33594] DEBUG:  )
2020-08-21 13:53:34.321 CEST [33594] DEBUG:  InitPostgres
2020-08-21 13:53:34.321 CEST [33594] LOG:  replication connection authorized: user=postgres application_name=repli_sub
2020-08-21 13:53:34.326 CEST [33594] DEBUG:  received replication command: IDENTIFY_SYSTEM
2020-08-21 13:53:34.326 CEST [33594] DEBUG:  received replication command: START_REPLICATION SLOT "repli_sub" LOGICAL 28/48643248 (proto_version '1', publication_names '"repli_pub"')
2020-08-21 13:53:34.326 CEST [33594] DEBUG:  cannot stream from 28/48643248, minimum is 28/48705180, forwarding
2020-08-21 13:53:34.326 CEST [33594] DEBUG:  find_in_dynamic_libpath: trying "/usr/pgsql-12/lib/pgoutput"
2020-08-21 13:53:34.326 CEST [33594] DEBUG:  find_in_dynamic_libpath: trying "/usr/pgsql-12/lib/pgoutput.so"
2020-08-21 13:53:34.326 CEST [33594] LOG:  starting logical decoding for slot "repli_sub"
2020-08-21 13:53:34.326 CEST [33594] DETAIL:  Streaming transactions committing after 28/48705180, reading WAL from
28/322FEDB8.
2020-08-21 13:53:34.326 CEST [33594] DEBUG:  switched to timeline 1 valid until 0/0
2020-08-21 13:53:34.326 CEST [33594] DEBUG:  sending replication keepalive
2020-08-21 13:53:34.326 CEST [33594] LOG:  logical decoding found consistent point at 28/322FEDB8
2020-08-21 13:53:34.326 CEST [33594] DETAIL:  There are no running transactions.
2020-08-21 13:53:34.326 CEST [33594] DEBUG:  write 28/48705180 flush 28/48705180 apply 28/48705180 reply_time 2020-08-21 13:53:34.326751+02
2020-08-21 13:53:34.335 CEST [33594] DEBUG:  spill 4096 changes in XID 191852506 to disk
2020-08-21 13:53:34.349 CEST [33594] DEBUG:  spill 4096 changes in XID 191852506 to disk
2020-08-21 13:53:34.362 CEST [33594] DEBUG:  spill 4096 changes in XID 191852506 to disk
2020-08-21 13:53:34.377 CEST [33594] DEBUG:  spill 4096 changes in XID 191852506 to disk
2020-08-21 13:53:34.391 CEST [33594] DEBUG:  spill 4096 changes in XID 191852506 to disk
2020-08-21 13:53:34.406 CEST [33594] DEBUG:  spill 4096 changes in XID 191852506 to disk
[...]
spill 4096 changes in XID 191852506 to disk
2020-08-21 13:53:38.388 CEST [33594] DEBUG:  spill 4096 changes in XID 191852506 to disk
2020-08-21 13:53:38.405 CEST [33594] DEBUG:  spill 4096 changes in XID 191852506 to disk
2020-08-21 13:53:38.423 CEST [33594] DEBUG:  spill 4096 changes in XID 191852506 to disk
2020-08-21 13:53:38.438 CEST [33594] DEBUG:  purged committed transactions from 0 to 0, xmin: 191852506, xmax:
191852506
2020-08-21 13:53:38.438 CEST [33594] DEBUG:  xmin: 191852506, xmax: 191852506, oldest running: 191852506, oldest xmin:
191852506
2020-08-21 13:53:38.439 CEST [33594] DEBUG:  purged committed transactions from 0 to 0, xmin: 191852506, xmax:
191852506
2020-08-21 13:53:38.439 CEST [33594] DEBUG:  xmin: 191852506, xmax: 191852506, oldest running: 191852506, oldest xmin:
191852506
2020-08-21 13:53:38.501 CEST [33594] DEBUG:  found top level transaction 191852506, with catalog changes
2020-08-21 13:53:38.501 CEST [33594] DEBUG:  adding a new snapshot to 191852506 at 28/48705180
2020-08-21 13:53:38.766 CEST [33594] DEBUG:  spill 1953 changes in XID 191852506 to disk
2020-08-21 13:54:12.207 CEST [33594] DEBUG:  restored 4096/976801 changes from disk
2020-08-21 13:54:48.343 CEST [33594] DEBUG:  restored 4096/976801 changes from disk
2020-08-21 13:55:25.801 CEST [33594] DEBUG:  restored 4096/976801 changes from disk
2020-08-21 13:56:00.470 CEST [33594] DEBUG:  restored 4096/976801 changes from disk
2020-08-21 13:56:33.711 CEST [33594] DEBUG:  restored 4096/976801 changes from disk
[...]

This just keeps going with a message roughly every 30 seconds.
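
While this goes on, confirmed_flush_lsn of the slot does not advance; the backlog can be watched with something like this (a sketch, run on the primary):

SELECT slot_name,
       pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn) AS bytes_behind,
       pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS wal_retained
FROM pg_replication_slots
WHERE slot_name = 'repli_sub';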

A typical stack trace of the WAL sender looks like this:

#0  0x00000000008c7d87 in hash_seq_search (status=status@entry=0x7ffc4e5446e0) at dynahash.c:1428
#1  0x00000000008b54c8 in RelfilenodeMapInvalidateCallback (arg=<optimized out>, relid=386381) at relfilenodemap.c:79
#2  0x00000000008a6786 in LocalExecuteInvalidationMessage (msg=0x7ff297f62b38) at inval.c:595
#3  0x0000000000750ba6 in ReorderBufferExecuteInvalidations (rb=<optimized out>, txn=<optimized out>) at
reorderbuffer.c:2149
#4  ReorderBufferCommit (rb=0x1b270a0, xid=xid@entry=191852506, commit_lsn=173014012288, end_lsn=<optimized out>, 
    commit_time=commit_time@entry=651321281503986, origin_id=origin_id@entry=0, origin_lsn=0) at reorderbuffer.c:1770
#5  0x0000000000746a0a in DecodeCommit (xid=191852506, parsed=0x7ffc4e544aa0, buf=0x7ffc4e544c50, ctx=0x1b11120) at
decode.c:640
#6  DecodeXactOp (ctx=0x1b11120, buf=buf@entry=0x7ffc4e544c50) at decode.c:248
#7  0x0000000000746e22 in LogicalDecodingProcessRecord (ctx=0x1b11120, record=0x1b11398) at decode.c:117
#8  0x00000000007581e5 in XLogSendLogical () at walsender.c:2848
#9  0x000000000075a3f3 in WalSndLoop (send_data=send_data@entry=0x758180 <XLogSendLogical>) at walsender.c:2199
#10 0x000000000075b12c in StartLogicalReplication (cmd=0x7ffc4e544d20) at walsender.c:1129
#11 exec_replication_command (
    cmd_string=cmd_string@entry=0x1a425f0 "START_REPLICATION SLOT \"repli_sub\" LOGICAL 28/48643248 (proto_version '1', publication_names '\"repli_pub\"')") at walsender.c:1545
#12 0x00000000007a72d8 in PostgresMain (argc=<optimized out>, argv=argv@entry=0x1a99298, dbname=<optimized out>,
    username=<optimized out>) at postgres.c:4243
#13 0x0000000000734ef2 in BackendRun (port=0x1a90520) at postmaster.c:4448
#14 BackendStartup (port=0x1a90520) at postmaster.c:4139
#15 ServerLoop () at postmaster.c:1704
#16 0x0000000000735df5 in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x1a3c1e0) at postmaster.c:1377
#17 0x00000000004cd6a1 in main (argc=3, argv=0x1a3c1e0) at main.c:228

perf top --no-children --call-graph=fp --dsos=/usr/pgsql-12/bin/postgres

+    8.10%  [.] hash_seq_search
+    4.19%  [.] LocalExecuteInvalidationMessage
+    3.49%  [.] hash_search_with_hash_value
+    3.37%  [.] CatCacheInvalidate
+    1.65%  [.] CallSyscacheCallbacks
+    0.93%  [.] uint32_hash
+    0.82%  [.] ReorderBufferCommit
+    0.79%  [.] hash_search
+    0.76%  [.] RelfilenodeMapInvalidateCallback
     0.63%  [.] InvalidateCatalogSnapshot
+    0.62%  [.] SysCacheInvalidate

What could be causing this?

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com




Re: Creating many tables gets logical replication stuck

From: Achilleas Mantzios
Date:
Dear Laurenz, thank you for your analysis and report.

On 21/8/20 4:00 p.m., Laurenz Albe wrote:
> Reproducer on 12.4:
This is the same problem as in this report:
https://www.postgresql.org/message-id/6fa054d8-ad14-42a2-8926-5d79c97ecd65%40matrix.gatewaynet.com
> Yours,
> Laurenz Albe


-- 
Achilleas Mantzios
IT DEV Lead
IT DEPT
Dynacom Tankers Mgmt




Re: Creating many tables gets logical replication stuck

From: Keisuke Kuroda
Date:
Hi All,

There was a similar problem in this discussion:
  Logical decoding CPU-bound w/ large number of tables

https://www.postgresql.org/message-id/flat/CAHoiPjzea6N0zuCi%3D%2Bf9v_j94nfsy6y8SU7-%3Dbp4%3D7qw6_i%3DRg%40mail.gmail.com

> RelfilenodeMapHash from 1024 entries to 64.
The above changes reduced the performance impact.

However, I think the problem that there are too
many invalidations of RelfilenodeMapHash still remains.
As you report, when many tables are created/dropped/truncated,
the walsender process can get stuck.
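
As a possible mitigation (only a sketch, and only if the DDL does not have to be atomic), running the CREATE TABLE statements outside one big transaction keeps the invalidation set that logical decoding has to replay per commit small:

-- no surrounding BEGIN/COMMIT: each generated statement commits on its own,
-- so the decoder never has to replay one huge invalidation list
SELECT format('CREATE TABLE table%s (id SERIAL PRIMARY KEY, somename text UNIQUE);', foo)
FROM generate_series(1, 13200) AS foo \gexec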

-- 
Keisuke Kuroda
NTT Software Innovation Center
keisuke.kuroda.3862@gmail.com



Re: Creating many tables gets logical replication stuck

From: Achilleas Mantzios
Date:
Hello Keisuke,
On 25/8/20 9:50 a.m., Keisuke Kuroda wrote:
> Hi All,
>
> There was a similar problem in this discussion:
>    Logical decoding CPU-bound w/ large number of tables
>
https://www.postgresql.org/message-id/flat/CAHoiPjzea6N0zuCi%3D%2Bf9v_j94nfsy6y8SU7-%3Dbp4%3D7qw6_i%3DRg%40mail.gmail.com
>
>> RelfilenodeMapHash from 1024 entries to 64.
> The above changes reduced the performance impact.
>
> However, I think the problem that there are too
> many invalidations of RelfilenodeMapHash still remains.
> As you report, when many tables are created/dropped/truncated,
Right! IIRC, a massive DROP also caused the very same problem during the latest reproduction attempts,
but I did not focus on this scenario.
> the walsender process can get stuck.
>


-- 
Achilleas Mantzios
IT DEV Lead
IT DEPT
Dynacom Tankers Mgmt