Re: [HACKERS] SUBSCRIPTION command hangs up, segfault occurs in theserver process and to cancel the execution.
От | Fujii Masao |
---|---|
Тема | Re: [HACKERS] SUBSCRIPTION command hangs up, segfault occurs in theserver process and to cancel the execution. |
Дата | |
Msg-id | CAHGQGwH=DaJXXd7gQkFA9XTzdY9jLhs3xt2-aYf=jHMhFZsL7w@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: [HACKERS] SUBSCRIPTION command hangs up, segfault occurs in theserver process and to cancel the execution. (Masahiko Sawada <sawada.mshk@gmail.com>) |
Список | pgsql-hackers |
On Wed, Feb 22, 2017 at 1:54 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote: > On Tue, Feb 21, 2017 at 7:11 AM, nuko yokohama <nuko.yokohama@gmail.com> wrote: >> Hi. >> While verifying the logical replication of PostgreSQL 10-devel, I found the >> following problem. Thanks for the problem report! This problem was also reported before. https://www.postgresql.org/message-id/A1E9CB90-1FAC-4CAD-8DBA-9AA62A6E97C5@postgrespro.ru >> * When you run the SUBSCRIPTION command against a table in the same >> PostgreSQL server, hang up. >> * Canceling the hung SUBSCRIPTION command with CTRL + C causes a server >> process occurs Segfault, and the PostgreSQL server to restart. >> >> -------- >> [Steps to Reproduce] >> >> $ createdb db1 -U postgres >> $ createdb db2 -U postgres >> $ psql -U postgres db1 -c "CREATE TABLE test (id int primary key, data >> text)" >> $ psql -U postgres db2 -c "CREATE TABLE test (id int primary key, data >> text)" >> $ psql -U postgres db1 -c "CREATE PUBLICATION pub_db1_test FOR TABLE test" >> $ psql -U postgres db2 -c "CREATE SUBSCRIPTION sub_db2_test CONNECTION >> 'dbname=db1 port=5432 user=postgres' PUBLICATION pub_db1_test" >> >> The SUBSCRIPTION command does not end, it hangs up. >> At this time, the following logs are output in the server log. >> >> 2017-02-21 06:58:05.082 JST [22060] LOG: logical decoding found initial >> starting point at 0/1C06660 >> 2017-02-21 06:58:05.082 JST [22060] DETAIL: 1 transaction needs to finish. >> >> Suspending psql (running SUBSCRIPTION) with CTRL + C causes a Segfault in >> the server process. >> At this time, the following message is output to the server log. >> >> 2017-02-21 07:01:00.246 JST [22059] ERROR: canceling statement due to user >> request >> 2017-02-21 07:01:00.246 JST [22059] STATEMENT: CREATE SUBSCRIPTION >> sub_db2_test CONNECTION 'dbname=db1 port=5432 user=postgres' PUBLICATION >> pub_db1_test >> 2017-02-21 07:01:01.006 JST [21445] LOG: server process (PID 22060) was >> terminated by signal 11: Segmentation fault >> 2017-02-21 07:01:01.007 JST [21445] LOG: terminating any other active >> server processes >> >> -------- >> [Environment] >> >> postgres=# SELECT version(); >> version >> ------------------------------------------------------------------------------------------------------------ >> PostgreSQL 10devel on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.2 >> 20140120 (Red Hat 4.8.2-16), 64-bit >> (1 row) >> >> postgres=# SHOW wal_level; >> wal_level >> ----------- >> logical >> (1 row) >> >> postgres=# SHOW max_wal_senders;; >> max_wal_senders >> ----------------- >> 10 >> (1 row) >> >> postgres=# SHOW max_replication_slots; >> max_replication_slots >> ----------------------- >> 10 >> (1 row) >> >> postgres=# TABLE pg_hba_file_rules ; >> line_number | type | database | user_name | address | >> netmask | auth_method | options | error >> -------------+-------+---------------+------------+-----------+-----------------------------------------+-------------+---------+------- >> 2 | local | {all} | {all} | | >> | trust | | >> 4 | host | {all} | {all} | 127.0.0.1 | >> 255.255.255.255 | trust | | >> 6 | host | {all} | {all} | ::1 | >> ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff | trust | | >> 9 | local | {replication} | {postgres} | | >> | trust | | >> (4 rows) >> >> > > As log message says, the reason why CREATE SUBSCRIPTION hangs up is > that it's waiting for another transaction finish. > > Regarding SEGV, it occurs by wal sender process creating replication > slot. When SEV occurs the backtrace is as follows. > > (gdb) bt > #0 0x00000000006e3d47 in resetStringInfo (str=0xea1c60) at stringinfo.c:96 > #1 0x00000000007e4738 in ProcessRepliesIfAny () at walsender.c:1347 > #2 0x00000000007e42c5 in WalSndWaitForWal (loc=23400264) at walsender.c:1142 > #3 0x00000000007e38f6 in logical_read_xlog_page (state=0x2732008, > targetPagePtr=23396352, reqLen=3912, targetRecPtr=23400240, > cur_page=0x2733d58 "\225\320\005", pageTLI=0x27328b4) at > walsender.c:717 > #4 0x00000000005458c8 in ReadPageInternal (state=0x2732008, > pageptr=23396352, reqLen=3912) at xlogreader.c:556 > #5 0x0000000000545068 in XLogReadRecord (state=0x2732008, > RecPtr=23400240, errormsg=0x7fff7cf34cf8) at xlogreader.c:255 > #6 0x00000000007d09db in DecodingContextFindStartpoint > (ctx=0x2731d48) at logical.c:432 > #7 0x00000000007e3a87 in CreateReplicationSlot (cmd=0x26edfa0) at > walsender.c:796 > #8 0x00000000007e45ae in exec_replication_command > (cmd_string=0x268b5f8 "CREATE_REPLICATION_SLOT \"sub_db2_test\" > LOGICAL pgoutput") at walsender.c:1272 > #9 0x0000000000846783 in PostgresMain (argc=1, argv=0x2697f38, > dbname=0x2697e68 "db1", username=0x2697e40 "masahiko") at > postgres.c:4064 > #10 0x00000000007b34e0 in BackendRun (port=0x268f4b0) at postmaster.c:4310 > #11 0x00000000007b2c60 in BackendStartup (port=0x268f4b0) at postmaster.c:3982 > #12 0x00000000007af2e0 in ServerLoop () at postmaster.c:1722 > #13 0x00000000007ae9a4 in PostmasterMain (argc=5, argv=0x266c020) at > postmaster.c:1330 > #14 0x00000000006f487a in main (argc=5, argv=0x266c020) at main.c:228 > > I guess that the cause of SEGV is that reply_message is reset without > initialize by initStringInfo. The reply_message and other buffers are > initialized at beggning of WalSndLoop but resetStringInfo can be > called by WalSndWaitForWal->ProcessRepliesIfAny without calling > WalSndLoop when creating replication slot. To fix it, we can > initialize reply_message in exec_replication_command like follows. Yes. I pushed the patch which makes walsender always initialize all three buffers. Regards, -- Fujii Masao
В списке pgsql-hackers по дате отправления: