(not a PG bug, please ignore) Re: PostgreSQL 11.3 hangs foreverduring DEALLOCATE statement

Поиск
Список
Период
Сортировка
От Tobias Gierke
Тема (not a PG bug, please ignore) Re: PostgreSQL 11.3 hangs foreverduring DEALLOCATE statement
Дата
Msg-id 34189fc1-f8e8-69fd-d8d4-d6dd1a7a6d19@voipfuture.com
обсуждение исходный текст
Ответ на PostgreSQL 11.3 hangs forever during DEALLOCATE statement  (Tobias Gierke <tobias.gierke@voipfuture.com>)
Список pgsql-bugs
Hi,

After more investigation it turned out that this was actually a deadlock 
across two machines caused by a lock-ordering issue.

Sorry for the noise,
Tobias


> Hi,
>
> We have a distributed system (all CentOS 7 with PG 11.3 compiled from 
> sources) that makes use of pg_fdw to forward changes to a local table 
> to a remote server.
>
> Today we noticed that one process got stuck with wait_event_type 
> 'Extension'. Attaching GDB to the process on the local server (see 
> further below) showed that the extension in question was actually 
> pg_fdw. Further inspection on the remote server showed that the 
> connection was stuck in a DEALLOCATE call that never returned.
>
> *** Local server ***
>
> Blocked statement:
>
> voip=# select 
>
pid,wait_event_type,wait_event,datid,datname,backend_start,xact_start,query_start,state_change,backend_xid,backend_xmin,backend_type,substring(query,1,50)

> from pg_stat_activity where state <> 'idle';
>   pid  | wait_event_type | wait_event | datid | datname | 
> backend_start         |          xact_start           | 
> query_start          |         state_change          | backend_xid | 
> backend_xmin |  backend_type  | substring
>
-------+-----------------+------------+-------+---------+-------------------------------+-------------------------------+-------------------------------+-------------------

>
> ------------+-------------+--------------+----------------+---------------------------------------------------- 
>
>  25301 | Extension       | Extension  | 16689 | voip    | 2019-06-11 
> 11:47:32.75992+02  | 2019-06-11 12:24:46.554381+02 | 2019-06-11 
> 12:24:46.55486+02  | 2019-06-11 12:24:46.55486+02  | 12198484 |     
> 12198483 | client backend | UPDATE importer.config SET value='f' WHERE 
> ....
>
>
> The table table updated in the query has a trigger function that 
> synchronizes the changes to a remote table:
>
>
>                      Table "importer.config"
>   Column  |  Type  | Collation | Nullable | Default
> ----------+--------+-----------+----------+------------------------------------------------- 
>
>  id       | bigint |           | not null | 
> nextval('importer.config_new_id_seq'::regclass)
>  param    | text   |           | not null |
>  value    | text   |           |          |
>  hostname | text   |           | not null |
>  filetype | text   |           | not null |
> Indexes:
>     "config_param_hostname_filetype_key" UNIQUE CONSTRAINT, btree 
> (param, hostname, filetype)
> Triggers:
>     importer_config_trigger AFTER INSERT OR DELETE OR UPDATE ON 
> importer.config FOR EACH STATEMENT EXECUTE PROCEDURE 
> notify_importer_config_change()
>     sync_importer_config AFTER INSERT OR DELETE OR UPDATE OR TRUNCATE 
> ON importer.config FOR EACH STATEMENT EXECUTE PROCEDURE 
> importer.trig_sync_importer_config()
>
> voip=# \dfS+ importer.trig_sync_importer_config()
>
> BEGIN
> DELETE FROM remote_qdc_importer.config;
> INSERT INTO remote_qdc_importer.config SELECT * FROM importer.config;
> RETURN NEW;
> END;
>
>
> Remote table definition:
>
> voip=# \d remote_qdc_importer.config;
>                   Foreign table "remote_qdc_importer.config"
>   Column  |  Type  | Collation | Nullable | Default |       FDW options
> ----------+--------+-----------+----------+---------+-------------------------- 
>
>  id       | bigint |           | not null |         | (column_name 'id')
>  param    | text   |           | not null |         | (column_name 
> 'param')
>  value    | text   |           |          |         | (column_name 
> 'value')
>  hostname | text   |           | not null |         | (column_name 
> 'hostname')
>  filetype | text   |           | not null |         | (column_name 
> 'filetype')
> Server: qdc
> FDW options: (schema_name 'importer', table_name 'config')
>
> Backtrace on the local server (PID 25301)
>
>
> #0  0x00007fe8a32cd463 in __epoll_wait_nocancel () from /lib64/libc.so.6
> (gdb) bt
> #0  0x00007fe8a32cd463 in __epoll_wait_nocancel () from /lib64/libc.so.6
> #1  0x000000000070c81e in WaitEventSetWaitBlock (nevents=1, 
> occurred_events=0x7ffd7bfc0fc0, cur_timeout=-1, set=0x24f1900) at 
> latch.c:1048
> #2  WaitEventSetWait (set=set@entry=0x24f1900, 
> timeout=timeout@entry=-1, 
> occurred_events=occurred_events@entry=0x7ffd7bfc0fc0, 
> nevents=nevents@entry=1,
>     wait_event_info=wait_event_info@entry=117440512) at latch.c:1000
> #3  0x000000000070cc57 in WaitLatchOrSocket (latch=0x7fe8967e9fc4, 
> wakeEvents=wakeEvents@entry=3, sock=26, timeout=timeout@entry=-1,
>     wait_event_info=wait_event_info@entry=117440512) at latch.c:385
> #4  0x00007fe89a9dbc03 in pgfdw_get_result (conn=0x2880170, 
> query=0x2348c90 "DELETE FROM importer.config") at connection.c:548
> #5  0x00007fe89a9d5a9e in execute_dml_stmt (node=0x24d4710, 
> node=0x24d4710) at postgres_fdw.c:3787
> #6  postgresIterateDirectModify (node=0x24d4710) at postgres_fdw.c:2553
> #7  0x000000000061fabc in ForeignNext (node=node@entry=0x24d4710) at 
> nodeForeignscan.c:52
> #8  0x00000000005fe2e5 in ExecScanFetch (recheckMtd=0x61f9f0 
> <ForeignRecheck>, accessMtd=0x61fa80 <ForeignNext>, node=0x24d4710) at 
> execScan.c:95
> #9  ExecScan (node=0x24d4710, accessMtd=0x61fa80 <ForeignNext>, 
> recheckMtd=0x61f9f0 <ForeignRecheck>) at execScan.c:162
> #10 0x00000000006186bb in ExecProcNode (node=0x24d4710) at 
> ../../../src/include/executor/executor.h:247
> #11 ExecModifyTable (pstate=0x24d3db0) at nodeModifyTable.c:2025
> #12 0x00000000005f638a in ExecProcNode (node=0x24d3db0) at 
> ../../../src/include/executor/executor.h:247
> #13 ExecutePlan (execute_once=<optimized out>, dest=0xccd060 
> <spi_printtupDR>, direction=<optimized out>, numberTuples=0, 
> sendTuples=false, operation=CMD_DELETE,
>     use_parallel_mode=<optimized out>, planstate=0x24d3db0, 
> estate=0x24d3850) at execMain.c:1723
> #14 standard_ExecutorRun (queryDesc=0x2aa39d0, direction=<optimized 
> out>, count=0, execute_once=<optimized out>) at execMain.c:364
> #15 0x00007fe89b23ff85 in pgss_ExecutorRun (queryDesc=0x2aa39d0, 
> direction=ForwardScanDirection, count=0, execute_once=<optimized out>) 
> at pg_stat_statements.c:892
> #16 0x0000000000626969 in _SPI_pquery (tcount=0, fire_triggers=true, 
> queryDesc=0x2aa39d0) at spi.c:2501
> #17 _SPI_execute_plan (plan=plan@entry=0x1e73820, 
> paramLI=paramLI@entry=0x0, snapshot=snapshot@entry=0x0, 
> crosscheck_snapshot=crosscheck_snapshot@entry=0x0,
>     read_only=read_only@entry=false, 
> fire_triggers=fire_triggers@entry=true, tcount=tcount@entry=0) at 
> spi.c:2263
> #18 0x0000000000626d68 in SPI_execute_plan_with_paramlist 
> (plan=0x1e73820, params=0x0, read_only=<optimized out>, tcount=0) at 
> spi.c:540
> #19 0x00007fe80c6b16ef in exec_stmt_execsql 
> (estate=estate@entry=0x7ffd7bfc18e0, stmt=stmt@entry=0x1e67470) at 
> pl_exec.c:4115
> #20 0x00007fe80c6b2deb in exec_stmt (stmt=0x1e67470, 
> estate=0x7ffd7bfc18e0) at pl_exec.c:1986
> #21 exec_stmts (estate=0x7ffd7bfc18e0, stmts=<optimized out>) at 
> pl_exec.c:1877
> #22 0x00007fe80c6b4f42 in exec_stmt_block 
> (estate=estate@entry=0x7ffd7bfc18e0, block=0x1e70e20) at pl_exec.c:1818
> #23 0x00007fe80c6b5690 in plpgsql_exec_trigger 
> (func=func@entry=0x1cb2b98, trigdata=0x7ffd7bfc20f0) at pl_exec.c:958
> #24 0x00007fe80c6a9382 in plpgsql_call_handler (fcinfo=0x7ffd7bfc1c70) 
> at pl_handler.c:254
> #25 0x00000000005d28c8 in ExecCallTriggerFunc 
> (trigdata=trigdata@entry=0x7ffd7bfc20f0, tgindx=tgindx@entry=1, 
> finfo=0x2bac7c0, finfo@entry=0x2bac790,
>     instr=instr@entry=0x0, 
> per_tuple_context=per_tuple_context@entry=0x27e73a0) at trigger.c:2412
> #26 0x00000000005d3f1d in AfterTriggerExecute (trigdesc=0x2bac5d0,
> trigdesc=0x2bac5d0, trig_tuple_slot2=0x0, trig_tuple_slot1=0x0, 
> per_tuple_context=0x27e73a0, instr=0x0,
>     finfo=0x2bac790, rel=0x1d1f5f8, event=0x2a6127c) at trigger.c:4359
> #27 afterTriggerInvokeEvents (events=events@entry=0x2c28790, 
> firing_id=1, estate=estate@entry=0x2bac280, 
> delete_ok=delete_ok@entry=false) at trigger.c:4550
> #28 0x00000000005d9397 in AfterTriggerEndQuery 
> (estate=estate@entry=0x2bac280) at trigger.c:4860
> #29 0x00000000005f6609 in standard_ExecutorFinish 
> (queryDesc=0x2c2f6a0) at execMain.c:439
> #30 0x00007fe89b23fe95 in pgss_ExecutorFinish (queryDesc=0x2c2f6a0) at 
> pg_stat_statements.c:915
> #31 0x000000000072e6e8 in ProcessQuery (plan=<optimized out>,
>     sourceText=0x2c2f610 "UPDATE importer.config SET value='f' WHERE 
> param='active' AND hostname = 'dc-int' AND filetype='ssa5'", 
> params=0x0, queryEnv=0x0,
>     dest=0xccd120 <donothingDR>, completionTag=0x7ffd7bfc2600 "UPDATE 
> 1") at pquery.c:205
> #32 0x000000000072e8c2 in PortalRunMulti 
> (portal=portal@entry=0x1d0f6a0, isTopLevel=isTopLevel@entry=true, 
> setHoldSnapshot=setHoldSnapshot@entry=false,
>     dest=0xccd120 <donothingDR>, dest@entry=0x1c88a70, 
> altdest=0xccd120 <donothingDR>, altdest@entry=0x1c88a70,
>     completionTag=completionTag@entry=0x7ffd7bfc2600 "UPDATE 1") at 
> pquery.c:1286
> #33 0x000000000072f38d in PortalRun (portal=portal@entry=0x1d0f6a0, 
> count=count@entry=1, isTopLevel=isTopLevel@entry=true, 
> run_once=<optimized out>,
>     dest=dest@entry=0x1c88a70, altdest=altdest@entry=0x1c88a70, 
> completionTag=completionTag@entry=0x7ffd7bfc2600 "UPDATE 1") at 
> pquery.c:799
> #34 0x000000000072ce8d in exec_execute_message (max_rows=1, 
> portal_name=0x1c88660 "") at postgres.c:2036
> #35 PostgresMain (argc=<optimized out>, argv=argv@entry=0x1cb1358, 
> dbname=0x1cb1148 "voip", username=<optimized out>) at postgres.c:4245
> #36 0x000000000047e313 in BackendRun (port=0x1caedd0) at 
> postmaster.c:4358
> #37 BackendStartup (port=0x1caedd0) at postmaster.c:4030
> #38 ServerLoop () at postmaster.c:1707
>
> *** Remote server ***
>
>
> postgres=# select 
>
pid,wait_event_type,wait_event,datid,datname,backend_start,xact_start,query_start,state_change,backend_xid,backend_xmin,backend_type,substring(query,1,50)

> from pg_stat_activity where state <> 'idle';
>   pid  | wait_event_type |  wait_event   | datid |    datname 
> |         backend_start         |          xact_start | 
> query_start          |         s
> tate_change          | backend_xid | backend_xmin |  backend_type 
> |                     substring
>
-------+-----------------+---------------+-------+---------------+-------------------------------+-------------------------------+-------------------------------+----------

>
>
---------------------+-------------+--------------+----------------+----------------------------------------------------

>
>  25536 | Lock            | transactionid | 16401 | datacollector | 
> 2019-06-11 12:24:46.567576+02 | 2019-06-11 12:24:46.607314+02 | 
> 2019-06-11 12:24:46.615009+02 | 2019-06-11 12:24:46.61501+02 | 4292464 
> |      4292463 | client backend | DELETE FROM importer.config
>  10183 | Client          | ClientRead    | 16401 | datacollector | 
> 2019-06-11 11:58:09.340121+02 | 2019-06-11 12:24:45.870476+02 | 
> 2019-06-11 12:24:46.711237+02 | 2019-06-11 12:24:46.711292+02 |     
> 4292463 |      4292463 | client backend | DEALLOCATE pgsql_fdw_prep_178
> (3 rows)
>
> postgres=# select * from pg_locks where pid=25536 and granted=false;
>    locktype    | database | relation | page | tuple | virtualxid | 
> transactionid | classid | objid | objsubid | virtualtransaction | pid  
> |   mode    | granted | fastpath
>
---------------+----------+----------+------+-------+------------+---------------+---------+-------+----------+--------------------+-------+-----------+---------+----------

>
>  transactionid |          |          |      |       | | 4292463 
> |         |       |          | 52/155384 | 25536 | ShareLock | f       
> | f
> (1 row)
>
> postgres=# select * from pg_locks where transactionid=4292463;
>    locktype    | database | relation | page | tuple | virtualxid | 
> transactionid | classid | objid | objsubid | virtualtransaction | pid  
> |     mode      | granted | fastp
> ath
>
---------------+----------+----------+------+-------+------------+---------------+---------+-------+----------+--------------------+-------+---------------+---------+------

>
> ----
>  transactionid |          |          |      |       | | 4292463 
> |         |       |          | 52/155384 | 25536 | ShareLock     | 
> f       | f
>  transactionid |          |          |      |       | | 4292463 
> |         |       |          | 56/28816 | 10183 | ExclusiveLock | 
> t       | f
> (2 rows)
>
>
> postgres=# select * from pg_locks where pid=10183;
>    locktype    | database | relation | page | tuple | virtualxid | 
> transactionid | classid | objid | objsubid | virtualtransaction | pid  
> |       mode       | granted | fa
> stpath
>
---------------+----------+----------+------+-------+------------+---------------+---------+-------+----------+--------------------+-------+------------------+---------+---

>
> -------
>  relation      |    16401 |    16874 |      |       | |               
> |         |       |          | 56/28816 | 10183 | RowExclusiveLock | 
> t       | t
>  virtualxid    |          |          |      |       | 56/28816 
> |               |         |       |          | 56/28816 | 10183 | 
> ExclusiveLock    | t       | t
>  transactionid |          |          |      |       | | 4292463 
> |         |       |          | 56/28816 | 10183 | ExclusiveLock    | 
> t       | f
>  relation      |    16401 |    16867 |      |       | |               
> |         |       |          | 56/28816 | 10183 | RowExclusiveLock | 
> t       | f
> (4 rows)
>
>
>
> (gdb) bt
> #0  0x00007f71bb99d463 in __epoll_wait_nocancel () from /lib64/libc.so.6
> #1  0x000000000070c81e in WaitEventSetWaitBlock (nevents=1, 
> occurred_events=0x7ffdde17a910, cur_timeout=-1, set=0x22b50e8) at 
> latch.c:1048
> #2  WaitEventSetWait (set=0x22b50e8, timeout=timeout@entry=-1, 
> occurred_events=occurred_events@entry=0x7ffdde17a910, 
> nevents=nevents@entry=1,
>     wait_event_info=wait_event_info@entry=100663296) at latch.c:1000
> #3  0x000000000062d4d3 in secure_read (port=0x22b2ed0, ptr=0xcda640 
> <PqRecvBuffer>, len=8192) at be-secure.c:173
> #4  0x00000000006364b8 in pq_recvbuf () at pqcomm.c:963
> #5  0x0000000000637085 in pq_getbyte () at pqcomm.c:1006
> #6  0x000000000072c4e2 in SocketBackend (inBuf=0x7ffdde17aa80) at 
> postgres.c:341
> #7  ReadCommand (inBuf=0x7ffdde17aa80) at postgres.c:514
> #8  PostgresMain (argc=<optimized out>, argv=argv@entry=0x22b5348, 
> dbname=0x22b5218 "datacollector", username=<optimized out>) at 
> postgres.c:4124
> #9  0x000000000047e313 in BackendRun (port=0x22b2ed0) at 
> postmaster.c:4358
> #10 BackendStartup (port=0x22b2ed0) at postmaster.c:4030
> #11 ServerLoop () at postmaster.c:1707
> #12 0x00000000006c3719 in PostmasterMain (argc=argc@entry=5, 
> argv=argv@entry=0x22842d0) at postmaster.c:1380
> #13 0x000000000047ed8b in main (argc=5, argv=0x22842d0) at main.c:228
>
> Any advice on how to debug this further ?
>
> Regards,
> Tobias
>
-- 
Tobias Gierke
Software Developer

Voipfuture GmbH   Wendenstr. 4   20097 Hamburg   Germany
Phone +49 40 688 9001 64   Fax +49 40 688 9001 99   www.voipfuture.com
Managing Directors   Jan Bastian   Eyal Ullert
Commercial Court AG Hamburg   HRB 109896   VAT ID DE263738086





В списке pgsql-bugs по дате отправления:

Предыдущее
От: Alvaro Herrera
Дата:
Сообщение: Re: BUG #15833: defining a comment on a domain constraint fails withwrong OID
Следующее
От: Tom Lane
Дата:
Сообщение: Re: inconsistent behaviour of json_to_record and friends with embedded json