Re: RFC: Logging plan of the running query
От | torikoshia |
---|---|
Тема | Re: RFC: Logging plan of the running query |
Дата | |
Msg-id | c8b2b9f005e3d7ed51f6d70abfd641c4@oss.nttdata.com обсуждение исходный текст |
Ответ на | Re: RFC: Logging plan of the running query (torikoshia <torikoshia@oss.nttdata.com>) |
Ответы |
Re: RFC: Logging plan of the running query
(Ekaterina Sokolova <e.sokolova@postgrespro.ru>)
|
Список | pgsql-hackers |
On 2021-09-07 12:39, torikoshia wrote: > On 2021-08-20 01:12, Fujii Masao wrote: >> On 2021/08/11 21:14, torikoshia wrote: >>> As far as I looked into, pg_log_current_plan() can call >>> InstrEndLoop() through ExplainNode(). >>> I added a flag to ExplainState to avoid calling InstrEndLoop() when >>> ExplainNode() is called from pg_log_current_plan(). >> >> Thanks for updating the patch! >> I tried to test the patch again and encountered two issues. > > Thanks for finding these issues! > >> >> (1) >> The following WITH RECURSIVE query failed with the error >> "ERROR: failed to find plan for CTE sg" when I ran >> pg_log_current_query_plan() against the backend executing that query. >> Is this a bug? >> >> create table graph0( f int, t int, label text ); >> insert into graph0 values (1, 2, 'arc 1 -> 2'),(1, 3, 'arc 1 -> >> 3'),(2, 3, 'arc 2 -> 3'),(1, 4, 'arc 1 -> 4'),(4, 5, 'arc 4 -> 5'); >> >> with recursive search_graph(f, t, label, i) as ( >> select *, 1||pg_sleep(1)::text from graph0 g >> union distinct >> select g.*,1||pg_sleep(1)::text >> from graph0 g, search_graph sg >> where g.f = sg.t >> ) search breadth first by f, t set seq >> select * from search_graph order by seq; > > This ERROR occurred without applying the patch and just calling > EXPLAIN(VERBOSE) to CTE with SEARCH BREADTH FIRST. > > I'm going to make another thread to discuss it. > >> (2) >> When I ran pg_log_current_query_plan() while "make installcheck" test >> was running, I got the following assertion failure. >> >> TRAP: FailedAssertion("!IsPageLockHeld || (locktag->locktag_type == >> LOCKTAG_RELATION_EXTEND)", File: "lock.c", Line: 894, PID: 61512) >> >> 0 postgres 0x000000010ec23557 >> ExceptionalCondition + 231 >> 1 postgres 0x000000010e9eff15 >> LockAcquireExtended + 1461 >> 2 postgres 0x000000010e9ed14d >> LockRelationOid + 61 >> 3 postgres 0x000000010e41251b >> relation_open + 91 >> 4 postgres 0x000000010e509679 table_open >> + 25 >> 5 postgres 0x000000010ebf9462 >> SearchCatCacheMiss + 274 >> 6 postgres 0x000000010ebf5979 >> SearchCatCacheInternal + 761 >> 7 postgres 0x000000010ebf566c >> SearchCatCache + 60 >> 8 postgres 0x000000010ec1a9e0 >> SearchSysCache + 144 >> 9 postgres 0x000000010ec1ae03 >> SearchSysCacheExists + 51 >> 10 postgres 0x000000010e58ce35 >> TypeIsVisible + 437 >> 11 postgres 0x000000010ea98e4c >> format_type_extended + 1964 >> 12 postgres 0x000000010ea9900e >> format_type_with_typemod + 30 >> 13 postgres 0x000000010eb78d76 >> get_const_expr + 742 >> 14 postgres 0x000000010eb79bc8 >> get_rule_expr + 232 >> 15 postgres 0x000000010eb8140f >> get_func_expr + 1247 >> 16 postgres 0x000000010eb79dcd >> get_rule_expr + 749 >> 17 postgres 0x000000010eb81688 >> get_rule_expr_paren + 136 >> 18 postgres 0x000000010eb7bf38 >> get_rule_expr + 9304 >> 19 postgres 0x000000010eb72ad5 >> deparse_expression_pretty + 149 >> 20 postgres 0x000000010eb73463 >> deparse_expression + 83 >> 21 postgres 0x000000010e68eaf1 >> show_plan_tlist + 353 >> 22 postgres 0x000000010e68adaf ExplainNode >> + 4991 >> 23 postgres 0x000000010e688b4b >> ExplainPrintPlan + 283 >> 24 postgres 0x000000010e68e1aa >> ProcessLogCurrentPlanInterrupt + 266 >> 25 postgres 0x000000010ea133bb >> ProcessInterrupts + 3435 >> 26 postgres 0x000000010e738c97 >> vacuum_delay_point + 55 >> 27 postgres 0x000000010e42bb4b >> ginInsertCleanup + 1531 >> 28 postgres 0x000000010e42d418 >> gin_clean_pending_list + 776 >> 29 postgres 0x000000010e74955a >> ExecInterpExpr + 2522 >> 30 postgres 0x000000010e7487e2 >> ExecInterpExprStillValid + 82 >> 31 postgres 0x000000010e7ae83b >> ExecEvalExprSwitchContext + 59 >> 32 postgres 0x000000010e7ae7be ExecProject >> + 78 >> 33 postgres 0x000000010e7ae4e9 ExecResult >> + 345 >> 34 postgres 0x000000010e764e42 >> ExecProcNodeFirst + 82 >> 35 postgres 0x000000010e75ccb2 >> ExecProcNode + 50 >> 36 postgres 0x000000010e758301 ExecutePlan >> + 193 >> 37 postgres 0x000000010e7581d1 >> standard_ExecutorRun + 609 >> 38 auto_explain.so 0x000000010f1df3a7 >> explain_ExecutorRun + 247 >> 39 postgres 0x000000010e757f3b ExecutorRun >> + 91 >> 40 postgres 0x000000010ea1cb49 >> PortalRunSelect + 313 >> 41 postgres 0x000000010ea1c4dd PortalRun + >> 861 >> 42 postgres 0x000000010ea17474 >> exec_simple_query + 1540 >> 43 postgres 0x000000010ea164d4 >> PostgresMain + 2580 >> 44 postgres 0x000000010e91d159 BackendRun >> + 89 >> 45 postgres 0x000000010e91c6a5 >> BackendStartup + 565 >> 46 postgres 0x000000010e91b3fe ServerLoop >> + 638 >> 47 postgres 0x000000010e918b9d >> PostmasterMain + 6717 >> 48 postgres 0x000000010e7efd43 main + 819 >> 49 libdyld.dylib 0x00007fff6a46e3d5 start + 1 >> 50 ??? 0x0000000000000003 0x0 + 3 >> >> LOG: server process (PID 61512) was terminated by signal 6: Abort >> trap: 6 >> DETAIL: Failed process was running: select >> gin_clean_pending_list('t_gin_test_tbl_i_j_idx') is not null; > > As far as I understand, since explaining plans can acquire heavyweight > lock for example to get column names, when page lock is held at the > time of the interrupt, this assertion error occurs. > > The attached patch tries to avoid this by checking each LocalLock > entry and when finding even one, giving up logging the plan. > > Thoughts? Regression tests failed on windows. Updated patch attached. Regards, -- Atsushi Torikoshi NTT DATA CORPORATION
Вложения
В списке pgsql-hackers по дате отправления: