Обсуждение: Missing CFI in hlCover()?

Поиск
Список
Период
Сортировка

Missing CFI in hlCover()?

От
Stephen Frost
Дата:
Greetings,

I'm looking into an issue that we're seeing on the PG archives server
with runaway queries that don't seem to ever want to end- and ignore
signals.

This is PG11, 11.8-1.pgdg100+1 specifically on Debian/buster and what
we're seeing is the loop in hlCover() (wparser_def.c:2071 to 2093) is
lasting an awful long time without any CFI call.  It's possible the CFI
call should actually go elsewhere, but the complete lack of any CFI in
wparser_def.c or tsvector_op.c seems a bit concerning.

I'm suspicious there's something else going on here that's causing this
to take a long time but I don't have any smoking gun as yet.

Thanks,

Stephen

Вложения

Re: Missing CFI in hlCover()?

От
Tom Lane
Дата:
Stephen Frost <sfrost@snowman.net> writes:
> I'm looking into an issue that we're seeing on the PG archives server
> with runaway queries that don't seem to ever want to end- and ignore
> signals.

> This is PG11, 11.8-1.pgdg100+1 specifically on Debian/buster and what
> we're seeing is the loop in hlCover() (wparser_def.c:2071 to 2093) is
> lasting an awful long time without any CFI call.  It's possible the CFI
> call should actually go elsewhere, but the complete lack of any CFI in
> wparser_def.c or tsvector_op.c seems a bit concerning.

> I'm suspicious there's something else going on here that's causing this
> to take a long time but I don't have any smoking gun as yet.

Hm.  I'd vote for a CFI within the recursion in TS_execute(), if there's
not one there yet.  Maybe hlFirstIndex needs one too --- if there are
a lot of words in the query, maybe that could be slow?  Did you pin the
blame down any more precisely than hlCover?

            regards, tom lane



Re: Missing CFI in hlCover()?

От
Stephen Frost
Дата:
Greetings,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > I'm looking into an issue that we're seeing on the PG archives server
> > with runaway queries that don't seem to ever want to end- and ignore
> > signals.
>
> > This is PG11, 11.8-1.pgdg100+1 specifically on Debian/buster and what
> > we're seeing is the loop in hlCover() (wparser_def.c:2071 to 2093) is
> > lasting an awful long time without any CFI call.  It's possible the CFI
> > call should actually go elsewhere, but the complete lack of any CFI in
> > wparser_def.c or tsvector_op.c seems a bit concerning.
>
> > I'm suspicious there's something else going on here that's causing this
> > to take a long time but I don't have any smoking gun as yet.
>
> Hm.  I'd vote for a CFI within the recursion in TS_execute(), if there's
> not one there yet.  Maybe hlFirstIndex needs one too --- if there are
> a lot of words in the query, maybe that could be slow?  Did you pin the
> blame down any more precisely than hlCover?

I've definitely seen hlFirstIndex take a few seconds to run (while
running this under gdb and stepping through), so that could be a good
choice to place one (perhaps even additionally to this...).  I have to
admit to wondering if we shouldn't consider having one in
check_stack_depth() to try and reduce the risk of us forgetting to have
one in sensible places, though I've not really looked at all the callers
and that might not be reasonable in some cases (though I wonder if maybe
we consider having a 'default' version that has a CFI, and an alternate
that doesn't...).

The depth of recursion for TS_execute_recurse() is actually not all that
bad either though (only 6 or so, as the query string here is:
"ERROR: The required file is not available"), so maybe that isn't really
the right thing to be thinking here.

Down in checkcondition_HL(), checkval->len is 213601, and it seems to
pretty much always end up with a result of TS_NO, but doesn't seem to
take all that long to arrive at that.

Over in hlFirstIndex():

hlFirstIndex (prs=0x7ffc2d4b16c0, prs=0x7ffc2d4b16c0, pos=219518, query=0x559619f81528) at
./build/../src/backend/tsearch/wparser_def.c:2013
2013    hlFirstIndex(HeadlineParsedText *prs, TSQuery query, int pos)
(gdb) n
2026                            if (item->type == QI_VAL &&
(gdb)
2029                            item++;
(gdb) p pos
$72 = 219518
(gdb) p prs->curwords
$73 = 583766
(gdb)

Here's a full backtrace down to the checkcondition_HL():

(gdb) i s
#0  checkcondition_HL (opaque=0x7ffc2d4b11f0, val=0x559619f815c0, data=0x0) at
./build/../src/backend/tsearch/wparser_def.c:1981
#1  0x0000559617eced2b in TS_execute_recurse (curitem=0x559619f815c0, arg=arg@entry=0x7ffc2d4b11f0,
flags=flags@entry=0,chkcond=chkcond@entry=0x559617df0120 <checkcondition_HL>) 
    at ./build/../src/backend/utils/adt/tsvector_op.c:1872
                                                                                          #2  0x0000559617ecedd1 in
TS_execute_recurse(curitem=0x559619f815a8, arg=arg@entry=0x7ffc2d4b11f0, flags=flags@entry=0,
chkcond=chkcond@entry=0x559617df0120<checkcondition_HL>) 
    at ./build/../src/backend/utils/adt/tsvector_op.c:1892
#3  0x0000559617ecedd1 in TS_execute_recurse (curitem=0x559619f81590, arg=arg@entry=0x7ffc2d4b11f0,
flags=flags@entry=0,chkcond=chkcond@entry=0x559617df0120 <checkcondition_HL>)                                     at
./build/../src/backend/utils/adt/tsvector_op.c:1892
#4  0x0000559617ecedd1 in TS_execute_recurse (curitem=0x559619f81578, arg=arg@entry=0x7ffc2d4b11f0,
flags=flags@entry=0,chkcond=chkcond@entry=0x559617df0120 <checkcondition_HL>) 
    at ./build/../src/backend/utils/adt/tsvector_op.c:1892
#5  0x0000559617ecedd1 in TS_execute_recurse (curitem=0x559619f81560, arg=arg@entry=0x7ffc2d4b11f0,
flags=flags@entry=0,chkcond=chkcond@entry=0x559617df0120 <checkcondition_HL>)                                     at
./build/../src/backend/utils/adt/tsvector_op.c:1892
#6  0x0000559617ecedd1 in TS_execute_recurse (curitem=0x559619f81548, arg=arg@entry=0x7ffc2d4b11f0,
flags=flags@entry=0,chkcond=chkcond@entry=0x559617df0120 <checkcondition_HL>) 
    at ./build/../src/backend/utils/adt/tsvector_op.c:1892
#7  0x0000559617ecedd1 in TS_execute_recurse (curitem=curitem@entry=0x559619f81530, arg=arg@entry=0x7ffc2d4b11f0,
flags=flags@entry=0,chkcond=chkcond@entry=0x559617df0120 <checkcondition_HL>) 
    at ./build/../src/backend/utils/adt/tsvector_op.c:1892
#8  0x0000559617ed26d9 in TS_execute (curitem=curitem@entry=0x559619f81530, arg=arg@entry=0x7ffc2d4b11f0,
flags=flags@entry=0,chkcond=chkcond@entry=0x559617df0120 <checkcondition_HL>) 
    at ./build/../src/backend/utils/adt/tsvector_op.c:1854
#9  0x0000559617df041e in hlCover (prs=prs@entry=0x7ffc2d4b16c0, query=query@entry=0x559619f81528,
p=p@entry=0x7ffc2d4b12a0,q=q@entry=0x7ffc2d4b12a4) at ./build/../src/backend/tsearch/wparser_def.c:2075        #10
0x0000559617df1a2din mark_hl_words (max_words=35, min_words=15, shortword=3, highlightall=<optimized out>,
query=<optimizedout>, prs=0x7ffc2d4b16c0) at ./build/../src/backend/tsearch/wparser_def.c:2393 
#11 prsd_headline (fcinfo=<optimized out>) at ./build/../src/backend/tsearch/wparser_def.c:2614
#12 0x0000559617f0cdab in FunctionCall3Coll (flinfo=flinfo@entry=0x559619fe1d90, collation=collation@entry=0,
arg1=arg1@entry=140721068381888,arg2=<optimized out>, arg3=arg3@entry=94103169144104) 
    at ./build/../src/backend/utils/fmgr/fmgr.c:1170
#13 0x0000559617def48b in ts_headline_byid_opt (fcinfo=fcinfo@entry=0x7ffc2d4b1740) at
./build/../src/backend/tsearch/wparser.c:336
#14 0x0000559617f0c484 in DirectFunctionCall4Coll (func=0x559617def380 <ts_headline_byid_opt>, collation=<optimized
out>,arg1=<optimized out>, arg2=<optimized out>, arg3=<optimized out>, arg4=<optimized out>) 
    at ./build/../src/backend/utils/fmgr/fmgr.c:877
#15 0x0000559617c8ad41 in ExecInterpExpr (state=0x559619f9b170, econtext=0x559619fd2ae0, isnull=<optimized out>) at
./build/../src/backend/executor/execExprInterp.c:678
#16 0x0000559617cb51da in ExecEvalExprSwitchContext (isNull=0x7ffc2d4b1b97, econtext=0x559619fd2ae0,
state=0x559619f9b170)at ./build/../src/include/executor/executor.h:313 
#17 ExecProject (projInfo=0x559619f9b168) at ./build/../src/include/executor/executor.h:347
#18 ExecResult (pstate=<optimized out>) at ./build/../src/backend/executor/nodeResult.c:136
#19 0x0000559617c955d9 in ExecProcNodeInstr (node=0x559619fd29d0) at ./build/../src/backend/executor/execProcnode.c:461
#20 0x0000559617cadc08 in ExecProcNode (node=0x559619fd29d0) at ./build/../src/include/executor/executor.h:247
#21 ExecLimit (pstate=0x559619fd27e0) at ./build/../src/backend/executor/nodeLimit.c:149
#22 0x0000559617c955d9 in ExecProcNodeInstr (node=0x559619fd27e0) at ./build/../src/backend/executor/execProcnode.c:461
#23 0x0000559617c8e37b in ExecProcNode (node=0x559619fd27e0) at ./build/../src/include/executor/executor.h:247
#24 ExecutePlan (execute_once=<optimized out>, dest=0x559618187460 <donothingDR>, direction=<optimized out>,
numberTuples=0,sendTuples=<optimized out>, operation=CMD_SELECT, use_parallel_mode=<optimized out>, 
    planstate=0x559619fd27e0, estate=0x559619fd25d0) at ./build/../src/backend/executor/execMain.c:1723
#25 standard_ExecutorRun (queryDesc=0x559619f92af8, direction=<optimized out>, count=0, execute_once=<optimized out>)
at./build/../src/backend/executor/execMain.c:364 
#26 0x00007f607e5e5045 in pgss_ExecutorRun (queryDesc=0x559619f92af8, direction=ForwardScanDirection, count=0,
execute_once=<optimizedout>) at ./build/../contrib/pg_stat_statements/pg_stat_statements.c:892 
#27 0x0000559617c251d6 in ExplainOnePlan (plannedstmt=plannedstmt@entry=0x559619f937e8, into=into@entry=0x0,
es=es@entry=0x559619fda528,
    queryString=queryString@entry=0x559619ea3e80 "explain (analyze, buffers) SELECT messageid, date, subject, _from,
ts_rank_cd(fti,plainto_tsquery('public.pg', 'ERROR: The required file is not available')), ts 
_headline(bodytxt, plainto_tsquery('pub"..., params=params@entry=0x0, queryEnv=<optimized out>,
planduration=0x7ffc2d4b1e70)at ./build/../src/backend/commands/explain.c:536 
#28 0x0000559617c254ee in ExplainOneQuery (queryEnv=<optimized out>, params=0x0,
    queryString=0x559619ea3e80 "explain (analyze, buffers) SELECT messageid, date, subject, _from, ts_rank_cd(fti,
plainto_tsquery('public.pg','ERROR: The required file is not available')), ts_headline(bodytxt, 
 plainto_tsquery('pub"..., es=0x559619fda528, into=0x0, cursorOptions=<optimized out>, query=<optimized out>) at
./build/../src/backend/commands/explain.c:372
#29 ExplainOneQuery (query=<optimized out>, cursorOptions=<optimized out>, into=0x0, es=0x559619fda528,
    queryString=0x559619ea3e80 "explain (analyze, buffers) SELECT messageid, date, subject, _from, ts_rank_cd(fti,
plainto_tsquery('public.pg','ERROR: The required file is not available')), ts_headline(bodytxt, 
 plainto_tsquery('pub"..., params=0x0, queryEnv=0x0) at ./build/../src/backend/commands/explain.c:340
#30 0x0000559617c25a2d in ExplainQuery (pstate=pstate@entry=0x559619f8c0e8, stmt=stmt@entry=0x55961a045650,
    queryString=queryString@entry=0x559619ea3e80 "explain (analyze, buffers) SELECT messageid, date, subject, _from,
ts_rank_cd(fti,plainto_tsquery('public.pg', 'ERROR: The required file is not available')), ts 
_headline(bodytxt, plainto_tsquery('pub"..., params=params@entry=0x0, queryEnv=queryEnv@entry=0x0,
dest=dest@entry=0x559619f8c058)at ./build/../src/backend/commands/explain.c:255 
#31 0x0000559617deb94b in standard_ProcessUtility (pstmt=pstmt@entry=0x55961a045700,
    queryString=queryString@entry=0x559619ea3e80 "explain (analyze, buffers) SELECT messageid, date, subject, _from,
ts_rank_cd(fti,plainto_tsquery('public.pg', 'ERROR: The required file is not available')), ts 
_headline(bodytxt, plainto_tsquery('pub"..., context=context@entry=PROCESS_UTILITY_TOPLEVEL, params=params@entry=0x0,
queryEnv=queryEnv@entry=0x0,dest=dest@entry=0x559619f8c058, 
    completionTag=0x7ffc2d4b22b0 "") at ./build/../src/backend/tcop/utility.c:675
#32 0x00007f607e5e7347 in pgss_ProcessUtility (pstmt=0x55961a045700,
    queryString=0x559619ea3e80 "explain (analyze, buffers) SELECT messageid, date, subject, _from, ts_rank_cd(fti,
plainto_tsquery('public.pg','ERROR: The required file is not available')), ts_headline(bodytxt, 
 plainto_tsquery('pub"..., context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x559619f8c058,
completionTag=0x7ffc2d4b22b0"") at ./build/../contrib/pg_stat_statements/pg_stat_statements.c:1005 
#33 0x0000559617de83d9 in PortalRunUtility (portal=0x559619f33de0, pstmt=0x55961a045700, isTopLevel=<optimized out>,
setHoldSnapshot=<optimizedout>, dest=0x559619f8c058, completionTag=0x7ffc2d4b22b0 "") 
    at ./build/../src/backend/tcop/pquery.c:1178
#34 0x0000559617de91ea in FillPortalStore (portal=0x559619f33de0, isTopLevel=<optimized out>) at
./build/../src/include/nodes/pg_list.h:79
#35 0x0000559617de9dc7 in PortalRun (portal=portal@entry=0x559619f33de0, count=count@entry=9223372036854775807,
isTopLevel=isTopLevel@entry=true,run_once=run_once@entry=true, dest=dest@entry=0x559619fd7c08, 
    altdest=altdest@entry=0x559619fd7c08, completionTag=0x7ffc2d4b24f0 "") at ./build/../src/backend/tcop/pquery.c:768
#36 0x0000559617de5a3e in exec_simple_query (
    query_string=0x559619ea3e80 "explain (analyze, buffers) SELECT messageid, date, subject, _from, ts_rank_cd(fti,
plainto_tsquery('public.pg','ERROR: The required file is not available')), ts_headline(bodytxt 
, plainto_tsquery('pub"...) at ./build/../src/backend/tcop/postgres.c:1145
#37 0x0000559617de72a6 in PostgresMain (argc=<optimized out>, argv=argv@entry=0x559619efa418, dbname=<optimized out>,
username=<optimizedout>) at ./build/../src/backend/tcop/postgres.c:4193 
#38 0x0000559617d709e2 in BackendRun (port=0x559619eee4c0) at ./build/../src/backend/postmaster/postmaster.c:4364
#39 BackendStartup (port=0x559619eee4c0) at ./build/../src/backend/postmaster/postmaster.c:4036
#40 ServerLoop () at ./build/../src/backend/postmaster/postmaster.c:1707
#41 0x0000559617d71886 in PostmasterMain (argc=5, argv=0x559619e9e930) at
./build/../src/backend/postmaster/postmaster.c:1380
#42 0x0000559617aecdc9 in main (argc=5, argv=0x559619e9e930) at ./build/../src/backend/main/main.c:228

and down to the hlFirstIndex():

#0  hlFirstIndex (prs=0x7ffc2d4b16c0, prs=0x7ffc2d4b16c0, pos=219518, query=0x559619f81528) at
./build/../src/backend/tsearch/wparser_def.c:2029
#1  hlCover (prs=prs@entry=0x7ffc2d4b16c0, query=query@entry=0x559619f81528, p=p@entry=0x7ffc2d4b12a0,
q=q@entry=0x7ffc2d4b12a4)at ./build/../src/backend/tsearch/wparser_def.c:2083 
#2  0x0000559617df1a2d in mark_hl_words (max_words=35, min_words=15, shortword=3, highlightall=<optimized out>,
query=<optimizedout>, prs=0x7ffc2d4b16c0) at ./build/../src/backend/tsearch/wparser_def.c:2393 
#3  prsd_headline (fcinfo=<optimized out>) at ./build/../src/backend/tsearch/wparser_def.c:2614
#4  0x0000559617f0cdab in FunctionCall3Coll (flinfo=flinfo@entry=0x559619fe1d90, collation=collation@entry=0,
arg1=arg1@entry=140721068381888,arg2=<optimized out>, arg3=arg3@entry=94103169144104) 
    at ./build/../src/backend/utils/fmgr/fmgr.c:1170
#5  0x0000559617def48b in ts_headline_byid_opt (fcinfo=fcinfo@entry=0x7ffc2d4b1740) at
./build/../src/backend/tsearch/wparser.c:336
#6  0x0000559617f0c484 in DirectFunctionCall4Coll (func=0x559617def380 <ts_headline_byid_opt>, collation=<optimized
out>,arg1=<optimized out>, arg2=<optimized out>, arg3=<optimized out>, arg4=<optimized out>) 
    at ./build/../src/backend/utils/fmgr/fmgr.c:877
#7  0x0000559617c8ad41 in ExecInterpExpr (state=0x559619f9b170, econtext=0x559619fd2ae0, isnull=<optimized out>) at
./build/../src/backend/executor/execExprInterp.c:678
#8  0x0000559617cb51da in ExecEvalExprSwitchContext (isNull=0x7ffc2d4b1b97, econtext=0x559619fd2ae0,
state=0x559619f9b170)at ./build/../src/include/executor/executor.h:313 
#9  ExecProject (projInfo=0x559619f9b168) at ./build/../src/include/executor/executor.h:347
#10 ExecResult (pstate=<optimized out>) at ./build/../src/backend/executor/nodeResult.c:136
#11 0x0000559617c955d9 in ExecProcNodeInstr (node=0x559619fd29d0) at ./build/../src/backend/executor/execProcnode.c:461
#12 0x0000559617cadc08 in ExecProcNode (node=0x559619fd29d0) at ./build/../src/include/executor/executor.h:247
#13 ExecLimit (pstate=0x559619fd27e0) at ./build/../src/backend/executor/nodeLimit.c:149
#14 0x0000559617c955d9 in ExecProcNodeInstr (node=0x559619fd27e0) at ./build/../src/backend/executor/execProcnode.c:461
#15 0x0000559617c8e37b in ExecProcNode (node=0x559619fd27e0) at ./build/../src/include/executor/executor.h:247
#16 ExecutePlan (execute_once=<optimized out>, dest=0x559618187460 <donothingDR>, direction=<optimized out>,
numberTuples=0,sendTuples=<optimized out>, operation=CMD_SELECT, use_parallel_mode=<optimized out>, 
    planstate=0x559619fd27e0, estate=0x559619fd25d0) at ./build/../src/backend/executor/execMain.c:1723
#17 standard_ExecutorRun (queryDesc=0x559619f92af8, direction=<optimized out>, count=0, execute_once=<optimized out>)
at./build/../src/backend/executor/execMain.c:364 
#18 0x00007f607e5e5045 in pgss_ExecutorRun (queryDesc=0x559619f92af8, direction=ForwardScanDirection, count=0,
execute_once=<optimizedout>) at ./build/../contrib/pg_stat_statements/pg_stat_statements.c:892 
#19 0x0000559617c251d6 in ExplainOnePlan (plannedstmt=plannedstmt@entry=0x559619f937e8, into=into@entry=0x0,
es=es@entry=0x559619fda528,
    queryString=queryString@entry=0x559619ea3e80 "explain (analyze, buffers) SELECT messageid, date, subject, _from,
ts_rank_cd(fti,plainto_tsquery('public.pg', 'ERROR: The required file is not available')), ts 
_headline(bodytxt, plainto_tsquery('pub"..., params=params@entry=0x0, queryEnv=<optimized out>,
planduration=0x7ffc2d4b1e70)at ./build/../src/backend/commands/explain.c:536 
#20 0x0000559617c254ee in ExplainOneQuery (queryEnv=<optimized out>, params=0x0,
    queryString=0x559619ea3e80 "explain (analyze, buffers) SELECT messageid, date, subject, _from, ts_rank_cd(fti,
plainto_tsquery('public.pg','ERROR: The required file is not available')), ts_headline(bodytxt, 
 plainto_tsquery('pub"..., es=0x559619fda528, into=0x0, cursorOptions=<optimized out>, query=<optimized out>) at
./build/../src/backend/commands/explain.c:372
#21 ExplainOneQuery (query=<optimized out>, cursorOptions=<optimized out>, into=0x0, es=0x559619fda528,
    queryString=0x559619ea3e80 "explain (analyze, buffers) SELECT messageid, date, subject, _from, ts_rank_cd(fti,
plainto_tsquery('public.pg','ERROR: The required file is not available')), ts_headline(bodytxt, 
 plainto_tsquery('pub"..., params=0x0, queryEnv=0x0) at ./build/../src/backend/commands/explain.c:340
#22 0x0000559617c25a2d in ExplainQuery (pstate=pstate@entry=0x559619f8c0e8, stmt=stmt@entry=0x55961a045650,
    queryString=queryString@entry=0x559619ea3e80 "explain (analyze, buffers) SELECT messageid, date, subject, _from,
ts_rank_cd(fti,plainto_tsquery('public.pg', 'ERROR: The required file is not available')), ts 
_headline(bodytxt, plainto_tsquery('pub"..., params=params@entry=0x0, queryEnv=queryEnv@entry=0x0,
dest=dest@entry=0x559619f8c058)at ./build/../src/backend/commands/explain.c:255 
#23 0x0000559617deb94b in standard_ProcessUtility (pstmt=pstmt@entry=0x55961a045700,
    queryString=queryString@entry=0x559619ea3e80 "explain (analyze, buffers) SELECT messageid, date, subject, _from,
ts_rank_cd(fti,plainto_tsquery('public.pg', 'ERROR: The required file is not available')), ts 
_headline(bodytxt, plainto_tsquery('pub"..., context=context@entry=PROCESS_UTILITY_TOPLEVEL, params=params@entry=0x0,
queryEnv=queryEnv@entry=0x0,dest=dest@entry=0x559619f8c058, 
    completionTag=0x7ffc2d4b22b0 "") at ./build/../src/backend/tcop/utility.c:675
#24 0x00007f607e5e7347 in pgss_ProcessUtility (pstmt=0x55961a045700,
    queryString=0x559619ea3e80 "explain (analyze, buffers) SELECT messageid, date, subject, _from, ts_rank_cd(fti,
plainto_tsquery('public.pg','ERROR: The required file is not available')), ts_headline(bodytxt, 
 plainto_tsquery('pub"..., context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x559619f8c058,
completionTag=0x7ffc2d4b22b0"") at ./build/../contrib/pg_stat_statements/pg_stat_statements.c:1005 
#25 0x0000559617de83d9 in PortalRunUtility (portal=0x559619f33de0, pstmt=0x55961a045700, isTopLevel=<optimized out>,
setHoldSnapshot=<optimizedout>, dest=0x559619f8c058, completionTag=0x7ffc2d4b22b0 "") 
    at ./build/../src/backend/tcop/pquery.c:1178
#26 0x0000559617de91ea in FillPortalStore (portal=0x559619f33de0, isTopLevel=<optimized out>) at
./build/../src/include/nodes/pg_list.h:79
#27 0x0000559617de9dc7 in PortalRun (portal=portal@entry=0x559619f33de0, count=count@entry=9223372036854775807,
isTopLevel=isTopLevel@entry=true,run_once=run_once@entry=true, dest=dest@entry=0x559619fd7c08, 
    altdest=altdest@entry=0x559619fd7c08, completionTag=0x7ffc2d4b24f0 "") at ./build/../src/backend/tcop/pquery.c:768
#28 0x0000559617de5a3e in exec_simple_query (
    query_string=0x559619ea3e80 "explain (analyze, buffers) SELECT messageid, date, subject, _from, ts_rank_cd(fti,
plainto_tsquery('public.pg','ERROR: The required file is not available')), ts_headline(bodytxt 
, plainto_tsquery('pub"...) at ./build/../src/backend/tcop/postgres.c:1145
#29 0x0000559617de72a6 in PostgresMain (argc=<optimized out>, argv=argv@entry=0x559619efa418, dbname=<optimized out>,
username=<optimizedout>) at ./build/../src/backend/tcop/postgres.c:4193 
#30 0x0000559617d709e2 in BackendRun (port=0x559619eee4c0) at ./build/../src/backend/postmaster/postmaster.c:4364
#31 BackendStartup (port=0x559619eee4c0) at ./build/../src/backend/postmaster/postmaster.c:4036
#32 ServerLoop () at ./build/../src/backend/postmaster/postmaster.c:1707
#33 0x0000559617d71886 in PostmasterMain (argc=5, argv=0x559619e9e930) at
./build/../src/backend/postmaster/postmaster.c:1380
#34 0x0000559617aecdc9 in main (argc=5, argv=0x559619e9e930) at ./build/../src/backend/main/main.c:228

Thanks!

Stephen

Вложения

Re: Missing CFI in hlCover()?

От
Tom Lane
Дата:
Stephen Frost <sfrost@snowman.net> writes:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> Hm.  I'd vote for a CFI within the recursion in TS_execute(), if there's
>> not one there yet.  Maybe hlFirstIndex needs one too --- if there are
>> a lot of words in the query, maybe that could be slow?  Did you pin the
>> blame down any more precisely than hlCover?

> I've definitely seen hlFirstIndex take a few seconds to run (while
> running this under gdb and stepping through), so that could be a good
> choice to place one (perhaps even additionally to this...).  I have to
> admit to wondering if we shouldn't consider having one in
> check_stack_depth() to try and reduce the risk of us forgetting to have
> one in sensible places, though I've not really looked at all the callers
> and that might not be reasonable in some cases (though I wonder if maybe
> we consider having a 'default' version that has a CFI, and an alternate
> that doesn't...).

Adding it to check_stack_depth doesn't really seem like a reasonable
proposal to me; aside from failing to separate concerns, running a
long time is quite distinct from taking a lot of stack.

The reason I'm eyeing TS_execute is that it involves callbacks to
functions that might be pretty complex in themselves, eg during index
scans.  So that would guard a lot of territory besides hlCover.  But
hlFirstIndex could use a CFI too, if you've seen it take that long.
(I wonder if we need to try to make it faster.  I'd supposed that the
loop was cheap enough to be a non-problem, but with large enough
documents maybe not?  It seems like converting to a hash table could
be worthwhile for a large doc.)

            regards, tom lane



Re: Missing CFI in hlCover()?

От
Tom Lane
Дата:
I tried to duplicate a multiple-second ts_headline call here, and
failed to, so there must be something I'm missing.  Can you provide
a concrete example?  I'd like to do some analysis with perf.

            regards, tom lane



Re: Missing CFI in hlCover()?

От
Tom Lane
Дата:
I wrote:
> (I wonder if we need to try to make it faster.  I'd supposed that the
> loop was cheap enough to be a non-problem, but with large enough
> documents maybe not?  It seems like converting to a hash table could
> be worthwhile for a large doc.)

OK, I dug into Stephen's test case off-list.  While some CFIs would
be a good idea, that's just band-aid'ing the symptom.  The actual
problem is that hlCover() is taking way too much time.  The test case
boils down to "common_word & rare_word" matched to a very long document,
wherein the rare_word appears only near the front.  Once we're past
that match, hlCover() tries all the remaining matches for common_word,
of which there are plenty ... and for each one, it scans clear to the
end of the document, looking vainly for a substring that will satisfy
the "common_word & rare_word" query.  So obviously, this is O(N^2)
in the length of the document :-(.

I have to suppose that I introduced this problem in c9b0c678d, since
AFAIR we weren't getting ts_headline-takes-forever complaints before
that.  It's not immediately obvious why the preceding algorithm doesn't
have a similar issue, but then there's not anything at all that was
obvious about the preceding algorithm.

The most obvious way to fix the O(N^2) hazard is to put a limit on the
length of "cover" (matching substring) that we'll consider.  For the
mark_hl_words caller, we won't highlight more than max_words words
anyway, so it would be reasonable to bound covers to that length or
some small multiple of it.  The other caller mark_hl_fragments is
willing to highlight up to max_fragments of up to max_words each, and
there can be some daylight between the fragments, so it's not quite
clear what the longest reasonable match is.  Still, I doubt it's
useful to show a headline consisting of a few words from the start of
the document and a few words from thousands of words later, so a limit
of max_fragments times max_words times something would probably be
reasonable.

We could hard-code a rule like that, or we could introduce a new
explicit parameter for the maximum cover length.  The latter would be
more flexible, but we need something back-patchable and I'm concerned
about the compatibility hazards of adding a new parameter in minor
releases.  So on the whole I propose hard-wiring a multiplier of,
say, 10 for both these cases.

            regards, tom lane



Re: Missing CFI in hlCover()?

От
Stephen Frost
Дата:
Greetings,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> We could hard-code a rule like that, or we could introduce a new
> explicit parameter for the maximum cover length.  The latter would be
> more flexible, but we need something back-patchable and I'm concerned
> about the compatibility hazards of adding a new parameter in minor
> releases.  So on the whole I propose hard-wiring a multiplier of,
> say, 10 for both these cases.

That sounds alright to me, though I do think we should probably still
toss a CFI (or two) in this path somewhere as we don't know how long
some of these functions might take...

Thanks,

Stephen

Вложения

Re: Missing CFI in hlCover()?

От
Tom Lane
Дата:
Stephen Frost <sfrost@snowman.net> writes:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> We could hard-code a rule like that, or we could introduce a new
>> explicit parameter for the maximum cover length.  The latter would be
>> more flexible, but we need something back-patchable and I'm concerned
>> about the compatibility hazards of adding a new parameter in minor
>> releases.  So on the whole I propose hard-wiring a multiplier of,
>> say, 10 for both these cases.

> That sounds alright to me, though I do think we should probably still
> toss a CFI (or two) in this path somewhere as we don't know how long
> some of these functions might take...

Yeah, of course.  I'm still leaning to doing that in TS_execute_recurse.

            regards, tom lane



Re: Missing CFI in hlCover()?

От
Stephen Frost
Дата:
Greetings,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > * Tom Lane (tgl@sss.pgh.pa.us) wrote:
> >> We could hard-code a rule like that, or we could introduce a new
> >> explicit parameter for the maximum cover length.  The latter would be
> >> more flexible, but we need something back-patchable and I'm concerned
> >> about the compatibility hazards of adding a new parameter in minor
> >> releases.  So on the whole I propose hard-wiring a multiplier of,
> >> say, 10 for both these cases.
>
> > That sounds alright to me, though I do think we should probably still
> > toss a CFI (or two) in this path somewhere as we don't know how long
> > some of these functions might take...
>
> Yeah, of course.  I'm still leaning to doing that in TS_execute_recurse.

Works for me.

Thanks!

Stephen

Вложения

Re: Missing CFI in hlCover()?

От
Tom Lane
Дата:
Stephen Frost <sfrost@snowman.net> writes:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> Stephen Frost <sfrost@snowman.net> writes:
>>> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>>>> We could hard-code a rule like that, or we could introduce a new
>>>> explicit parameter for the maximum cover length.  The latter would be
>>>> more flexible, but we need something back-patchable and I'm concerned
>>>> about the compatibility hazards of adding a new parameter in minor
>>>> releases.  So on the whole I propose hard-wiring a multiplier of,
>>>> say, 10 for both these cases.

>>> That sounds alright to me, though I do think we should probably still
>>> toss a CFI (or two) in this path somewhere as we don't know how long
>>> some of these functions might take...

>> Yeah, of course.  I'm still leaning to doing that in TS_execute_recurse.

> Works for me.

Here's a proposed patch along that line.

            regards, tom lane

diff --git a/src/backend/tsearch/wparser_def.c b/src/backend/tsearch/wparser_def.c
index 76b6f9aef0..1df1c0362d 100644
--- a/src/backend/tsearch/wparser_def.c
+++ b/src/backend/tsearch/wparser_def.c
@@ -2028,8 +2028,10 @@ hlFirstIndex(HeadlineParsedText *prs, TSQuery query, int pos)
 /*
  * hlCover: try to find a substring of prs' word list that satisfies query
  *
- * At entry, *p must be the first word index to consider (initialize this to
- * zero, or to the next index after a previous successful search).
+ * At entry, *p must be the first word index to consider (initialize this
+ * to zero, or to the next index after a previous successful search).
+ * We will consider all substrings starting at or after that word, and
+ * containing no more than max_cover words.
  *
  * On success, sets *p to first word index and *q to last word index of the
  * cover substring, and returns true.
@@ -2038,7 +2040,8 @@ hlFirstIndex(HeadlineParsedText *prs, TSQuery query, int pos)
  * words used in the query.
  */
 static bool
-hlCover(HeadlineParsedText *prs, TSQuery query, int *p, int *q)
+hlCover(HeadlineParsedText *prs, TSQuery query, int max_cover,
+        int *p, int *q)
 {
     int            pmin,
                 pmax,
@@ -2084,7 +2087,7 @@ hlCover(HeadlineParsedText *prs, TSQuery query, int *p, int *q)
                 nextpmin = nextpmax;
             pmax = nextpmax;
         }
-        while (pmax >= 0);
+        while (pmax >= 0 && pmax - pmin < max_cover);
         /* No luck here, so try next feasible startpoint */
         pmin = nextpmin;
     }
@@ -2186,7 +2189,7 @@ get_next_fragment(HeadlineParsedText *prs, int *startpos, int *endpos,
 static void
 mark_hl_fragments(HeadlineParsedText *prs, TSQuery query, bool highlightall,
                   int shortword, int min_words,
-                  int max_words, int max_fragments)
+                  int max_words, int max_fragments, int max_cover)
 {
     int32        poslen,
                 curlen,
@@ -2213,7 +2216,7 @@ mark_hl_fragments(HeadlineParsedText *prs, TSQuery query, bool highlightall,
     covers = palloc(maxcovers * sizeof(CoverPos));

     /* get all covers */
-    while (hlCover(prs, query, &p, &q))
+    while (hlCover(prs, query, max_cover, &p, &q))
     {
         startpos = p;
         endpos = q;
@@ -2368,7 +2371,7 @@ mark_hl_fragments(HeadlineParsedText *prs, TSQuery query, bool highlightall,
  */
 static void
 mark_hl_words(HeadlineParsedText *prs, TSQuery query, bool highlightall,
-              int shortword, int min_words, int max_words)
+              int shortword, int min_words, int max_words, int max_cover)
 {
     int            p = 0,
                 q = 0;
@@ -2386,7 +2389,7 @@ mark_hl_words(HeadlineParsedText *prs, TSQuery query, bool highlightall,
     if (!highlightall)
     {
         /* examine all covers, select a headline using the best one */
-        while (hlCover(prs, query, &p, &q))
+        while (hlCover(prs, query, max_cover, &p, &q))
         {
             /*
              * Count words (curlen) and interesting words (poslen) within
@@ -2542,6 +2545,7 @@ prsd_headline(PG_FUNCTION_ARGS)
     int            shortword = 3;
     int            max_fragments = 0;
     bool        highlightall = false;
+    int            max_cover;
     ListCell   *l;

     /* Extract configuration option values */
@@ -2581,6 +2585,15 @@ prsd_headline(PG_FUNCTION_ARGS)
                             defel->defname)));
     }

+    /*
+     * We might eventually make max_cover a user-settable parameter, but for
+     * now, just compute a reasonable value based on max_words and
+     * max_fragments.
+     */
+    max_cover = Max(max_words * 10, 100);
+    if (max_fragments > 0)
+        max_cover *= max_fragments;
+
     /* in HighlightAll mode these parameters are ignored */
     if (!highlightall)
     {
@@ -2605,10 +2618,10 @@ prsd_headline(PG_FUNCTION_ARGS)
     /* Apply appropriate headline selector */
     if (max_fragments == 0)
         mark_hl_words(prs, query, highlightall, shortword,
-                      min_words, max_words);
+                      min_words, max_words, max_cover);
     else
         mark_hl_fragments(prs, query, highlightall, shortword,
-                          min_words, max_words, max_fragments);
+                          min_words, max_words, max_fragments, max_cover);

     /* Fill in default values for string options */
     if (!prs->startsel)
diff --git a/src/backend/utils/adt/tsvector_op.c b/src/backend/utils/adt/tsvector_op.c
index f01b1ee253..756a48a167 100644
--- a/src/backend/utils/adt/tsvector_op.c
+++ b/src/backend/utils/adt/tsvector_op.c
@@ -1868,6 +1868,9 @@ TS_execute_recurse(QueryItem *curitem, void *arg, uint32 flags,
     /* since this function recurses, it could be driven to stack overflow */
     check_stack_depth();

+    /* ... and let's check for query cancel while we're at it */
+    CHECK_FOR_INTERRUPTS();
+
     if (curitem->type == QI_VAL)
         return chkcond(arg, (QueryOperand *) curitem,
                        NULL /* don't need position info */ );

Re: Missing CFI in hlCover()?

От
Stephen Frost
Дата:
Greetings,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > * Tom Lane (tgl@sss.pgh.pa.us) wrote:
> >> Stephen Frost <sfrost@snowman.net> writes:
> >>> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
> >>>> We could hard-code a rule like that, or we could introduce a new
> >>>> explicit parameter for the maximum cover length.  The latter would be
> >>>> more flexible, but we need something back-patchable and I'm concerned
> >>>> about the compatibility hazards of adding a new parameter in minor
> >>>> releases.  So on the whole I propose hard-wiring a multiplier of,
> >>>> say, 10 for both these cases.
>
> >>> That sounds alright to me, though I do think we should probably still
> >>> toss a CFI (or two) in this path somewhere as we don't know how long
> >>> some of these functions might take...
>
> >> Yeah, of course.  I'm still leaning to doing that in TS_execute_recurse.
>
> > Works for me.
>
> Here's a proposed patch along that line.

I've back-patched this to 11 (which was just a bit of fuzz) and tested
it out with a couple of different queries that were causing issues
previously on the archive server, and they finish in a much more
reasonable time and react faster to cancel requests/signals.

If you'd like to play with it more, the PG11 installed on ark2 now has
this patch built into it.

So, looks good to me, and would certainly be nice to get this into the
next set of releases, so the archive server doesn't get stuck anymore.
:)

Thanks!

Stephen

Вложения

Re: Missing CFI in hlCover()?

От
Tom Lane
Дата:
Stephen Frost <sfrost@snowman.net> writes:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> Here's a proposed patch along that line.

> I've back-patched this to 11 (which was just a bit of fuzz) and tested
> it out with a couple of different queries that were causing issues
> previously on the archive server, and they finish in a much more
> reasonable time and react faster to cancel requests/signals.

Yeah, I'd tried this locally using the data from the one test case you
showed me, and it seemed to fix that.

> So, looks good to me, and would certainly be nice to get this into the
> next set of releases, so the archive server doesn't get stuck anymore.

I'll push this tomorrow if nobody has objected to it.

BTW, I had noticed last night that hlFirstIndex is being unreasonably
stupid.  Many of the "words" have null item pointers and hence can't
possibly match any query item (I think because we have "words" for
inter-word spaces/punctuation as well as the actual words).  Checking
that, as in the attached v2 patch, makes things a bit faster yet.

            regards, tom lane

diff --git a/src/backend/tsearch/wparser_def.c b/src/backend/tsearch/wparser_def.c
index 76b6f9aef0..92517e4c17 100644
--- a/src/backend/tsearch/wparser_def.c
+++ b/src/backend/tsearch/wparser_def.c
@@ -2011,13 +2011,18 @@ hlFirstIndex(HeadlineParsedText *prs, TSQuery query, int pos)
     for (i = pos; i < prs->curwords; i++)
     {
         /* ... scan the query to see if this word matches any operand */
-        QueryItem  *item = GETQUERY(query);
+        QueryOperand *worditem = prs->words[i].item;
+        QueryItem  *item;
         int            j;

+        if (worditem == NULL)
+            continue;            /* certainly not a match */
+
+        item = GETQUERY(query);
         for (j = 0; j < query->size; j++)
         {
             if (item->type == QI_VAL &&
-                prs->words[i].item == &item->qoperand)
+                worditem == &item->qoperand)
                 return i;
             item++;
         }
@@ -2028,8 +2033,14 @@ hlFirstIndex(HeadlineParsedText *prs, TSQuery query, int pos)
 /*
  * hlCover: try to find a substring of prs' word list that satisfies query
  *
- * At entry, *p must be the first word index to consider (initialize this to
- * zero, or to the next index after a previous successful search).
+ * At entry, *p must be the first word index to consider (initialize this
+ * to zero, or to the next index after a previous successful search).
+ * We will consider all substrings starting at or after that word, and
+ * containing no more than max_cover words.  (We need a length limit to
+ * keep this from taking O(N^2) time for a long document with many query
+ * words but few complete matches.  Actually, since checkcondition_HL is
+ * roughly O(N) in the length of the substring being checked, it's even
+ * worse than that.)
  *
  * On success, sets *p to first word index and *q to last word index of the
  * cover substring, and returns true.
@@ -2038,7 +2049,8 @@ hlFirstIndex(HeadlineParsedText *prs, TSQuery query, int pos)
  * words used in the query.
  */
 static bool
-hlCover(HeadlineParsedText *prs, TSQuery query, int *p, int *q)
+hlCover(HeadlineParsedText *prs, TSQuery query, int max_cover,
+        int *p, int *q)
 {
     int            pmin,
                 pmax,
@@ -2084,7 +2096,7 @@ hlCover(HeadlineParsedText *prs, TSQuery query, int *p, int *q)
                 nextpmin = nextpmax;
             pmax = nextpmax;
         }
-        while (pmax >= 0);
+        while (pmax >= 0 && pmax - pmin < max_cover);
         /* No luck here, so try next feasible startpoint */
         pmin = nextpmin;
     }
@@ -2186,7 +2198,7 @@ get_next_fragment(HeadlineParsedText *prs, int *startpos, int *endpos,
 static void
 mark_hl_fragments(HeadlineParsedText *prs, TSQuery query, bool highlightall,
                   int shortword, int min_words,
-                  int max_words, int max_fragments)
+                  int max_words, int max_fragments, int max_cover)
 {
     int32        poslen,
                 curlen,
@@ -2213,7 +2225,7 @@ mark_hl_fragments(HeadlineParsedText *prs, TSQuery query, bool highlightall,
     covers = palloc(maxcovers * sizeof(CoverPos));

     /* get all covers */
-    while (hlCover(prs, query, &p, &q))
+    while (hlCover(prs, query, max_cover, &p, &q))
     {
         startpos = p;
         endpos = q;
@@ -2368,7 +2380,7 @@ mark_hl_fragments(HeadlineParsedText *prs, TSQuery query, bool highlightall,
  */
 static void
 mark_hl_words(HeadlineParsedText *prs, TSQuery query, bool highlightall,
-              int shortword, int min_words, int max_words)
+              int shortword, int min_words, int max_words, int max_cover)
 {
     int            p = 0,
                 q = 0;
@@ -2386,7 +2398,7 @@ mark_hl_words(HeadlineParsedText *prs, TSQuery query, bool highlightall,
     if (!highlightall)
     {
         /* examine all covers, select a headline using the best one */
-        while (hlCover(prs, query, &p, &q))
+        while (hlCover(prs, query, max_cover, &p, &q))
         {
             /*
              * Count words (curlen) and interesting words (poslen) within
@@ -2542,6 +2554,7 @@ prsd_headline(PG_FUNCTION_ARGS)
     int            shortword = 3;
     int            max_fragments = 0;
     bool        highlightall = false;
+    int            max_cover;
     ListCell   *l;

     /* Extract configuration option values */
@@ -2581,6 +2594,15 @@ prsd_headline(PG_FUNCTION_ARGS)
                             defel->defname)));
     }

+    /*
+     * We might eventually make max_cover a user-settable parameter, but for
+     * now, just compute a reasonable value based on max_words and
+     * max_fragments.
+     */
+    max_cover = Max(max_words * 10, 100);
+    if (max_fragments > 0)
+        max_cover *= max_fragments;
+
     /* in HighlightAll mode these parameters are ignored */
     if (!highlightall)
     {
@@ -2605,10 +2627,10 @@ prsd_headline(PG_FUNCTION_ARGS)
     /* Apply appropriate headline selector */
     if (max_fragments == 0)
         mark_hl_words(prs, query, highlightall, shortword,
-                      min_words, max_words);
+                      min_words, max_words, max_cover);
     else
         mark_hl_fragments(prs, query, highlightall, shortword,
-                          min_words, max_words, max_fragments);
+                          min_words, max_words, max_fragments, max_cover);

     /* Fill in default values for string options */
     if (!prs->startsel)
diff --git a/src/backend/utils/adt/tsvector_op.c b/src/backend/utils/adt/tsvector_op.c
index f01b1ee253..756a48a167 100644
--- a/src/backend/utils/adt/tsvector_op.c
+++ b/src/backend/utils/adt/tsvector_op.c
@@ -1868,6 +1868,9 @@ TS_execute_recurse(QueryItem *curitem, void *arg, uint32 flags,
     /* since this function recurses, it could be driven to stack overflow */
     check_stack_depth();

+    /* ... and let's check for query cancel while we're at it */
+    CHECK_FOR_INTERRUPTS();
+
     if (curitem->type == QI_VAL)
         return chkcond(arg, (QueryOperand *) curitem,
                        NULL /* don't need position info */ );

Re: Missing CFI in hlCover()?

От
Stephen Frost
Дата:
Greetings,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > * Tom Lane (tgl@sss.pgh.pa.us) wrote:
> >> Here's a proposed patch along that line.
>
> > I've back-patched this to 11 (which was just a bit of fuzz) and tested
> > it out with a couple of different queries that were causing issues
> > previously on the archive server, and they finish in a much more
> > reasonable time and react faster to cancel requests/signals.
>
> Yeah, I'd tried this locally using the data from the one test case you
> showed me, and it seemed to fix that.

Good stuff.

> > So, looks good to me, and would certainly be nice to get this into the
> > next set of releases, so the archive server doesn't get stuck anymore.
>
> I'll push this tomorrow if nobody has objected to it.

Sounds good.

> BTW, I had noticed last night that hlFirstIndex is being unreasonably
> stupid.  Many of the "words" have null item pointers and hence can't
> possibly match any query item (I think because we have "words" for
> inter-word spaces/punctuation as well as the actual words).  Checking
> that, as in the attached v2 patch, makes things a bit faster yet.

Nice, looks good to me.

Thanks!

Stephen

Вложения