Обсуждение: GEQO Triggers Server Crash

Поиск
Список
Период
Сортировка

GEQO Triggers Server Crash

От
Kris Jurka
Дата:
The following script crashes 7.4devel.

SET geqo_threshold=2;
CREATE TABLE t1 (a int primary key);
CREATE TABLE t2 (a int);
SELECT * FROM t1,t2 WHERE t1.a=t2.a;

#0  0x0811001c in best_inner_indexscan (root=<incomplete type>,
rel=<error type>,
    outer_relids=<error type>, jointype=<incomplete type>) at
indxpath.c:1421
1421                    if (sameseti(info->other_relids, outer_relids)
&&
(gdb) bt
#0  0x0811001c in best_inner_indexscan (root=<incomplete type>,
rel=<error type>,
    outer_relids=<error type>, jointype=<incomplete type>) at
indxpath.c:1421
#1  0x08111321 in match_unsorted_outer (root=<incomplete type>,
    joinrel=<incomplete type>, outerrel=<incomplete type>,
    innerrel=<incomplete type>, restrictlist=<incomplete type>,
    mergeclause_list=<incomplete type>, jointype=
      {lockHash = 0x0, holderHash = 0x81911f5, lockmethod = 0,
numLockModes = 137172908, conflictTab = 137112324, masterLock =
BufMgrLock}) at joinpath.c:367
#2  0x081110c3 in add_paths_to_joinrel (root=<incomplete type>,
    joinrel=<incomplete type>, outerrel=<incomplete type>,
    innerrel=<incomplete type>, jointype=
      {lockHash = 0x0, holderHash = 0x82d1978, lockmethod = 1,
numLockModes = 28, conflictTab = 137114248, masterLock = 137114248},
restrictlist=<incomplete type>)
    at joinpath.c:98
#3  0x08111f81 in make_join_rel (root=<error type>, rel1=<error type>,
    rel2=<error type>, jointype=<error type>) at joinrels.c:434
#4  0x08111d69 in make_rels_by_clause_joins (root=<error type>,
    old_rel=<error type>, other_rels=<error type>) at joinrels.c:271
#5  0x0810b0e4 in gimme_tree (root=<error type>, initial_rels=<error
type>,
    tour=<error type>, num_gene=2, rel_count=0, old_rel=<error type>)
    at geqo_eval.c:129
#6  0x0810b05d in geqo_eval (root=<error type>, initial_rels=<error
type>,
    tour=<error type>, num_gene=2) at geqo_eval.c:67
#7  0x0810b757 in random_init_pool (root=<error type>,
initial_rels=<error type>,
    pool=0x82c3c7c, strt=0, stp=128) at geqo_pool.c:96
#8  0x0810b22a in geqo (root=<error type>, number_of_rels=2,
    initial_rels=<error type>) at geqo_main.c:110
#9  0x0810d1a5 in make_fromexpr_rel (root=<incomplete type>,
from=0x82c70b4)
    at allpaths.c:426
#10 0x0810cce5 in make_one_rel (root=<incomplete type>) at allpaths.c:76
#11 0x08115de8 in query_planner (root=<incomplete type>,
tlist=<incomplete type>,
    tuple_fraction=0, cheapest_path=0xbfffec58, sorted_path=0xbfffec5c)
    at planmain.c:155
#12 0x08116dff in grouping_planner (parse=<incomplete type>,
tuple_fraction=0)
    at planner.c:1233
#13 0x08116123 in subquery_planner (parse=<incomplete type>,
tuple_fraction=-1)
    at planner.c:263
#14 0x08115f41 in planner (parse=<incomplete type>) at planner.c:106
#15 0x080cba3f in ExplainOneQuery (query=<incomplete type>,
stmt=0x82bdfd0,
    tstate=0x82c2ae8) at explain.c:133
#16 0x080cb992 in ExplainQuery (stmt=0x82bdfd0, dest=<incomplete type>)
    at explain.c:100
#17 0x0813e4db in ProcessUtility (parsetree=<incomplete type>,
    dest=<incomplete type>, completionTag=0xbfffee5c "") at
utility.c:713
#18 0x0813bd34 in pg_exec_query_string (query_string=0x82bda04,
    dest=<error type>, parse_context=<incomplete type>) at
postgres.c:789
#19 0x0813cf70 in PostgresMain (argc=4, argv=<incomplete type>,
    username=<incomplete type>) at postgres.c:2016
#20 0x08121d4a in DoBackend (port=
      {receiveTuple = 0x8283510, setup = 0x8272d38, cleanup =
0xbffff2ec})
    at postmaster.c:2293
#21 0x0812163f in BackendStartup (port=
      {receiveTuple = 0x8283510, setup = 0xbffff344, cleanup = 0})
    at postmaster.c:1915
#22 0x08120829 in ServerLoop () at postmaster.c:1002
#23 0x0812037a in PostmasterMain (argc=3, argv=<incomplete type>)
    at postmaster.c:781
#24 0x080fac1f in main (argc=3, argv=<incomplete type>) at main.c:209

Re: GEQO Triggers Server Crash

От
Tom Lane
Дата:
Kris Jurka <jurka@ejurka.com> writes:
> [ GEQO doesn't work anymore in CVS tip ]

Ugh.  The proximate cause of this is the code I added recently to cache
repeated calculations of the best inner indexscan for a given inner
relation with potential outer relations.  Since geqo_eval() releases
all memory acquired during construction of a possible jointree, it
releases the cached path info too.  The next attempt to use the data
fails.

Naturally, ripping out the cache again doesn't strike me as an appealing
solution.

The narrowest fix would be to hack best_inner_indexscan() to switch into
the context containing the parent RelOptInfo while it makes a cache
entry.  This seems kinda klugy but it would work.

I wonder if we'd be better off not trying to reclaim memory in
geqo_eval.  Aside from presenting a constant risk of this sort of
problem whenever someone hacks the optimizer, what it's really doing
is discarding a whole lot of join cost estimates that are likely to
be done over again in (some of) the following calls of geqo_eval.
GEQO would certainly be a lot faster if we didn't release that info,
and I'm not sure that the space cost would be as bad as the code
comments claim.  Any thoughts?

This really just points up how messy memory management in the optimizer
is at present.  I wonder if anyone has ideas on improving it ...
        regards, tom lane


Re: [HACKERS] GEQO Triggers Server Crash

От
Bruce Momjian
Дата:
Can we free only the plans we want to free in geqo?  I don't mind having
a different free method in geqo vs. the rest of the optimizer.

---------------------------------------------------------------------------

Tom Lane wrote:
> Kris Jurka <jurka@ejurka.com> writes:
> > [ GEQO doesn't work anymore in CVS tip ]
> 
> Ugh.  The proximate cause of this is the code I added recently to cache
> repeated calculations of the best inner indexscan for a given inner
> relation with potential outer relations.  Since geqo_eval() releases
> all memory acquired during construction of a possible jointree, it
> releases the cached path info too.  The next attempt to use the data
> fails.
> 
> Naturally, ripping out the cache again doesn't strike me as an appealing
> solution.
> 
> The narrowest fix would be to hack best_inner_indexscan() to switch into
> the context containing the parent RelOptInfo while it makes a cache
> entry.  This seems kinda klugy but it would work.
> 
> I wonder if we'd be better off not trying to reclaim memory in
> geqo_eval.  Aside from presenting a constant risk of this sort of
> problem whenever someone hacks the optimizer, what it's really doing
> is discarding a whole lot of join cost estimates that are likely to
> be done over again in (some of) the following calls of geqo_eval.
> GEQO would certainly be a lot faster if we didn't release that info,
> and I'm not sure that the space cost would be as bad as the code
> comments claim.  Any thoughts?
> 
> This really just points up how messy memory management in the optimizer
> is at present.  I wonder if anyone has ideas on improving it ...
> 
>             regards, tom lane
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo@postgresql.org so that your
> message can get through to the mailing list cleanly
> 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: [HACKERS] GEQO Triggers Server Crash

От
Tom Lane
Дата:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Can we free only the plans we want to free in geqo?  I don't mind having
> a different free method in geqo vs. the rest of the optimizer.

GEQO calls "the rest of the optimizer", and the space that we're
worried about is almost all allocated in "the rest of the optimizer".
How are you going to implement two different free methods?
        regards, tom lane