Обсуждение: latest snapshot crashes backend

Поиск
Список
Период
Сортировка

latest snapshot crashes backend

От
"Oliver Elphick"
Дата:
The attached commands crash the backend, which exits with status 11.  The
backend log (debug level 255) does not show anything that is meaningful to
me.  It shows a query, which seems to relate to the final constraint, and
then terminates without further explanation:

...
query: select 1 from individual where NOT ( surname IS NULL AND forenames IS 
NUL
L )                    
parser outputs:            
{ QUERY             
...  }                             
after rewriting:            
{ QUERY   
...                         }                             
/usr/lib/postgresql/bin/postmaster: reaping dead processes...
/usr/lib/postgresql/bin/postmaster: CleanupProc: pid 8970 exited with status 11
...


I don't know where to start looking.  I can trace the backend with the
debugger if you will tell me which routine to break at.
Oliver Elphick                                Oliver.Elphick@lfix.co.uk
Isle of Wight                              http://www.lfix.co.uk/oliver              PGP key from public servers; key
ID32B8FAA1                ========================================    "Lo, children are an heritage of the LORD; and
the      fruit of the womb is his reward."        Psalms 127:3  

Re: [HACKERS] latest snapshot crashes backend

От
"Thomas G. Lockhart"
Дата:
> The attached commands crash the backend, which exits with status 11.

tgl=> select * from x where not (i is null or c is null);
i|c
-+-
1|T
2|A
0|T
(3 rows)

tgl=> select * from x where not (i is null and c is null);
pqReadData() -- backend closed the channel unexpectedly.

So, let's try rewriting it as a workaround:

tgl=> select * from x where (not i is null) or (not c is null);
pqReadData() -- backend closed the channel unexpectedly.

Oops. For some reason the NOT/AND is fatal, while NOT/OR is OK. That's
not so good. The good news is that it will certainly be repairable with
patches.
                   - Tom


Re: [HACKERS] latest snapshot crashes backend

От
Bruce Momjian
Дата:
> > The attached commands crash the backend, which exits with status 11.
> 
> tgl=> select * from x where not (i is null or c is null);
> i|c
> -+-
> 1|T
> 2|A
> 0|T
> (3 rows)
> 
> tgl=> select * from x where not (i is null and c is null);
> pqReadData() -- backend closed the channel unexpectedly.
> 
> So, let's try rewriting it as a workaround:
> 
> tgl=> select * from x where (not i is null) or (not c is null);
> pqReadData() -- backend closed the channel unexpectedly.
> 
> Oops. For some reason the NOT/AND is fatal, while NOT/OR is OK. That's
> not so good. The good news is that it will certainly be repairable with
> patches.
> 

Care to give us a table:
select * from pg_shadow where (usesysid is null and oid is null)\g

--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [HACKERS] latest snapshot crashes backend

От
"Thomas G. Lockhart"
Дата:
> Care to give us a table:
>  select * from pg_shadow where (usesysid is null and oid is null);

Sure, that's easy. Just negate your where clause:

tgl=> select * from pg_shadow
tgl-> where not (usesysid is null and oid is null);
pqReadData() -- backend closed the channel unexpectedly.

If you can reproduce this, will you have a chance to look at it? I don't
know where the problem is, but suspect that it is farther back than the
parser transformations. Rewrite system, optimizer, or executor??
                   - Tom


Re: [HACKERS] latest snapshot crashes backend

От
Bruce Momjian
Дата:
> > Care to give us a table:
> >  select * from pg_shadow where (usesysid is null and oid is null);
> 
> Sure, that's easy. Just negate your where clause:
> 
> tgl=> select * from pg_shadow
> tgl-> where not (usesysid is null and oid is null);
> pqReadData() -- backend closed the channel unexpectedly.
> 
> If you can reproduce this, will you have a chance to look at it? I don't
> know where the problem is, but suspect that it is farther back than the
> parser transformations. Rewrite system, optimizer, or executor??

I will look at it.  I can reproduce it here.

--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [HACKERS] latest snapshot crashes backend

От
"Thomas G. Lockhart"
Дата:
> > tgl=> select * from pg_shadow
> > tgl-> where not (usesysid is null and oid is null);
> > pqReadData() -- backend closed the channel unexpectedly.

btw, you may already have noticed that this doesn't crash if the same
clause is changed to be a target:

tgl=> select not (usesysid is null and oid is null) from pg_shadow;
?column?
--------
t
t
(2 rows)
                  - Thomas


Re: [HACKERS] latest snapshot crashes backend

От
Bruce Momjian
Дата:
> > Care to give us a table:
> >  select * from pg_shadow where (usesysid is null and oid is null);
> 
> Sure, that's easy. Just negate your where clause:
> 
> tgl=> select * from pg_shadow
> tgl-> where not (usesysid is null and oid is null);
> pqReadData() -- backend closed the channel unexpectedly.
> 
> If you can reproduce this, will you have a chance to look at it? I don't
> know where the problem is, but suspect that it is farther back than the
> parser transformations. Rewrite system, optimizer, or executor??

Here is the problem:(gdb) print ((Expr *) clause)->oper$1 = (Node *) 0x0

We have dealt with this before in relation to NOT.  The system assumes
NOT is an OP_EXPR oper, while it really isn't.

I fixed another query about a month ago relating to this.  In that case,
flatten_tlistentry() was loosing information about EXPR nodes.

In this case, the code:
       Oid         opno = ((Oper *) ((Expr *) clause)->oper)->opno;

clearly is making an assumption it should not be making.

In this case, clause is:
(gdb) print ((Expr *)clause)[0]$3 = {type = T_Expr, typeOid = 0, opType = NOT_EXPR, oper = 0x0,   args = 0x8307f90}

where opType is not OP_EXPR, but NOT_EXPR.

I am inclined to check for NOT_EXPR, let the selectivity be computed on
the subclause, and negate(NOT) the resulting selectivity.  Not sure on
an exact fix yet, but this is a 6.4.1 issue, anyway.


Problem is in the optimizer:

---------------------------------------------------------------------------


#0  0x80bb1c4 in compute_selec (root=0x82d8590, clauses=0x8307fb0,    or_selectivities=0x0) at clausesel.c:274
#1  0x80bb0cb in compute_clause_selec (root=0x82d8590, clause=0x830ea50,    or_selectivities=0x0) at clausesel.c:182
#2  0x80bfc9c in add_clause_to_rels (root=0x82d8590, clause=0x830ea50)   at initsplan.c:211
#3  0x80bfbea in init_base_rels_qual (root=0x82d8590, clauses=0x830f050)   at initsplan.c:155
#4  0x80c01bb in subplanner (root=0x82d8590, flat_tlist=0x830f170,    qual=0x830f050) at planmain.c:262
#5  0x80c00f8 in query_planner (root=0x82d8590, command_type=1,    tlist=0x83077d0, qual=0x83063d0) at planmain.c:176
#6  0x80c085d in union_planner (parse=0x82d8590) at planner.c:151
#7  0x80c06d5 in planner (parse=0x82d8590) at planner.c:72
#8  0x80f56b4 in pg_parse_and_plan (   query_string=0x80455e4 "select * from pg_shadow where not (usesysid is null and
oidis null);\n", typev=0x0, nargs=0, queryListP=0x8045594, dest=Debug,    aclOverride=0 '\000') at postgres.c:628
 
#9  0x80f57f3 in pg_exec_query_dest (   query_string=0x80455e4 "select * from pg_shadow where not (usesysid is null and
oidis null);\n", dest=Debug, aclOverride=0) at postgres.c:722
 


--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [HACKERS] latest snapshot crashes backend

От
Bruce Momjian
Дата:
> > Care to give us a table:
> >  select * from pg_shadow where (usesysid is null and oid is null);
> 
> Sure, that's easy. Just negate your where clause:
> 
> tgl=> select * from pg_shadow
> tgl-> where not (usesysid is null and oid is null);
> pqReadData() -- backend closed the channel unexpectedly.
> 
> If you can reproduce this, will you have a chance to look at it? I don't
> know where the problem is, but suspect that it is farther back than the
> parser transformations. Rewrite system, optimizer, or executor??
> 

OK, here is the fix.  I am not applying it yet until 6.4 is released by
Marc, OK?  In fact, I am not sure how we are going to do patch
application after 6.4, so would someone else please apply this?

---------------------------------------------------------------------------

*** ./backend/optimizer/path/clausesel.c.orig    Wed Nov  4 16:49:35 1998
--- ./backend/optimizer/path/clausesel.c    Wed Nov  4 17:11:02 1998
***************
*** 254,259 ****
--- 254,264 ----          */         s1 = 0.1;     }
+     else if (not_clause((Node *) clause))
+     {
+         /* negate this baby */
+         return 1 - compute_selec(root, ((Expr *)clause)->args, or_selectivities);
+     }     else if (is_subplan((Node *) clause))     { 


--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026