Discussion: PLTCL return_null crash...
This is so odd. I have used return_null with no problems, but now it crashes stuff all over the place. Yes, I am the guy who hacked his pltcl.c, but only on one machine. This seems to crash on all 3 of them.

PostgreSQL 7.2.1 on i386--netbsdelf, compiled by GCC egcs-1.1.2

bash-2.05$ createlang 'pltcl' test;
bash-2.05$ psql test
Welcome to psql, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
       \h for help with SQL commands
       \? for help on internal slash commands
       \g or terminate with semicolon to execute query
       \q to quit

test=# create function crash () returns int as '
test'# return_null
test'# ' language 'pltcl';
CREATE
test=# select crash();
ERROR: AllocSetFree: cannot find block containing chunk 0xbfbfcc48
test=#

Well, crash may be too harsh a term in this simple example; in others, however, it not only brings down my database (without a core file?), it kills the webserver!

I can accept the notion that I have probably caused this, but I don't know how! These computers are geographically separated, and the one above is stone stock, no changes at all to anything (except NetBSD package stuff...)

Does the above exhibit any similar weirdness on anyone else's 7.2.1?

Ian A. Harding
Programmer/Analyst II
Tacoma-Pierce County Health Department
(253) 798-3549
iharding@tpchd.org

WWSD - What Would Scooby Doo?
"Ian Harding" <ianh@tpchd.org> writes: > test=# create function crash () returns int as ' > test'# return_null > test'# ' language 'pltcl'; > CREATE > test=# select crash(); > ERROR: AllocSetFree: cannot find block containing chunk 0xbfbfcc48 > test=# Hmm. WorksForMe (on both 7.2.3 and CVS tip) ... regression=# select crash(); crash ------- (1 row) regression=# select crash() is null; ?column? ---------- t (1 row) regression=# A stack backtrace from the elog() call might prove enlightening. regards, tom lane
On Mon, 7 Oct 2002, Ian Harding wrote:

> [deleted]
>
> test=# create function crash () returns int as '
> test'# return_null
> test'# ' language 'pltcl';
> CREATE
> test=# select crash();
> ERROR: AllocSetFree: cannot find block containing chunk 0xbfbfcc48
> test=#
>
> [deleted]
>
> Does the above exhibit any similar weirdness on anyone else's 7.2.1?
>

Crashes for me too. Not under 7.3 though, so something changed somewhere, somehow :)

--
Nigel J. Andrews
On Mon, 7 Oct 2002, Tom Lane wrote:

> "Ian Harding" <ianh@tpchd.org> writes:
> > test=# create function crash () returns int as '
> > test'# return_null
> > test'# ' language 'pltcl';
> > CREATE
> > test=# select crash();
> > ERROR: AllocSetFree: cannot find block containing chunk 0xbfbfcc48
> > test=#
>
> Hmm. WorksForMe (on both 7.2.3 and CVS tip) ...
>
> regression=# select crash();
>  crash
> -------
>
> (1 row)
>
> regression=# select crash() is null;
>  ?column?
> ----------
>  t
> (1 row)
>
> regression=#
>
>
> A stack backtrace from the elog() call might prove enlightening.

Here's one from my system.

Program received signal SIGSEGV, Segmentation fault.
0x8156df7 in pfree ()
(gdb) bt
#0  0x8156df7 in pfree ()
#1  0x4001611c in ?? () from /usr/local/stow/pgsql-7.2.1/lib/pltcl.so
#2  0x40015c37 in ?? () from /usr/local/stow/pgsql-7.2.1/lib/pltcl.so
#3  0x80c4d1d in ExecMakeFunctionResult ()
#4  0x80c4dda in ExecEvalFunc ()
#5  0x80c5310 in ExecEvalExpr ()
#6  0x80c55e9 in ExecTargetList ()
#7  0x80c587b in ExecProject ()
#8  0x80cb073 in ExecResult ()
#9  0x80c3d79 in ExecProcNode ()
#10 0x80c2d5e in ExecutePlan ()
#11 0x80c23f7 in ExecutorRun ()
#12 0x810f935 in ProcessQuery ()
#13 0x810e1e0 in pg_exec_query_string ()
#14 0x810f1be in PostgresMain ()
#15 0x80f631e in DoBackend ()
#16 0x80f5c6f in BackendStartup ()
#17 0x80f4e8c in ServerLoop ()
#18 0x80f4a0b in PostmasterMain ()
#19 0x80d44a5 in main ()
#20 0x400e6a42 in __libc_start_main () from /lib/libc.so.6
(gdb)

So we can see it's in pltcl, but without debugging turned on it's a little difficult to tell where.

Presumably the fault was removed between 1.48 and 1.49 of src/pl/tcl/pltcl.c

--
Nigel J. Andrews
"Nigel J. Andrews" <nandrews@investsystems.co.uk> writes: > Presumably the fault was removed between 1.48 and 1.49 of src/pl/tcl/pltcl.c But 1.49 is in 7.2.1, which you said you're using? regards, tom lane
Tom Lane wrote:
> "Nigel J. Andrews" <nandrews@investsystems.co.uk> writes:
>
>>Presumably the fault was removed between 1.48 and 1.49 of src/pl/tcl/pltcl.c
>
>
> But 1.49 is in 7.2.1, which you said you're using?
>

It crashes for me under 7.2.2 and 7.2.3 (but not in 7.3b2). The odd thing is, even though I compiled --enable-debug, pltcl.so still seems to lack debug symbols:

#0  0x08166774 in pfree (pointer=0x8397450) at mcxt.c:448
#1  0x40028033 in pltcl_func_handler () from /usr/lib/pgsql/pltcl.so
#2  0x40027b8b in pltcl_call_handler () from /usr/lib/pgsql/pltcl.so
#3  0x080c96e0 in ExecMakeFunctionResult (fcache=0x8384728, arguments=0x0, econtext=0x8384470, isNull=0xbfffebaf "", isDone=0xbfffebb0) at execQual.c:825

I tried putting a break in pltcl_func_handler, but here's what I get:

Breakpoint 1, 0x40027bea in pltcl_func_handler () from /usr/lib/pgsql/pltcl.so
(gdb) step
Single stepping until exit from function pltcl_func_handler,
which has no line number information.

Any idea why I can't step through this? In any case, the problem seems to be in this section of code:

<snip>
    if (SPI_finish() != SPI_OK_FINISH)
        elog(ERROR, "pltcl: SPI_finish() failed");

    UTF_BEGIN;
    if (fcinfo->isnull)
        retval = (Datum) 0;
    else
        retval = FunctionCall3(&prodesc->result_in_func,
                               PointerGetDatum(UTF_U2E(interp->result)),
                               ObjectIdGetDatum(prodesc->result_in_elem),
                               Int32GetDatum(-1));
    UTF_END;
</snip>

where:

#define UTF_BEGIN   do { \
                        unsigned char *_pltcl_utf_src; \
                        unsigned char *_pltcl_utf_dst
#define UTF_END     if (_pltcl_utf_src!=_pltcl_utf_dst) \
                        pfree(_pltcl_utf_dst); } while (0)

I was able to step into, and out of, SPI_finish(). The pfree(_pltcl_utf_dst) seems to be where it is failing.

Joe
On Mon, 7 Oct 2002, Tom Lane wrote:

> "Nigel J. Andrews" <nandrews@investsystems.co.uk> writes:
> > Presumably the fault was removed between 1.48 and 1.49 of src/pl/tcl/pltcl.c
>
> But 1.49 is in 7.2.1, which you said you're using?

OK, I misunderstood the labeling.

--
Nigel J. Andrews
Joe Conway <mail@joeconway.com> writes:
> Any idea wht I can't step through this? In any case, the problem seems to be
> in this section of code:

> <snip>
>     if (SPI_finish() != SPI_OK_FINISH)
>         elog(ERROR, "pltcl: SPI_finish() failed");

>     UTF_BEGIN;
>     if (fcinfo->isnull)
>         retval = (Datum) 0;
>     else
>         retval = FunctionCall3(&prodesc->result_in_func,
>                                PointerGetDatum(UTF_U2E(interp->result)),
>                                ObjectIdGetDatum(prodesc->result_in_elem),
>                                Int32GetDatum(-1));
>     UTF_END;
> </snip>

Oh, but of course: if you are returning NULL then this sequence fails because it pfrees an uninitialized pointer. It's fixed in CVS tip, where the sequence reads like

    if (SPI_finish() != SPI_OK_FINISH)
        elog(ERROR, "pltcl: SPI_finish() failed");

    if (fcinfo->isnull)
        retval = (Datum) 0;
    else
    {
        UTF_BEGIN;
        retval = FunctionCall3(&prodesc->result_in_func,
                               PointerGetDatum(UTF_U2E(interp->result)),
                               ObjectIdGetDatum(prodesc->result_in_elem),
                               Int32GetDatum(-1));
        UTF_END;
    }

The reason I failed to duplicate it here was I didn't compile with --enable-multibyte. The bug is definitely still there in 7.2.3 if you use multibyte.

regards, tom lane
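
For readers who want to see the fixed control flow in isolation, here is a minimal standalone sketch of it. It is not the pltcl.c source: every DEMO_/demo_ name is hypothetical, malloc/free stand in for palloc/pfree, strtol stands in for the type input function, and the shape of DEMO_UTF_U2E (assigning both scratch pointers as a side effect of the conversion) is an assumption about how UTF_U2E behaves, inferred from the UTF_BEGIN/UTF_END definitions quoted above. The point it illustrates is that the two scratch pointers are only assigned when the conversion macro actually runs, so the BEGIN/END pair has to wrap only the non-null branch.

```c
#include <stdlib.h>
#include <string.h>

/* Counts calls to demo_free so the two paths can be told apart. */
static int demo_free_calls = 0;

/* Hypothetical stand-in for pfree(). */
static void demo_free(void *p)
{
    demo_free_calls++;
    free(p);
}

/* Hypothetical stand-in for an encoding conversion: always returns a copy. */
static char *demo_convert(const char *src)
{
    size_t n = strlen(src) + 1;
    char *dst = malloc(n);

    if (dst == NULL)
        abort();
    memcpy(dst, src, n);
    return dst;
}

/* Declares the scratch pointers but does NOT initialize them. */
#define DEMO_UTF_BEGIN  do { \
                            const char *_demo_src; \
                            char *_demo_dst

/* Assumed shape: the pointers are only assigned when this macro is used. */
#define DEMO_UTF_U2E(x) (_demo_dst = demo_convert(_demo_src = (x)))

/* Reads both pointers; undefined behaviour if DEMO_UTF_U2E never ran. */
#define DEMO_UTF_END    if ((const char *) _demo_dst != _demo_src) \
                            demo_free(_demo_dst); } while (0)

/*
 * Fixed ordering, mirroring the CVS tip sequence quoted above: the
 * BEGIN/END pair sits inside the else branch, so the null path never
 * reaches the free of an unassigned pointer.
 */
long demo_result(int is_null, const char *text)
{
    long retval;

    if (is_null)
        retval = 0;
    else
    {
        DEMO_UTF_BEGIN;
        retval = strtol(DEMO_UTF_U2E(text), NULL, 10);
        DEMO_UTF_END;
    }
    return retval;
}
```

With this ordering, demo_result(1, "ignored") returns 0 without calling demo_free at all, while demo_result(0, "42") returns 42 and frees the converted copy exactly once. Moving DEMO_UTF_BEGIN and DEMO_UTF_END back outside the if/else would recreate the 7.2.x pattern, in which the null path compares and frees pointers that were never assigned.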