Discussion: bug: core dump in pl/perl (cvs head).
we have found a bug in CVS head using PL/Perl:

[hs@hp hs]$ psql test < /tmp/core.sql
DROP FUNCTION
CREATE FUNCTION
NOTICE:  sql: SELECT 10, 10 FROM pg_locks WHERE transaction IS NOT NULL AND pid = pg_backend_pid()
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
connection to server was lost

DROP FUNCTION func();
CREATE OR REPLACE FUNCTION func() RETURNS int4 AS '
        $sql = "SELECT 10, 10 FROM pg_locks "
                . "WHERE transaction IS NOT NULL AND pid = pg_backend_pid() ";
        elog(NOTICE, "sql: $sql");
        my $rv = spi_exec_query($sql);
        return 0;
' LANGUAGE 'plperlu';

SELECT func();

somehow Perl does not seem to like the SPI.
only the development code seems to be affected.

thanks a lot,

hans
On Tue, Jul 05, 2005 at 09:54:52PM +0200, Hans-Jürgen Schönig wrote:
> we have found a bug in CVS head using PL/Perl:

How current is your checkout?  Mine's from this morning and I don't
get a crash with the example you posted:

test=> SELECT func();
NOTICE:  sql: SELECT 10, 10 FROM pg_locks WHERE transaction IS NOT NULL AND pid = pg_backend_pid()
 func
------
    0
(1 row)

Have you run "make distclean" and then done a fresh build?

-- 
Michael Fuhr
http://www.fuhr.org/~mfuhr/
Michael Fuhr wrote:
> How current is your checkout?  Mine's from this morning and I don't
> get a crash with the example you posted:
>
> test=> SELECT func();
> NOTICE:  sql: SELECT 10, 10 FROM pg_locks WHERE transaction IS NOT NULL AND pid = pg_backend_pid()
>  func
> ------
>     0
> (1 row)
>
> Have you run "make distclean" and then done a fresh build?

Yes, this was my first concern ... - everything has been built cleanly ...
I think this is yesterday's CVS ...

hans
Michael Fuhr <mike@fuhr.org> writes:
> On Tue, Jul 05, 2005 at 09:54:52PM +0200, Hans-Jürgen Schönig wrote:
>> we have found a bug in CVS head using PL/Perl:

> How current is your checkout?  Mine's from this morning and I don't
> get a crash with the example you posted:

None for me either (trying on a Fedora Core 3 x86 box; don't currently
have plperl built on my other machine).  Platform-specific issue maybe?

regards, tom lane
On Tue, Jul 05, 2005 at 04:37:05PM -0400, Tom Lane wrote:
> None for me either (trying on a Fedora Core 3 x86 box; don't currently
> have plperl built on my other machine).  Platform-specific issue maybe?

My platform that works is Solaris 9/sparc with Perl 5.8.7; compiler
is gcc 3.4.2.

Did you get a core dump?  If so, can you do a stack trace on it?

-- 
Michael Fuhr
http://www.fuhr.org/~mfuhr/
Michael Fuhr wrote:
> My platform that works is Solaris 9/sparc with Perl 5.8.7; compiler
> is gcc 3.4.2.
>
> Did you get a core dump?  If so, can you do a stack trace on it?

I am running FC4 on x86. There is no core dump.
I have compiled it all over again cleanly.
The last elog is never reached so something seems to go wrong inside the
SQL call ...

CREATE OR REPLACE FUNCTION trig_func() RETURNS trigger AS '
        # select the current transaction id
        elog(NOTICE, "trigger starting ...");
        $sql = "SELECT transaction, pid FROM pg_locks "
                . "WHERE transaction IS NOT NULL AND pid = pg_backend_pid() ";
        elog(NOTICE, "sql: $sql");
        my $rv = spi_exec_query($sql);
        elog(NOTICE, "looking for transactions ...");

I don't know what it is yet as it seemed to work with 8.0.3.
Interesting to see that nobody else has similar problems ...

regards,

hans
i have never seen this before ...
after rebooting the box (for some other reason) it worked.
somehow there has been something else going terribly wrong ...
sorry for the confusion ...

best regards,

hans
On Tue, Jul 05, 2005 at 10:49:54PM +0200, Hans-Jürgen Schönig wrote:
> I am running FC4 on x86. There is no core dump.

Just to be sure: where are you looking?  I don't know if it's
configurable on your platform, but I usually find core dumps under
$PGDATA/base/<databaseoid>/.  Do you have a resource limit or other
setting that would prevent core dumps from happening?  Tom Lane can
probably help out on that platform.

> I have compiled it all over again cleanly.
> The last elog is never reached so something seems to go wrong inside the
> SQL call ...

Can you run the query in psql?  What about in a PL/pgSQL function?

-- 
Michael Fuhr
http://www.fuhr.org/~mfuhr/
Michael Fuhr <mike@fuhr.org> writes:
> On Tue, Jul 05, 2005 at 10:49:54PM +0200, Hans-Jürgen Schönig wrote:
>> I am running FC4 on x86. There is no core dump.

> Just to be sure: where are you looking?  I don't know if it's
> configurable on your platform, but I usually find core dumps under
> $PGDATA/base/<databaseoid>/.

CVS tip of the last forty-eight hours or so would dump into $PGDATA
instead.  It's a good bet though that "ulimit -c" is set to 0 (no dumps)
by default on that platform.  I tend to put

        ulimit -c unlimited

into the postmaster startup script on Linux machines, so that I can get
core dumps.

regards, tom lane
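For anyone following along, here is a minimal sketch of the core-limit check described above, assuming a POSIX shell. The helper name core_limit_status and the commented-out pg_ctl line are only illustrative and not part of any PostgreSQL tooling; adapt them to your own startup script.

```shell
#!/bin/sh
# Report whether core dumps are enabled for a given "ulimit -c" value.
# (core_limit_status is a hypothetical helper name used for this sketch.)
core_limit_status() {
    limit="$1"
    case "$limit" in
        0)         echo "core dumps disabled" ;;
        unlimited) echo "core dumps enabled (no size limit)" ;;
        *)         echo "core dumps enabled (limit: $limit)" ;;
    esac
}

# Inspect the soft limit inherited by the current shell.
core_limit_status "$(ulimit -c)"

# In a startup script the limit has to be raised before the postmaster is
# started, so that the server processes inherit it. A subshell is used here
# so the sketch does not change the limit of the calling shell.
(
    ulimit -c unlimited 2>/dev/null \
        || echo "could not raise the core limit (hard limit may be lower)"
    core_limit_status "$(ulimit -c)"
    # pg_ctl start -D "$PGDATA" -l logfile   # assumed start command, adjust as needed
)
```

If the first line of output says that core dumps are disabled, a crashing backend will leave nothing behind to inspect, which would match the "no core dump" report earlier in this thread.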
Has this been fixed?

---------------------------------------------------------------------------

Hans-Jürgen Schönig wrote:
> we have found a bug in CVS head using PL/Perl:
>
> [hs@hp hs]$ psql test < /tmp/core.sql
> DROP FUNCTION
> CREATE FUNCTION
> NOTICE:  sql: SELECT 10, 10 FROM pg_locks WHERE transaction IS NOT NULL
> AND pid = pg_backend_pid()
> server closed the connection unexpectedly
>         This probably means the server terminated abnormally
>         before or while processing the request.
> connection to server was lost
>
> DROP FUNCTION func();
> CREATE OR REPLACE FUNCTION func() RETURNS int4 AS '
>         $sql = "SELECT 10, 10 FROM pg_locks "
>                 . "WHERE transaction IS NOT NULL AND pid = pg_backend_pid() ";
>         elog(NOTICE, "sql: $sql");
>         my $rv = spi_exec_query($sql);
>         return 0;
> ' LANGUAGE 'plperlu';
>
> SELECT func();
>
> somehow Perl does not seem to like the SPI.
> only the development code seems to be affected.
>
> thanks a lot,
>
> hans

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
On Fri, Jul 29, 2005 at 10:46:35PM -0400, Bruce Momjian wrote:
> Has this been fixed?
>
> Hans-Jürgen Schönig wrote:
> > we have found a bug in CVS head using PL/Perl:
> >
> > [hs@hp hs]$ psql test < /tmp/core.sql
> > DROP FUNCTION
> > CREATE FUNCTION
> > NOTICE:  sql: SELECT 10, 10 FROM pg_locks WHERE transaction IS NOT NULL
> > AND pid = pg_backend_pid()
> > server closed the connection unexpectedly
> >         This probably means the server terminated abnormally
> >         before or while processing the request.
> > connection to server was lost

I don't think anybody was able to reproduce the problem, and
Hans-Jürgen said that it started working after he rebooted
the box:

http://archives.postgresql.org/pgsql-bugs/2005-07/msg00062.php

There was reportedly no core dump and apparently the reboot destroyed
whatever conditions were causing the problem, so it might be difficult
to find out what was happening :-(

Hans-Jürgen, did this problem ever reappear?

Somebody else recently reported a backend crash when calling PL/Perl
in 8.0.3, but again we've been given no core dump info and thus far
nobody has been able to duplicate the problem:

http://archives.postgresql.org/pgsql-novice/2005-07/msg00181.php

-- 
Michael Fuhr
http://www.fuhr.org/~mfuhr/
no, this one seems to be fine now.
i have no idea what went wrong.
the code cored at some very "unlikely" place.
somehow rebooting helped ... - this was very strange.
fortunately it works nicely now ...

many thanks and best regards,

hans
On Sat, Jul 30, 2005 at 02:43:23PM +0200, Hans-Jürgen Schönig wrote:
> no, this one seems to be fine now.
> i have no idea what went wrong.
> the code cored at some very "unlikely" place.

Are you saying that it *did* dump core, or that it *didn't*?  It's
possible that your coredumpsize resource limit prevented a core
dump from happening at all.  If you're playing with the development
code then it would be a good idea to set that limit so that you do
get core dumps.  Configuring with --enable-debug and --enable-cassert
would also be useful, if you aren't doing so already.

> somehow rebooting helped ... - this was very strange.
> fortunately it works nicely now ...

...but unfortunate for finding out what was happening :-(

Had you stopped and restarted the postmaster before the reboot?
That is, was the reboot really necessary to "fix" the problem?
Had you run "make install" while the postmaster was still running?
I haven't tested whether that could cause a problem, but I wonder
if that's possible.

-- 
Michael Fuhr
http://www.fuhr.org/~mfuhr/
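As an illustration of the questions above, here is a rough sketch of a guard that refuses to reinstall while a postmaster appears to be running in the data directory. It assumes a POSIX shell and relies only on the postmaster.pid file that the server keeps in $PGDATA; the function names are made up for this example, and the check is a heuristic, since a stale pid file can survive a crash. Whether reinstalling under a running postmaster can actually cause a crash like the one reported was not tested in this thread.

```shell
#!/bin/sh
# Heuristic: treat a data directory as "in use" if postmaster.pid exists in it.
# (postmaster_pidfile_present and safe_to_install are hypothetical helper names.)
postmaster_pidfile_present() {
    [ -f "$1/postmaster.pid" ]
}

# Print a verdict and return non-zero if the server appears to be running.
safe_to_install() {
    if postmaster_pidfile_present "$1"; then
        echo "postmaster.pid found in $1: stop the server before 'make install'"
        return 1
    fi
    echo "no postmaster.pid in $1: ok to install"
}

# Possible flow for a debug build, shown as comments because it needs a
# PostgreSQL source tree to run:
#   make distclean
#   ./configure --enable-debug --enable-cassert
#   make
#   safe_to_install "$PGDATA" && make install

# Demonstration against the current data directory (or a placeholder path).
safe_to_install "${PGDATA:-/nonexistent}" || true
```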
Michael Fuhr wrote:
> On Sat, Jul 30, 2005 at 02:43:23PM +0200, Hans-Jürgen Schönig wrote:
>> no, this one seems to be fine now.
>> i have no idea what went wrong.
>> the code cored at some very "unlikely" place.
>
> Are you saying that it *did* dump core, or that it *didn't*?  It's
> possible that your coredumpsize resource limit prevented a core
> dump from happening at all.  If you're playing with the development
> code then it would be a good idea to set that limit so that you do
> get core dumps.  Configuring with --enable-debug and --enable-cassert
> would also be useful, if you aren't doing so already.

it cored; my core-size settings are ok ...

>> somehow rebooting helped ... - this was very strange.
>> fortunately it works nicely now ...
>
> ...but unfortunate for finding out what was happening :-(

absolutely. however, it is a bit hard to find a bug which cannot be
reproduced ...
to me it seems as if something else was doing some sort of crap on this box.

> Had you stopped and restarted the postmaster before the reboot?

yes - I even did a make distclean and a recompile ...

> That is, was the reboot really necessary to "fix" the problem?

I rebooted for some other reasons ...

> Had you run "make install" while the postmaster was still running?
> I haven't tested whether that could cause a problem, but I wonder
> if that's possible.

To be honest; I don't have the slightest idea; I have never seen this
before and I have been able to reproduce that.
I don't have the slightest idea what happened.
This seems like higher power ...

Best regards,

Hans
On Sat, Jul 30, 2005 at 03:46:46PM +0200, Hans-Jürgen Schönig wrote:
> Michael Fuhr wrote:
> > Are you saying that it *did* dump core, or that it *didn't*?
>
> it cored; my core-size settings are ok ...

So were you able to get a stack trace from the core dump?

-- 
Michael Fuhr
http://www.fuhr.org/~mfuhr/
[Please copy the mailing list on replies.]

On Sat, Jul 30, 2005 at 05:20:07PM +0200, Hans-Jürgen Schönig wrote:
> Michael Fuhr wrote:
> > So were you able to get a stack trace from the core dump?
>
> yes, but i did not keep the backtrace.
> i did not expect the db to work after the reboot.
>
> you shouldn't worry too much about that. i don't believe it is a
> postgresql related problem ...

Maybe not, but if it happens again then please post the stack trace
so somebody can take a look.  Even if it isn't a PostgreSQL problem,
it's worth finding out the cause so you can prevent it in the future.

-- 
Michael Fuhr
http://www.fuhr.org/~mfuhr/
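For reference, a small sketch of how such a stack trace could be captured for posting, assuming gdb is installed. The function name show_backtrace and both paths below are placeholders for illustration; the actual location and name of the core file depend on the platform and server version, as discussed earlier in the thread.

```shell
#!/bin/sh
# Print a backtrace from a core file using gdb in batch mode.
# $1 = path to the postgres executable, $2 = path to the core file.
# (show_backtrace is a hypothetical helper name used for this sketch.)
show_backtrace() {
    if [ -r "$2" ] && command -v gdb >/dev/null 2>&1; then
        # --batch exits after running the commands; -ex bt prints the stack.
        gdb --batch -ex bt "$1" "$2"
    else
        # No readable core file (or no gdb): just show what would be run.
        echo "would run: gdb --batch -ex bt $1 $2"
    fi
}

# Placeholder paths, adjust to the actual installation and core file.
show_backtrace /usr/local/pgsql/bin/postgres \
    "${PGDATA:-/usr/local/pgsql/data}/core" || true
```

A backtrace from a build configured with --enable-debug is far more readable, since function names and line numbers are then available to gdb.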