Обсуждение: Trouble with plpgsql on 7.4.6
Has anyone else had any problem installing plpgsql into a 7.4.6 database? When I try (on NetBSD 2.0RC5) I get the following message: createlang: language installation failed: server closed the connection unexpectedly This does not happen on 7.4.3, the other system that I have installed. It doesn't happen if I use 7.4.6 to access a database built by 7.4.3. It happens on 7.4.6 if I initdb a new database and simply run the createlang command against it. Here is a test script which I ran in a scratch directory. initdb . pg_ctl -D . start sleep 3 createlang plpgsql template1 The closest thing I could find in the archives is the message about 7.4.5 in http://archives.postgresql.org/pgsql-hackers/2004-09/msg00915.php. I don't think there is a connection but I could be wrong. I turned on a bunch of logging and this is what gets spit out into the log: Nov 18 11:49:18 panther cert198[21983]: [47-1] LOG: 00000: server process (PID 7864) was terminated by signal 6 Nov 18 11:49:18 panther cert198[21983]: [47-2] LOCATION: LogChildExit, postmaster.c:2087 Nov 18 11:49:18 panther cert198[21983]: [48-1] LOG: 00000: terminating any other active server processes Nov 18 11:49:18 panther cert198[21983]: [48-2] LOCATION: CleanupProc, postmaster.c:2008 Nov 18 11:49:18 panther cert198[21983]: [49-1] LOG: 00000: all server processes terminated; reinitializing Nov 18 11:49:18 panther cert198[21983]: [49-2] LOCATION: reaper, postmaster.c:1920 Nov 18 11:49:18 panther cert198[1332]: [52-1] LOG: 00000: database system was interrupted at 2004-11-18 11:43:53 EST Nov 18 11:49:18 panther cert198[1332]: [52-2] LOCATION: StartupXLOG, xlog.c:2610 Nov 18 11:49:18 panther cert198[1332]: [53-1] LOG: 00000: checkpoint record is at 0/9F5B40 Nov 18 11:49:18 panther cert198[1332]: [53-2] LOCATION: StartupXLOG, xlog.c:2628 Nov 18 11:49:18 panther cert198[1332]: [54-1] LOG: 00000: redo record is at 0/9F5B40; undo record is at 0/0; shutdown TRUE Nov 18 11:49:18 panther cert198[1332]: [54-2] LOCATION: StartupXLOG, xlog.c:2653 Nov 18 11:49:18 panther cert198[1332]: [55-1] LOG: 00000: next transaction ID: 574; next OID: 74486 Nov 18 11:49:18 panther cert198[1332]: [55-2] LOCATION: StartupXLOG, xlog.c:2656 Nov 18 11:49:18 panther cert198[1332]: [56-1] LOG: 00000: database system was not properly shut down; automatic recovery in progress Nov 18 11:49:18 panther cert198[1332]: [56-2] LOCATION: StartupXLOG, xlog.c:2705 Nov 18 11:49:19 panther cert198[1332]: [57-1] LOG: 00000: redo starts at 0/9F5B80 Nov 18 11:49:19 panther cert198[1332]: [57-2] LOCATION: StartupXLOG, xlog.c:2733 Nov 18 11:49:19 panther cert198[1332]: [58-1] LOG: 00000: invalid magic number 0000 in log file 0, segment 0, offset 10477568 Nov 18 11:49:19 panther cert198[1332]: [58-2] LOCATION: ValidXLOGHeader, xlog.c:2048 Nov 18 11:49:19 panther cert198[1332]: [59-1] LOG: 00000: redo done at 0/9FBCAC Nov 18 11:49:19 panther cert198[1332]: [59-2] LOCATION: StartupXLOG, xlog.c:2765 Nov 18 11:49:21 panther cert198[1332]: [60-1] LOG: 00000: database system is ready Nov 18 11:49:21 panther cert198[1332]: [60-2] LOCATION: StartupXLOG, xlog.c:2946 -- D'Arcy J.M. Cain <darcy@druid.net> | Democracy is three wolves http://www.druid.net/darcy/ | and a sheep voting on +1 416 425 1212 (DoD#0082) (eNTP) | what's for dinner.
"D'Arcy J.M. Cain" <darcy@druid.net> writes:
> Has anyone else had any problem installing plpgsql into a 7.4.6
> database? When I try (on NetBSD 2.0RC5) I get the following message:
> createlang: language installation failed: server closed the connection
> unexpectedly
Probably indicates a failure to load the plpgsql shared library.
> I turned on a bunch of logging and this is what gets spit out into the
> log:
> Nov 18 11:49:18 panther cert198[21983]: [47-1] LOG: 00000: server
> process (PID 7864) was terminated by signal 6
There wasn't anything interesting just before that?
Signal 6 is SIGABRT (at least on my machine) so something decided to
abort() in that backend. Postgres would abort() on assertion failure,
but it would have logged a message first. My guess is that the shared
library loader did the abort(), and probably it griped on stderr rather
than to syslog. Fix your postmaster startup arrangement so that stderr
goes someplace useful instead of to /dev/null, so you can see what it's
complaining about.
regards, tom lane
On Thu, 18 Nov 2004 14:25:41 -0500
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "D'Arcy J.M. Cain" <darcy@druid.net> writes:
> > Has anyone else had any problem installing plpgsql into a 7.4.6
> > database? When I try (on NetBSD 2.0RC5) I get the following
> > message:
>
> > createlang: language installation failed: server closed the
> > connection unexpectedly
>
> Probably indicates a failure to load the plpgsql shared library.
I considered that but the file is where I expect it to be. I did this:
[pgsql@panther:/data/cert198] $ pg_config --libdir
/usr/pkg/lib
The file is actually in /usr/pkg/lib/postgresql/plpgsql.so. I wasn't
sure if the postgresql was added so I made a symlink to assure that it
was available in both. locations.
> > I turned on a bunch of logging and this is what gets spit out into
> > the log:
>
> > Nov 18 11:49:18 panther cert198[21983]: [47-1] LOG: 00000: server
> > process (PID 7864) was terminated by signal 6
>
> There wasn't anything interesting just before that?
Not that I can see. I have attached the entire output of the capture
from the start. Here are the diffs from the postgresql.conf created by
initdb.
--- postgresql.conf.orig 2004-11-18 15:33:03.000000000 -0500
+++ postgresql.conf 2004-11-18 15:33:17.000000000 -0500
@@ -136,24 +136,24 @@
# - Syslog -
-#syslog = 0 # range 0-2; 0=stdout; 1=both; 2=syslog
-#syslog_facility = 'LOCAL0'
-#syslog_ident = 'postgres'
+syslog = 1 # range 0-2; 0=stdout; 1=both; 2=syslog
+syslog_facility = 'LOCAL4'
+syslog_ident = 'postgres'
# - When to Log -
-#client_min_messages = notice # Values, in order of decreasing detail:
+client_min_messages = debug5 # Values, in order of decreasing detail:
# debug5, debug4, debug3, debug2,
debug1,
# log, info, notice, warning, error
-#log_min_messages = notice # Values, in order of decreasing detail:
+log_min_messages = debug5 # Values, in order of decreasing detail:
# debug5, debug4, debug3, debug2,
debug1,
# info, notice, warning, error, log,
fatal,
# panic
-#log_error_verbosity = default # terse, default, or verbose messages
+log_error_verbosity = verbose # terse, default, or verbose messages
-#log_min_error_statement = panic # Values in order of increasing
severity:
+log_min_error_statement = debug5 # Values in order of increasing
severity:
# debug5, debug4, debug3, debug2,
debug1,
# info, notice, warning, error,
panic(off)
@@ -162,7 +162,7 @@
# milliseconds. Zero prints all
queries.
# Minus-one disables.
-#silent_mode = false # DO NOT USE without Syslog!
+silent_mode = false # DO NOT USE without Syslog!
# - What to Log -
@@ -170,11 +170,11 @@
#debug_print_rewritten = false
#debug_print_plan = false
#debug_pretty_print = false
-#log_connections = false
-#log_duration = false
-#log_pid = false
-#log_statement = false
-#log_timestamp = false
+log_connections = true
+log_duration = true
+log_pid = true
+log_statement = true
+log_timestamp = true
#log_hostname = false
#log_source_port = false
> Signal 6 is SIGABRT (at least on my machine) so something decided to
Same on mine.
Thanks for the help.
--
D'Arcy J.M. Cain <darcy@druid.net> | Democracy is three wolves
http://www.druid.net/darcy/ | and a sheep voting on
+1 416 425 1212 (DoD#0082) (eNTP) | what's for dinner.
Вложения
"D'Arcy J.M. Cain" <darcy@druid.net> writes:
> I thought that this would have sent everything to both the log and the
> screen but I found that the syslog has much more detail. I have
> attached that output.
We still need to look at the stderr output. All this says is that
you're getting repeated abort()s. It's unlikely that anything would
be coded to abort() without emitting any gripe at all --- but the gripe
is not appearing in syslog output, so stderr is the next place to look.
Also see about getting a stack trace from one of the core dumps.
regards, tom lane
On Sat, 20 Nov 2004 11:17:48 -0500
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "D'Arcy J.M. Cain" <darcy@druid.net> writes:
> > I thought that this would have sent everything to both the log and
> > the screen but I found that the syslog has much more detail. I have
> > attached that output.
>
> We still need to look at the stderr output. All this says is that
> you're getting repeated abort()s. It's unlikely that anything would
> be coded to abort() without emitting any gripe at all --- but the
> gripe is not appearing in syslog output, so stderr is the next place
> to look.
The stderr was in the previous message. No gripes there either other
than in the startup after the failure.
> Also see about getting a stack trace from one of the core dumps.
I did look at the core file and here is what I saw:
#0 0x483cafeb in kill () from /usr/lib/libc.so.12
#1 0x483cd0af in __libc_mutex_catchall_stub (m=1212478892) at /usr/src/lib/libc/thread-stub/thread-stub.c:112
#2 0x4843f0f7 in free (ptr=<incomplete type>) at /usr/src/lib/libc/stdlib/malloc.c:1149
#3 0x081b3efc in AllocSetDelete (context=<error type>) at aset.c:464
#4 0x081b468a in MemoryContextDelete (context=<error type>) at
#mcxt.c:192
In fact it is calling the thread stubs but the applicatino is threaded.
At least, that's what I see from this code in NetBSD.
#define CHECK_NOT_THREADED_ALWAYS() \
do { \ if (__isthreaded) \ DIE(); \
} while (/*CONSTCOND*/0)
#if 1
#define CHECK_NOT_THREADED() CHECK_NOT_THREADED_ALWAYS()
#else
#define CHECK_NOT_THREADED() /* nothing */
#endif
I am going to follow up with the NetBSD team but I pass this on in case
it suggests anything. Note that this still only causes problems on
7.4.6 and not under 7.4.3.
--
D'Arcy J.M. Cain <darcy@druid.net> | Democracy is three wolves
http://www.druid.net/darcy/ | and a sheep voting on
+1 416 425 1212 (DoD#0082) (eNTP) | what's for dinner.
On Tue, Nov 23, 2004 at 07:25:17AM -0500, D'Arcy J.M. Cain wrote:
> The stderr was in the previous message. No gripes there either other
> than in the startup after the failure.
>
> > Also see about getting a stack trace from one of the core dumps.
>
> I did look at the core file and here is what I saw:
>
> #0 0x483cafeb in kill () from /usr/lib/libc.so.12
> #1 0x483cd0af in __libc_mutex_catchall_stub (m=1212478892)
> at /usr/src/lib/libc/thread-stub/thread-stub.c:112
> #2 0x4843f0f7 in free (ptr=<incomplete type>)
> at /usr/src/lib/libc/stdlib/malloc.c:1149
> #3 0x081b3efc in AllocSetDelete (context=<error type>) at aset.c:464
> #4 0x081b468a in MemoryContextDelete (context=<error type>) at
> #mcxt.c:192
Would setting the following environment variable get you an earlier
abort / more logging? (pthread(3))
PTHREAD_DIAGASSERT Possible values are any combinations of: A
Ignoreerrors. a Abort on errors, creating a core
dump for further debugging. E Do not log errors to stdout.
e Log errors to stdout. L Do not log
errorsvia syslogd(8). l Log
errorsvia syslogd(8).
Cheers,
Patrick
On Tue, 23 Nov 2004 07:25:17 -0500
"D'Arcy J.M. Cain" <darcy@druid.net> wrote:
>
> #0 0x483cafeb in kill () from /usr/lib/libc.so.12
> #1 0x483cd0af in __libc_mutex_catchall_stub (m=1212478892)
> at /usr/src/lib/libc/thread-stub/thread-stub.c:112
> #2 0x4843f0f7 in free (ptr=<incomplete type>)
> at /usr/src/lib/libc/stdlib/malloc.c:1149
> #3 0x081b3efc in AllocSetDelete (context=<error type>) at aset.c:464
> #4 0x081b468a in MemoryContextDelete (context=<error type>) at
> #mcxt.c:192
>
> In fact it is calling the thread stubs but the applicatino is
> threaded. At least, that's what I see from this code in NetBSD.
>
> #define CHECK_NOT_THREADED_ALWAYS() \
> do { \
> if (__isthreaded) \
> DIE(); \
> } while (/*CONSTCOND*/0)
>
> #if 1
> #define CHECK_NOT_THREADED() CHECK_NOT_THREADED_ALWAYS()
> #else
> #define CHECK_NOT_THREADED() /* nothing */
> #endif
>
> I am going to follow up with the NetBSD team but I pass this on in
> case it suggests anything. Note that this still only causes problems
> on 7.4.6 and not under 7.4.3.
OK, I have found the problem. It turns out to be in the way that it is
built under the NetBSD pkgsrc system. As soon as a module is loaded
that uses pthreads it sets a variable (__isthreaded) that causes the
above error. The answer is to simply link PostgreSQL with -lpthread to
begin with. We are making some changes to the pkgsrc system to do this
correctly with this and some other packages that could potentially have
the same problem.
Thanks for the help in finding this.
--
D'Arcy J.M. Cain <darcy@druid.net> | Democracy is three wolves
http://www.druid.net/darcy/ | and a sheep voting on
+1 416 425 1212 (DoD#0082) (eNTP) | what's for dinner.