Re: hung postmaster?

From: Ed L.
Subject: Re: hung postmaster?
Msg-id: 200502181721.44404.pgsql@bluepolka.net
In reply to: Re: hung postmaster?  ("Ed L." <pgsql@bluepolka.net>)
Responses: Re: hung postmaster?  ("Ed L." <pgsql@bluepolka.net>)
List: pgsql-general

OK, it appears I can reproduce this bug in fairly short
order.  Below are gdb backtraces along with current
snapshots from ps, netstat, and a snippet of the server
log.  This is no longer an urgent issue for me with the
gcc 3.4.2 workaround available, but I do have a stalled
test cluster postmaster right now, so I can leave it up
for a while if anyone cares for more information.

The identical source built in the identical fashion and
running on the same hardware, but using gcc 3.4.2
instead of gcc 3.3.2, continues to work fine and does
not exhibit this problem so far.  Again, this is 64-bit
PostgreSQL 7.4.6 on HP-UX B.11.23 on an ia64 box.

Details of current hang...

PIDs 29080 and 26752 in the listing below are hung,
apparently because the postmaster is hung (PID 28775).
PID 26752 is a remote psql client that wanted to just
connect, select version(), and disconnect.  I had
that going in a loop, and this PID was the first to
hang; PID 29080 is a local psql client.
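
The connect loop described above could be scripted roughly as follows. This is a minimal sketch, not the script actually used here: the host name "dbhost", the 30-second timeout, and the 60-second interval are assumptions, and probe / probe_until_hang are hypothetical helper names. A client that does not finish within the timeout is reported as hung and deliberately left running so that gdb can still attach to it.

```python
import subprocess
import time


def probe(cmd, timeout_s):
    """Run cmd once and classify it as 'ok', 'failed', or 'hung'.

    A hung child is NOT killed, so it can still be inspected with gdb.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    try:
        rc = proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return "hung", proc
    return ("ok" if rc == 0 else "failed"), proc


def probe_until_hang(cmd, timeout_s=30.0, interval_s=60.0, max_attempts=None):
    """Repeat probe() until one attempt hangs.

    Returns (attempt_number, proc) for the first hung attempt, or None if
    max_attempts is reached without a hang.
    """
    attempt = 0
    while max_attempts is None or attempt < max_attempts:
        attempt += 1
        status, proc = probe(cmd, timeout_s)
        if status == "hung":
            return attempt, proc
        time.sleep(interval_s)
    return None


# Hypothetical invocation; "dbhost" is a placeholder, not a host from this report:
# hit = probe_until_hang(["psql", "-h", "dbhost", "-c", "select version()"])
# if hit:
#     print("attempt %d hung, client PID %d" % (hit[0], hit[1].pid))
```

Returning the process object instead of killing it is the point of the design: the PID of the stalled client stays valid for a later "attach" in gdb.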

$ps -u pg -lf
   F S      UID   PID  PPID  C PRI NI             ADDR   SZ            WCHAN    STIME TTY       TIME COMD
1401 S       pg 28777 28775  0 154 20 e000000170fcf4c0 1162 e000000164f7e0c0 13:41:07 pts/3     0:00 postgres: stats buffer process
1401 S       pg 28775     1  0 154 20 e00000016a5b5940 1118 e0000001744340e8 13:41:07 pts/3     0:00 /opt/pgsql/installs/postgresql-7.4.6-gcc3.3.2-B.11.23/bin/postm
1401 S       pg 26752 28629  0 154 20 e000000191e03280  101 e000000164f7e100 18:09:01 pts/3     0:00 psql -l
 401 R       pg  7130 26918  1 178 20 e00000016adac4c0   68                - 18:29:42 pts/11    0:00 ps -u pg -lf
1401 S       pg 28778 28777  0 154 20 e00000016c591700 1130 e000000164f7e100 13:41:07 pts/3     0:00 postgres: stats collector process
 401 S       pg 28629  4112  0 158 20 e00000016aecc940  333 e000000170f9c000 13:40:53 pts/3     0:00 -sh
 401 S       pg 26918 26887  0 158 20 e00000016c2a0280  351 e000000171506000 18:09:24 pts/11    0:00 -sh
1421 T       pg 29080 26918  0 154 20 e00000016a90f940  101                - 18:13:25 pts/11    0:00 psql -c select version()

$which postmaster
/opt/pgsql/installs/postgresql-7.4.6-gcc3.3.2-B.11.23/bin/postmaster

$gdb `which postmaster`
HP gdb 5.0 for HP Itanium (32 or 64 bit) and target HP-UX 11.2x.
Copyright 1986 - 2001 Free Software Foundation, Inc.
Hewlett-Packard Wildebeest 5.0 (based on GDB) is covered by the
GNU General Public License. Type "show copying" to see the conditions to
change it and/or distribute copies. Type "show warranty" for warranty/support.
..
(gdb) attach 28775
Attaching to program: /opt/pgsql/installs/postgresql-7.4.6-gcc3.3.2-B.11.23/bin/postmaster, process 28775
Reading symbols from /usr/lib/hpux64/libxnet.so.1...done.
Reading symbols from /usr/lib/hpux64/libc.so.1...done.
Reading symbols from /usr/lib/hpux64/libgen.so.1...done.
Reading symbols from /usr/lib/hpux64/libdl.so.1...done.
Reading symbols from /usr/lib/hpux64/libnsl.so.1...done.
Reading symbols from /usr/lib/hpux64/libm.so.1...done.
Reading symbols from /usr/lib/hpux64/libxti.so.1...done.
Reading symbols from /usr/lib/hpux64/libnss_files.so.1...done.
0xc000000000304230:0 in _accept_sys+0x30 () from /usr/lib/hpux64/libc.so.1
(gdb) bt
#0  0xc000000000304230:0 in _accept_sys+0x30 () from /usr/lib/hpux64/libc.so.1
#1  0xc0000000003100b0:0 in accept+0x150 () from /usr/lib/hpux64/libc.so.1
#2  0xc000000001aac450:0 in accept+0x70 () from /usr/lib/hpux64/libxnet.so.1
#3  0x4000000000275df0:0 in StreamConnection+0x40 ()
#4  0x40000000002e7da0:0 in ConnCreate+0x80 ()
#5  0x40000000002e6530:0 in ServerLoop+0x3b0 ()
#6  0x40000000002e5740:0 in PostmasterMain+0x1300 ()
#7  0x4000000000279800:0 in main+0x520 ()
(gdb) p debug_query_string
$1 = 0
(gdb) quit
The program is running.  Quit anyway (and detach it)? (y or n)
Detaching from program: /opt/pgsql/installs/postgresql-7.4.6-gcc3.3.2-B.11.23/bin/postmaster, process 28775

$which psql
/opt/pgsql/installs/postgresql-7.4.6-gcc3.3.2-B.11.23/bin/psql

$gdb `which psql`
HP gdb 5.0 for HP Itanium (32 or 64 bit) and target HP-UX 11.2x.
Copyright 1986 - 2001 Free Software Foundation, Inc.
Hewlett-Packard Wildebeest 5.0 (based on GDB) is covered by the
GNU General Public License. Type "show copying" to see the conditions to
change it and/or distribute copies. Type "show warranty" for warranty/support.
..
(gdb) attach 26752
Attaching to program: /opt/pgsql/installs/postgresql-7.4.6-gcc3.3.2-B.11.23/bin/psql, process 26752
Reading symbols from /opt/pgsql/installs/postgresql-7.4.6-gcc3.3.2-B.11.23/lib/libpq.so.3...done.
Reading symbols from /usr/lib/hpux64/libxnet.so.1...done.
Reading symbols from /usr/lib/hpux64/libc.so.1...done.
Reading symbols from /usr/lib/hpux64/libgen.so.1...done.
Reading symbols from /usr/lib/hpux64/libdl.so.1...done.
Reading symbols from /usr/lib/hpux64/libnsl.so.1...done.
Reading symbols from /usr/lib/hpux64/libm.so.1...done.
Reading symbols from /usr/lib/hpux64/libxti.so.1...done.
0xc000000000301e70:0 in _poll_sys+0x30 () from /usr/lib/hpux64/libc.so.1
(gdb) bt
#0  0xc000000000301e70:0 in _poll_sys+0x30 () from /usr/lib/hpux64/libc.so.1
#1  0xc000000000313110:0 in poll+0x150 () from /usr/lib/hpux64/libc.so.1
#2  0xc00000002d36ee70:0 in pqSocketPoll+0x110 ()
   from /opt/pgsql/installs/postgresql-7.4.6-gcc3.3.2-B.11.23/lib/libpq.so.3
#3  0xc00000002d36ec40:0 in pqSocketCheck+0x80 ()
   from /opt/pgsql/installs/postgresql-7.4.6-gcc3.3.2-B.11.23/lib/libpq.so.3
#4  0xc00000002d36eac0:0 in pqWaitTimed+0x40 ()
   from /opt/pgsql/installs/postgresql-7.4.6-gcc3.3.2-B.11.23/lib/libpq.so.3
#5  0xc00000002d362f60:0 in connectDBComplete+0xe0 ()
   from /opt/pgsql/installs/postgresql-7.4.6-gcc3.3.2-B.11.23/lib/libpq.so.3
#6  0xc00000002d362500:0 in PQsetdbLogin+0x410 ()
   from /opt/pgsql/installs/postgresql-7.4.6-gcc3.3.2-B.11.23/lib/libpq.so.3
#7  0x40000000000218f0:0 in main+0x510 ()
(gdb) p debug_query_string
(gdb) quit
The program is running.  Quit anyway (and detach it)? (y or n)
Detaching from program: /opt/pgsql/installs/postgresql-7.4.6-gcc3.3.2-B.11.23/bin/psql, process 26752

$gdb `which psql`
HP gdb 5.0 for HP Itanium (32 or 64 bit) and target HP-UX 11.2x.
Copyright 1986 - 2001 Free Software Foundation, Inc.
Hewlett-Packard Wildebeest 5.0 (based on GDB) is covered by the
GNU General Public License. Type "show copying" to see the conditions to
change it and/or distribute copies. Type "show warranty" for warranty/support.
..
(gdb) attach 29080
Attaching to program: /opt/pgsql/installs/postgresql-7.4.6-gcc3.3.2-B.11.23/bin/psql, process 29080
Reading symbols from /opt/pgsql/installs/postgresql-7.4.6-gcc3.3.2-B.11.23/lib/libpq.so.3...done.
Reading symbols from /usr/lib/hpux64/libxnet.so.1...done.
Reading symbols from /usr/lib/hpux64/libc.so.1...done.
Reading symbols from /usr/lib/hpux64/libgen.so.1...done.
Reading symbols from /usr/lib/hpux64/libdl.so.1...done.
Reading symbols from /usr/lib/hpux64/libnsl.so.1...done.
Reading symbols from /usr/lib/hpux64/libm.so.1...done.
Reading symbols from /usr/lib/hpux64/libxti.so.1...done.
0xc000000000301e70:0 in _poll_sys+0x30 () from /usr/lib/hpux64/libc.so.1
(gdb) bt
#0  0xc000000000301e70:0 in _poll_sys+0x30 () from /usr/lib/hpux64/libc.so.1
#1  0xc000000000313110:0 in poll+0x150 () from /usr/lib/hpux64/libc.so.1
#2  0xc00000002d36ee70:0 in pqSocketPoll+0x110 ()
   from /opt/pgsql/installs/postgresql-7.4.6-gcc3.3.2-B.11.23/lib/libpq.so.3
#3  0xc00000002d36ec40:0 in pqSocketCheck+0x80 ()
   from /opt/pgsql/installs/postgresql-7.4.6-gcc3.3.2-B.11.23/lib/libpq.so.3
#4  0xc00000002d36eac0:0 in pqWaitTimed+0x40 ()
   from /opt/pgsql/installs/postgresql-7.4.6-gcc3.3.2-B.11.23/lib/libpq.so.3
#5  0xc00000002d362f60:0 in connectDBComplete+0xe0 ()
   from /opt/pgsql/installs/postgresql-7.4.6-gcc3.3.2-B.11.23/lib/libpq.so.3
#6  0xc00000002d362500:0 in PQsetdbLogin+0x410 ()
   from /opt/pgsql/installs/postgresql-7.4.6-gcc3.3.2-B.11.23/lib/libpq.so.3
#7  0x40000000000218f0:0 in main+0x510 ()
(gdb) quit
The program is running.  Quit anyway (and detach it)? (y or n)
Detaching from program: /opt/pgsql/installs/postgresql-7.4.6-gcc3.3.2-B.11.23/bin/psql, process 29080

$uname -a
HP-UX ... B.11.23 ... ia64 ...

$file `which postmaster`
/opt/pgsql/installs/postgresql-7.4.6-gcc3.3.2-B.11.23/bin/postmaster:   ELF-64 executable object file - IA64

$file `which psql`
/opt/pgsql/installs/postgresql-7.4.6-gcc3.3.2-B.11.23/bin/psql: ELF-64 executable object file - IA64

$psql -V
psql (PostgreSQL) 7.4.6

$postmaster -V
postmaster (PostgreSQL) 7.4.6

This is the tail end of the server log, showing nothing
was ever logged for the hung connections...

2005-02-18 14:25:45.558 [20313] LOG:  connection received: host=10.0.1.80 port=45976
2005-02-18 14:25:45.921 [20313] LOG:  connection authorized: user=pg database=pg
2005-02-18 14:25:46.394 [20313] LOG:  statement: begin; select getdatabaseencoding(); commit
2005-02-18 14:25:46.395 [20313] LOG:  duration: 0.862 ms
2005-02-18 14:25:46.807 [20313] LOG:  statement: select version()
2005-02-18 14:25:46.808 [20313] LOG:  duration: 0.646 ms
2005-02-18 14:26:47.092 [20818] LOG:  connection received: host=10.0.1.80 port=45993
2005-02-18 14:26:47.278 [20818] LOG:  connection authorized: user=pg database=pg
2005-02-18 14:26:47.461 [20818] LOG:  statement: begin; select getdatabaseencoding(); commit
2005-02-18 14:26:47.462 [20818] LOG:  duration: 0.792 ms
2005-02-18 14:26:47.696 [20818] LOG:  statement: select version()
2005-02-18 14:26:47.696 [20818] LOG:  duration: 0.557 ms
2005-02-18 14:27:47.993 [21220] LOG:  connection received: host=10.0.1.80 port=46015
2005-02-18 14:27:48.192 [21220] LOG:  connection authorized: user=pg database=pg
2005-02-18 14:27:48.384 [21220] LOG:  statement: begin; select getdatabaseencoding(); commit
2005-02-18 14:27:48.385 [21220] LOG:  duration: 0.961 ms
2005-02-18 14:27:48.560 [21220] LOG:  statement: select version()
2005-02-18 14:27:48.560 [21220] LOG:  duration: 0.545 ms
2005-02-18 14:28:48.826 [21702] LOG:  connection received: host=10.0.1.80 port=46035
2005-02-18 14:28:49.087 [21702] LOG:  connection authorized: user=pg database=pg
2005-02-18 14:28:49.318 [21702] LOG:  statement: begin; select getdatabaseencoding(); commit
2005-02-18 14:28:49.319 [21702] LOG:  duration: 0.809 ms
2005-02-18 14:28:49.516 [21702] LOG:  statement: select version()
2005-02-18 14:28:49.516 [21702] LOG:  duration: 0.360 ms
2005-02-18 14:29:49.717 [22047] LOG:  connection received: host=10.0.1.80 port=46060
2005-02-18 14:29:49.910 [22047] LOG:  connection authorized: user=pg database=pg
2005-02-18 14:29:50.138 [22047] LOG:  statement: begin; select getdatabaseencoding(); commit
2005-02-18 14:29:50.139 [22047] LOG:  duration: 0.831 ms
2005-02-18 14:29:50.320 [22047] LOG:  statement: select version()
2005-02-18 14:29:50.321 [22047] LOG:  duration: 0.539 ms
2005-02-18 14:30:50.527 [22359] LOG:  connection received: host=10.0.1.80 port=46087
2005-02-18 14:30:50.710 [22359] LOG:  connection authorized: user=pg database=pg
2005-02-18 14:30:50.927 [22359] LOG:  statement: begin; select getdatabaseencoding(); commit
2005-02-18 14:30:50.928 [22359] LOG:  duration: 0.855 ms
2005-02-18 14:30:51.105 [22359] LOG:  statement: select version()
2005-02-18 14:30:51.106 [22359] LOG:  duration: 0.641 ms
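
One way to double-check that a hung client never reached the log is to collect the client ports from the "connection received" lines and compare them with the port the stalled client holds according to netstat. The sketch below assumes the log line layout shown above (timestamp, bracketed backend PID, "LOG:"); the function names are illustrative and are not part of PostgreSQL.

```python
import re
from datetime import datetime

# Matches lines such as:
# 2005-02-18 14:30:50.527 [22359] LOG:  connection received: host=10.0.1.80 port=46087
CONN_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"\[(?P<pid>\d+)\] LOG:\s+connection received: "
    r"host=(?P<host>\S+) port=(?P<port>\d+)"
)


def connections(lines):
    """Return (timestamp, backend_pid, host, port) for each accepted connection."""
    events = []
    for line in lines:
        m = CONN_RE.match(line)
        if m:
            ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S.%f")
            events.append((ts, int(m.group("pid")),
                           m.group("host"), int(m.group("port"))))
    return events


def last_connection(lines):
    """Return the most recent logged connection, or None if there is none."""
    events = connections(lines)
    return max(events) if events else None


def port_was_logged(lines, port):
    """True if any 'connection received' line carries this client port."""
    return any(event[3] == port for event in connections(lines))
```

Given the client port of a stalled psql (taken from netstat), port_was_logged() returning False would be consistent with the postmaster never having accepted that connection.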

