Thread: Vacuum analyze problem

Vacuum analyze problem

From
John Hatfield
Date:
Hi,

My problem is when running the vacuum with analyze an error occurs but it runs ok without the analyse.

From psql
gsmain_test=# vacuum;
VACUUM
gsmain_test=# vacuum analyze;
pqReadData() -- backend closed the channel unexpectedly.
        This probably means the backend terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!#

From the shell
bash-2.04$ vacuumdb -d gsmain_test
VACUUM
bash-2.04$ vacuumdb -d gsmain_test -z
pqReadData() -- backend closed the channel unexpectedly.
        This probably means the backend terminated abnormally
        before or while processing the request.
connection to server was lost
vacuumdb: vacuum failed
bash-2.04$

Linux Box
Postgresql 7.0.3-2 installed from rpms
Redhat 7.0

Any help is greatly appreciated.

Regards,

John


Re: Vacuum analyze problem

From
Tom Lane
Date:
John Hatfield <jhatfield@g-s.com.au> writes:
> My problem is when running the vacuum with analyze an error occurs but
> it runs ok without the analyse.

Try "vacuum verbose analyze" so you can see which table it's failing on
(or, just look in the postmaster log to see which table is mentioned
last).  There's probably a core file left from the crashed backend;
can you get a stack backtrace from it with gdb?

            regards, tom lane
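
A minimal sketch of the first suggestion, assuming the output of "vacuum verbose analyze" has been captured from a pipe or file: the last "--Relation ...--" notice printed before the connection drops names the relation that was being processed when the backend crashed. The helper name last_relation is made up for illustration and is not part of PostgreSQL.

```shell
#!/bin/sh
# last_relation: hypothetical helper (not part of PostgreSQL).
# Reads VACUUM VERBOSE output on stdin and prints the name taken from
# the last "NOTICE:  --Relation <name>--" line, i.e. the relation that
# was being processed when the output stopped.
last_relation() {
    sed -n 's/^NOTICE:  *--Relation \(.*\)--$/\1/p' | tail -n 1
}
```

One way to use it, assuming the notices arrive on stderr and the database name from the report above:

    psql -d gsmain_test -c 'vacuum verbose analyze' 2>&1 | last_relation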

RE: Vacuum analyze problem

From
John Hatfield
Date:
The last bit of vacuum verbose analyse

NOTICE:  --Relation pg_ipl--
NOTICE:  Pages 0: Changed 0, reaped 0, Empty 0, New 0; Tup 0: Vac 0, Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 0, MaxLen 0; Re-using: Free/Avail. Space 0/0; EndEmpty/Avail. Pages 0/0. CPU 0.00s/0.00u sec.
NOTICE:  --Relation pg_inheritproc--
NOTICE:  Pages 0: Changed 0, reaped 0, Empty 0, New 0; Tup 0: Vac 0, Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 0, MaxLen 0; Re-using: Free/Avail. Space 0/0; EndEmpty/Avail. Pages 0/0. CPU 0.00s/0.00u sec.
NOTICE:  --Relation pg_rewrite--
pqReadData() -- backend closed the channel unexpectedly.
        This probably means the backend terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!#

Yes, there is a core file in the directory with the database files:

bash-2.04$ ls -l core
-rw-------    1 postgres postgres  2600960 Mar  2 10:02 core
bash-2.04$ date
Fri Mar  2 10:02:32 EST 2001
bash-2.04$ gdb core
GNU gdb 5.0
Copyright 2000 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux"...
"/var/lib/pgsql/data/base/gsmain_test/core": not in executable format: File format not recognized

(gdb) where
No stack.
(gdb)

gdb does not seem to like the core file format, though.

Regards,

John

-----Original Message-----
From:    Tom Lane
Sent:    Friday, 2 March 2001 11:50 AM
To:    jhatfield@g-s.com.au
Cc:    'PostgreSQL Admin'
Subject:    Re: [ADMIN] Vacuum analyze problem


Re: Vacuum analyze problem

From
Tom Lane
Date:
John Hatfield <jhatfield@g-s.com.au> writes:
> The last bit of vacuum verbose analyse
> NOTICE:  --Relation pg_rewrite--
> pqReadData() -- backend closed the channel unexpectedly.

OK, so pg_rewrite seems to be broken.  Not good...

> This GDB was configured as "i386-redhat-linux"...
> "/var/lib/pgsql/data/base/gsmain_test/core": not in executable format: File format not recognized

You need to do "gdb /path/to/postgres/executable core".

Or try it like this instead:

    * fire up psql in one window
    * determine PID of backend connected to psql
    * attach to live backend process with gdb:
        gdb /path/to/postgres/executable
        attach PID
        cont
    * issue vacuum command to psql

gdb should catch the crash and then you can issue "bt".

            regards, tom lane
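
The two approaches above can be sketched as a small shell script. This is only an illustration under stated assumptions: it relies on gdb's -batch and -x options and on the "gdb program pid" form of attaching, the function names are invented here, and the backend PID has to be found separately (for example with ps). Newer PostgreSQL releases also offer "SELECT pg_backend_pid();", but the thread does not say whether 7.0.3 does.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative, not part of gdb or
# PostgreSQL.

# Write a gdb command file that prints a backtrace (post-mortem case).
write_core_cmds() {
    printf 'bt\n' > "$1"
}

# Write a gdb command file for a live backend: resume it, wait until the
# crash stops it, then print a backtrace.
write_attach_cmds() {
    printf 'continue\nbt\n' > "$1"
}

# run_gdb_batch CMDFILE EXECUTABLE TARGET
# TARGET is either a core file or the PID of a running backend.
run_gdb_batch() {
    gdb -batch -x "$1" "$2" "$3"
}

# backtrace_core EXECUTABLE COREFILE
backtrace_core() {
    cmds=$(mktemp) || return 1
    write_core_cmds "$cmds"
    run_gdb_batch "$cmds" "$1" "$2"
    status=$?
    rm -f "$cmds"
    return "$status"
}

# backtrace_live EXECUTABLE PID
# Start this first, then issue the vacuum command in the psql session.
backtrace_live() {
    cmds=$(mktemp) || return 1
    write_attach_cmds "$cmds"
    run_gdb_batch "$cmds" "$1" "$2"
    status=$?
    rm -f "$cmds"
    return "$status"
}
```

For the post-mortem case this would be called as, say, "backtrace_core /path/to/postgres/executable core". Keeping the gdb commands in a file rather than on the command line is a choice made here so the sketch does not depend on newer gdb options.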

RE: Vacuum analyze problem

From
John Hatfield
Date:
gdb output:

bash-2.04$ gdb /usr/bin/postgres core
GNU gdb 5.0
Copyright 2000 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux"...
(no debugging symbols found)...

warning: core file may not match specified executable file.
Core was generated by `/usr/bin/postgres localhos'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /lib/libcrypt.so.1...done.
Loaded symbols for /lib/libcrypt.so.1
Reading symbols from /lib/libnsl.so.1...done.
Loaded symbols for /lib/libnsl.so.1
Reading symbols from /lib/libdl.so.2...done.
Loaded symbols for /lib/libdl.so.2
Reading symbols from /lib/libm.so.6...done.
Loaded symbols for /lib/libm.so.6
Reading symbols from /lib/libutil.so.1...done.
Loaded symbols for /lib/libutil.so.1
Reading symbols from /usr/lib/libreadline.so.4.1...done.
Loaded symbols for /usr/lib/libreadline.so.4.1
Reading symbols from /lib/libtermcap.so.2...done.
Loaded symbols for /lib/libtermcap.so.2
Reading symbols from /usr/lib/libncurses.so.5...done.
Loaded symbols for /usr/lib/libncurses.so.5
Reading symbols from /lib/libc.so.6...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /lib/libnss_files.so.2...done.
Loaded symbols for /lib/libnss_files.so.2
#0  strcoll () at strcoll.c:248
248     strcoll.c: No such file or directory.
(gdb) where
#0  strcoll () at strcoll.c:248
#1  0x8115b1f in lztext_cmp ()
#2  0x8115b72 in lztext_eq ()
#3  0x8098809 in vacuum ()
#4  0x8096120 in vacuum ()
#5  0x8095705 in vacuum ()
#6  0x8094e9d in vacuum ()
#7  0x8094e21 in vacuum ()
#8  0x81000b3 in ProcessUtility ()
#9  0x80fd42c in pg_exec_query_dest ()
#10 0x80fd378 in pg_plan_query ()
#11 0x80fe465 in PostgresMain ()
#12 0x80e5b7b in PostmasterMain ()
#13 0x80e571c in PostmasterMain ()
#14 0x80e4889 in PostmasterMain ()
#15 0x80e420c in PostmasterMain ()
#16 0x80b571d in main ()
#17 0x40112b65 in __libc_start_main (main=0x80b5670 <main>, argc=5,
    ubp_av=0xbffffd34, init=0x80648b8 <_init>, fini=0x814632c <_fini>,
    rtld_fini=0x4000df24 <_dl_fini>, stack_end=0xbffffd2c)
    at ../sysdeps/generic/libc-start.c:111

Regards,

John

-----Original Message-----
From:    Tom Lane
Sent:    Friday, 2 March 2001 12:34 PM
To:    jhatfield@g-s.com.au
Cc:    'PostgreSQL Admin'
Subject:    Re: [ADMIN] Vacuum analyze problem


Re: Vacuum analyze problem

From
Tom Lane
Date:
John Hatfield <jhatfield@g-s.com.au> writes:
> (gdb) where
> #0  strcoll () at strcoll.c:248
> #1  0x8115b1f in lztext_cmp ()
> #2  0x8115b72 in lztext_eq ()
> #3  0x8098809 in vacuum ()

Hm.  I suspect this is the same problem that a couple of other people
reported recently: crashes inside strcoll(), even though the strings
being passed to it are perfectly OK.

[ digs in archives... ] Dave Cramer reported a similar crash on
2001-01-24, and Jukka Honkela reported one on 2001-01-01.  Dave was
also using 7.0.3, and Jukka current-as-of-then development sources.
Oh, this is interesting: all three of you are running RedHat 7.0!

I had originally assumed that Postgres was somehow clobbering strcoll's
internal locale information structures, but given that the problem is
only being reported on RH 7.0, a bug in strcoll itself is starting to
look like a plausible idea too.

Would one or another of you crank up ye olde debugger and try to figure
this out?  If it is Postgres' fault, I want to fix it ... but it's hard
to do much when I can't reproduce the problem ...

            regards, tom lane