Discussion: SIGSEGV in 'select * from pg_user'
Hi,
I've found the following SIGSEGV while playing around with a snapshot of
September 3rd.
I did a make all (with -g); make install; rm -rf data; initdb
Here's what I've done in gdb:
[postgres@jeroenv bin]$ gdb postgres
GDB is free software and you are welcome to distribute copies of it
under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for
details.
GDB 4.16 (i586-unknown-linux), Copyright 1996 Free Software Foundation, Inc...
(gdb) run -D /usr/local/pgsql/data template1
Starting program: /usr/local/pgsql/bin/postgres -D /usr/local/pgsql/data template1
POSTGRES backend interactive interface
$Revision: 1.89 $ $Date: 1998/09/01 04:32:13 $
> select * from pg_shadow
blank
1: usename (typeid = 19, len = 32, typmod = -1, byval = f)
2: usesysid (typeid = 23, len = 4, typmod = -1, byval = t)
3: usecreatedb (typeid = 16, len = 1, typmod = -1, byval = t)
4: usetrace (typeid = 16, len = 1, typmod = -1, byval = t)
5: usesuper (typeid = 16, len = 1, typmod = -1, byval = t)
6: usecatupd (typeid = 16, len = 1, typmod = -1, byval = t)
7: passwd (typeid = 25, len = -1, typmod = -1, byval = f)
8: valuntil (typeid = 702, len = 4, typmod = -1, byval = t)
----
1: usename = "postgres" (typeid = 19, len = 32, typmod = -1, byval = f)
2: usesysid = "203" (typeid = 23, len = 4, typmod = -1, byval = t)
3: usecreatedb = "t" (typeid = 16, len = 1, typmod = -1, byval = t)
4: usetrace = "t" (typeid = 16, len = 1, typmod = -1, byval = t)
5: usesuper = "t" (typeid = 16, len = 1, typmod = -1, byval = t)
6: usecatupd = "t" (typeid = 16, len = 1, typmod = -1, byval = t)
8: valuntil = "Sat Jan 31 07:00:00 2037 MET" (typeid = 702, len = 4, typmod = -1, byval = t)
----
[So far, no problems]
> select * from pg_user
blank
1: usename (typeid = 19, len = 32, typmod = -1, byval = f)
2: usesysid (typeid = 23, len = 4, typmod = -1, byval = t)
3: usecreatedb (typeid = 16, len = 1, typmod = -1, byval = t)
4: usetrace (typeid = 16, len = 1, typmod = -1, byval = t)
5: usesuper (typeid = 16, len = 1, typmod = -1, byval = t)
6: usecatupd (typeid = 16, len = 1, typmod = -1, byval = t)
7: passwd (typeid = 25, len = -1, typmod = -1, byval = f)
8: valuntil (typeid = 702, len = 4, typmod = -1, byval = t)
----
1: usename = "postgres" (typeid = 19, len = 32, typmod = -1, byval = f)
2: usesysid = "203" (typeid = 23, len = 4, typmod = -1, byval = t)
3: usecreatedb = "t" (typeid = 16, len = 1, typmod = -1, byval = t)
4: usetrace = "t" (typeid = 16, len = 1, typmod = -1, byval = t)
5: usesuper = "t" (typeid = 16, len = 1, typmod = -1, byval = t)
6: usecatupd = "t" (typeid = 16, len = 1, typmod = -1, byval = t)
7: passwd = "********" (typeid = 25, len = -1, typmod = -1, byval = f)
8: valuntil = "Sat Jan 31 07:00:00 2037 MET" (typeid = 702, len = 4, typmod = -1, byval = t)
----
Program received signal SIGSEGV, Segmentation fault.
0x400e90eb in __libc_free (mem=0x400f9740)
(gdb) bt
#0 0x400e90eb in __libc_free (mem=0x400f9740)
#1 0x81cf188 in ?? ()
The backtrace gives no clues, so I have no idea where this goes wrong.
Note that the select on pg_shadow works fine.
select version() returns:
PostgreSQL 6.4.0 on i586-pc-linux-gnu, compiled by gcc 2.8.1
Anybody know what's going wrong (and where)?
Thanks,
Jeroen van Vianen
> I've found the following SIGSEGV while playing around with a snapshot
> of September 3rd.
(did a fresh install with initdb)
> > select * from pg_shadow
> > select * from pg_user
> Program received signal SIGSEGV, Segmentation fault.
I see the same thing with a fresh source tree on my linux box. Is this
normal?
Also, I've been working on a (small) test case, and have at least some
indication that the problem is not solely indices. I'll send a better
documented example in a bit, but at least the following one will result
in errors on a fresh install:
CREATE TABLE onek (
unique1 int4,
unique2 int4,
two int4,
four int4,
ten int4,
twenty int4,
hundred int4,
thousand int4,
twothousand int4,
fivethous int4,
tenthous int4,
odd int4,
even int4,
stringu1 name,
stringu2 name,
string4 name
);
COPY onek FROM
'/opt/postgres/current/src/test/regress/input/../data/onek.data';
create table k1 as select unique1, unique2 from onek;
copy k1 to '/opt/postgres/current/src/test/regress/k1.data';
delete from k1;
copy k1 from '/opt/postgres/current/src/test/regress/k1.data';
CREATE INDEX k1_unique1 ON k1 USING btree(unique1 int4_ops);
CREATE INDEX k1_unique2 ON k1 USING btree(unique2 int4_ops);
ERROR: DefineIndex: k1 relation not found
If I leave out the "delete from" I don't get the errors. If I do these
steps, then do a new initdb and create k1 from the saved data file, then
I still see the error.
- Tom
I have just cvsuped the source tree and have tried some tests.
>> I've found the following SIGSEGV while playing around with a snapshot
>> of September 3rd.
>(did a fresh install with initdb)
>> > select * from pg_shadow
>> > select * from pg_user
>> Program received signal SIGSEGV, Segmentation fault.
>
>I see the same thing with a fresh source tree on my linux box. Is this
>normal?
I saw this too on my LinuxPPC box. In my case, just doing:
select * from pg_user
crashes the backend. The backtrace shows it crashed in chunk_free()
while committing the transaction. I guess something messed up the
tables managed by malloc().
As for the regression tests, two of them (constraints, select_views)
produced core dumps. Applying Bruce's latest patches seems to make no
difference.
--
Tatsuo Ishii
t-ishii@sra.co.jp
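
The guess above, that something damaged malloc()'s bookkeeping before chunk_free() tripped over it, can be illustrated with a small guard-byte allocator. This is only a sketch under stated assumptions, not PostgreSQL code: the names guard_alloc, guard_check, guard_free, GUARD_LEN and GUARD_BYTE are hypothetical. It places a few known bytes behind each block so that a write past the end can be noticed at free time instead of surfacing later as a crash inside the C library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical illustration names; none of these exist in PostgreSQL. */
#define GUARD_LEN  8
#define GUARD_BYTE 0xA5

/* Header kept in front of each block so its size is known at free time.
 * The max_align_t member keeps the user pointer suitably aligned. */
typedef union {
    size_t size;
    max_align_t align;
} guard_header;

/* Allocate `size` usable bytes followed by GUARD_LEN guard bytes. */
void *guard_alloc(size_t size)
{
    guard_header *h = malloc(sizeof(guard_header) + size + GUARD_LEN);
    if (h == NULL)
        return NULL;
    h->size = size;
    unsigned char *user = (unsigned char *)(h + 1);
    memset(user + size, GUARD_BYTE, GUARD_LEN);
    return user;
}

/* Return 1 if the guard bytes behind the block are intact, 0 otherwise. */
int guard_check(const void *ptr)
{
    assert(ptr != NULL);
    const guard_header *h = (const guard_header *)ptr - 1;
    const unsigned char *tail = (const unsigned char *)ptr + h->size;
    for (size_t i = 0; i < GUARD_LEN; i++) {
        if (tail[i] != GUARD_BYTE)
            return 0;
    }
    return 1;
}

/* Check the guard, release the block, and report whether it was intact. */
int guard_free(void *ptr)
{
    if (ptr == NULL)
        return 1;
    int intact = guard_check(ptr);
    free((guard_header *)ptr - 1);
    return intact;
}
```

A caller would allocate through guard_alloc() and treat a 0 result from guard_free() as evidence that some code wrote past the end of that block. The check only covers overruns that hit the guard bytes; it says nothing about underruns or stray writes elsewhere.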
> I saw this too on my LinuxPPC box. In my case, just doing:
> select * from pg_user
>
> crashes the backend. The backtrace shows it crashed in chunk_free()
> while committing the transaction. I guess something messed up the
> tables managed by malloc().
>
> As for the regression tests, two of them (constraints, select_views)
> produced core dumps. Applying Bruce's latest patches seems to make no
> difference.
I see the same behavior, with a simple "select * from pg_user" enough to
crash the backend, and with the same two regression tests resulting in
core dumps. As I've mentioned earlier, I believe that the select_views
test has been failing for quite a while, whereas the other problems are
more recent. Presumably the pg_user problem is similar to the
select_views problem?
Would it help to choose a (simple) test case which shows a problem
(either a core dump or the "relation not found" problem) and start
working it through together? We could then exchange notes on what we are
finding.
I'm not absolutely certain that the problems are directly related to
changes for indexing; other changes (the oid removal, the "name" type
changes, and possibly others) happened in the same time frame...
- Tom
> I see the same behavior, with a simple "select * from pg_user" enough
> to crash the backend
The segfault is coming from a call to free() after the command has
executed and while the "CommitTransactionCommand" phase is running.
Putting the query inside a begin/end block does not help. Presumably
there is a bad pointer or something getting free'd twice.
Any ideas?
- Tom
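
One way to test the "free'd twice" theory from the message above is to route allocations through a wrapper that remembers which pointers are live. The following is a minimal sketch, assuming a small fixed-size tracking table; tracked_malloc, tracked_free and MAX_TRACKED are hypothetical names, not part of PostgreSQL or the C library. Instead of handing a suspicious pointer to free(), it reports it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical illustration names; a real tool would use a hash table. */
#define MAX_TRACKED 1024

static void *live[MAX_TRACKED];   /* pointers currently allocated */
static size_t live_count = 0;

/* Allocate memory and remember the pointer as live. */
void *tracked_malloc(size_t size)
{
    assert(live_count <= MAX_TRACKED);
    if (live_count == MAX_TRACKED)
        return NULL;              /* tracking table is full */
    void *p = malloc(size);
    if (p != NULL)
        live[live_count++] = p;
    return p;
}

/* Free `ptr` only if it is live. Returns 1 when the block was freed and
 * 0 when the pointer was never handed out here or was already freed. */
int tracked_free(void *ptr)
{
    for (size_t i = 0; i < live_count; i++) {
        if (live[i] == ptr) {
            live[i] = live[--live_count];  /* fill the hole with the last entry */
            free(ptr);
            return 1;
        }
    }
    return 0;
}
```

A 0 return from tracked_free() marks the call site worth inspecting in gdb: either the pointer was released earlier, or it never came from tracked_malloc() at all, which corresponds to the "bad pointer" case.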