Обсуждение: "create implicit sequence" crashes postgres
Hey all, I've been banging my head against a brick wall on this for ages, and getting nowhere. We're trying to move off a single VM running postgres 8.1 to a newer more powerful pair of VMs with a DRBD mirror for storage and postgres 8.2. But, every attempt to dump the data from 8.1.x into 8.2.6 fails, and fails exactly the same way. This SQL: CREATE TABLE billing_transaction_items ( id serial NOT NULL, billing_transaction_id integer NOT NULL, pricing_term_id integer, product_id integer NOT NULL, is_free boolean NOT NULL, is_visible boolean NOT NULL, created_at timestamp without time zone NOT NULL, updated_at timestamp without time zone NOT NULL, cost numeric(15,4) DEFAULT 0.0 NOT NULL, tax numeric(15,4) DEFAULT 0.0 NOT NULL, quoted_cost numeric(15,4) DEFAULT 0.0 NOT NULL, quoted_tax numeric(15,4) DEFAULT 0.0 NOT NULL ); And any SQL to create a table with more than 24, or 25 (can't remember off top of my head) characters long, with a serial column, causes this crash: Feb 4 17:00:03 mcsql1 postgres[6639]: [23-1] NOTICE: CREATE TABLE will create implicit sequence "billing_transaction_items_id_seq" for serial column Feb 4 17:00:03 mcsql1 postgres[6639]: [23-2] "billing_transaction_items.id" Feb 4 17:00:03 mcsql1 postgres[6377]: [4-1] LOG: server process (PID 6639) was terminated by signal 11 Feb 4 17:00:03 mcsql1 postgres[6377]: [5-1] LOG: terminating any other active server processes Feb 4 17:00:03 mcsql1 kernel: postmaster[6639] general protection rip:5555557e7fed rsp:7ffffb246110 error:0 Feb 4 17:00:03 mcsql1 postgres[6377]: [6-1] LOG: all server processes terminated; reinitializing Feb 4 17:00:03 mcsql1 postgres[6640]: [7-1] LOG: database system was interrupted at 2008-02-04 16:59:54 GMT Feb 4 17:00:03 mcsql1 postgres[6640]: [8-1] LOG: checkpoint record is at 0/4CC150 Feb 4 17:00:03 mcsql1 postgres[6640]: [9-1] LOG: redo record is at 0/4CC150; undo record is at 0/0; shutdown TRUE Feb 4 17:00:03 mcsql1 postgres[6640]: [10-1] LOG: next transaction ID: 0/630; next OID: 24576 Feb 4 17:00:03 mcsql1 postgres[6640]: [11-1] LOG: next MultiXactId: 1; next MultiXactOffset: 0 Feb 4 17:00:03 mcsql1 postgres[6640]: [12-1] LOG: database system was not properly shut down; automatic recovery in progress Feb 4 17:00:03 mcsql1 postgres[6640]: [13-1] LOG: redo starts at 0/4CC1A0 Feb 4 17:00:03 mcsql1 postgres[6640]: [14-1] LOG: record with zero length at 0/4F22C0 Feb 4 17:00:03 mcsql1 postgres[6640]: [15-1] LOG: redo done at 0/4F2280 Feb 4 17:00:03 mcsql1 postgres[6640]: [16-1] LOG: database system is ready mcsql1 is fully 64bit. My 32bit desktop can create the table fine, another 64bit server can also create the table fine. Create a completely fresh postgres database *not* on DRBD on mcsql1, and it crashes! reiserfs, ext3, no difference. The only comonality between those that crash is that they're all 64bit Xen VMs, so far non-VM servers don't crash. But nothing anywhere I can find even vaguely hints at problems running Postgres under Xen. However, the 8.1 server was also a Xen VM, although a 32bit install booted with a 64bit kernel, enabled someone to create these tables with long names. Am I being throughly dense and missing the obvious? Am I'm doing something stupid and un-supported? Or have I found a bug, perhaps? Thanks -- Mike Williams
Mike Williams <mike.williams@comodo.com> writes: > We're trying to move off a single VM running postgres 8.1 to a newer more > powerful pair of VMs with a DRBD mirror for storage and postgres 8.2. > But, every attempt to dump the data from 8.1.x into 8.2.6 fails, and fails > exactly the same way. > And any SQL to create a table with more than 24, or 25 (can't remember off top > of my head) characters long, with a serial column, causes this crash: > Feb 4 17:00:03 mcsql1 postgres[6639]: [23-1] NOTICE: CREATE TABLE will > create implicit sequence "billing_transaction_items_id_seq" for serial column > Feb 4 17:00:03 mcsql1 postgres[6639]: [23-2] "billing_transaction_items.id" > Feb 4 17:00:03 mcsql1 postgres[6377]: [4-1] LOG: server process (PID 6639) > was terminated by signal 11 > The only comonality between those that crash is that they're all 64bit Xen > VMs, so far non-VM servers don't crash. But nothing anywhere I can find even > vaguely hints at problems running Postgres under Xen. News to me too. Can you get a stack trace from the crash? Can you extract a small test case from your dump script, so people can try it elsewhere? regards, tom lane
On Monday 04 February 2008 19:31:12 Tom Lane wrote: > News to me too. Can you get a stack trace from the crash? Can you > extract a small test case from your dump script, so people can try it > elsewhere? Damn, figured I was the only one :( I've never actually used gdb before, so this is probably completely wrong. After recompiling postgresql to get this: # ls -lh /usr/bin/postgres /usr/bin/psql -rwxr-xr-x 1 root root 63M Feb 4 22:01 /usr/bin/postgres -rwxr-xr-x 1 root root 2.7M Feb 4 22:02 /usr/bin/psql In one shell I ran the following as postgres: $ postgres -D /var/lib/postgresql/data/ In another I then did this: $ gdb /usr/bin/psql GNU gdb 6.6 Copyright (C) 2006 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "x86_64-pc-linux-gnu"... Using host libthread_db library "/lib/libthread_db.so.1". (gdb) run Starting program: /usr/bin/psql Welcome to psql 8.2.6, the PostgreSQL interactive terminal. Type: \copyright for distribution terms \h for help with SQL commands \? for help with psql commands \g or terminate with semicolon to execute query \q to quit postgres=# CREATE TABLE billing_transaction_items ( id serial NOT NULL, billing_transaction_id integer NOT NULL, pricing_term_id integer, product_id integer NOT NULL, is_free boolean NOT NULL, is_visible boolean NOT NULL, created_at timestamp without time zone NOT NULL, updated_at timestamp without time zone NOT NULL, cost numeric(15,4) DEFAULT 0.0 NOT NULL, tax numeric(15,4) DEFAULT 0.0 NOT NULL, quoted_cost numeric(15,4) DEFAULT 0.0 NOT NULL, quoted_tax numeric(15,4) DEFAULT 0.0 NOT NULL ); server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. The connection to the server was lost. Attempting reset: Failed. !> Program received signal SIGINT, Interrupt. 0x00002ac94da51950 in ?? () (gdb) bt #0 0x00002ac94da51950 in ?? () #1 0x00002ac94d4f2810 in ?? () #2 0x0000000000000000 in ?? () (gdb) Quit There is no other output, and no core dump is ever left, even after "ulimit -c unlimited", as shown in the log the master process doesn't crash instead it starts an immediate recovery. If there is a way to get a proper stack trace I can do that, the server isn't in use yet due to this problem. I have however found a "fix". We use hardened Gentoo everywhere, on all the 64bit Xen VMs and 64bit proper server (that does not crash!), but not the 8.1 VM as that is a 32bit Centos install thrown together as a dev environment. Compiling postgresql non-hardened allows me to create any table I like, and throw a couple hundred meg of data and schema at it in one go. The 64bit server does all the compiling for all the VMs (including the non-hardened postgresql build), but differs from the VMs in one probably inconsequencial way, it runs a GRSec enabled kernel but the VMs dont (due to the Xen and GRSec patches conflicting at the time of build). -- Mike Williams System Administration Manager - Comodo Office Tel Europe: +44 (0) 161 8747070 Fax Europe: +44 (0) 161 8771767
Mike Williams wrote: > I have however found a "fix". We use hardened Gentoo everywhere, on all the > 64bit Xen VMs and 64bit proper server (that does not crash!), but not the 8.1 > VM as that is a 32bit Centos install thrown together as a dev environment. > Compiling postgresql non-hardened allows me to create any table I like, and > throw a couple hundred meg of data and schema at it in one go. I guess it would still be good to know why it crashes. I think the reason you're not getting a good backtrace is that you're attempting to get it from the client process, which is not the guilty one here. Same for the ulimit setting: you have to apply it to the session running postmaster, not psql. The core file should be located in your "data" directory. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Tuesday 05 February 2008 11:37:59 Alvaro Herrera wrote: > I guess it would still be good to know why it crashes. I think the > reason you're not getting a good backtrace is that you're attempting to > get it from the client process, which is not the guilty one here. Same > for the ulimit setting: you have to apply it to the session running > postmaster, not psql. The core file should be located in your "data" > directory. Ahh, so there is a core dump! postgres@mcsql1 ~ $ !ul ulimit -c unlimited postgres@mcsql1 ~ $ !po postgres -D /var/lib/postgresql/data/ postgres@mcsql1 ~ $ psql Welcome to psql 8.2.6, the PostgreSQL interactive terminal. Type: \copyright for distribution terms \h for help with SQL commands \? for help with psql commands \g or terminate with semicolon to execute query \q to quit postgres=# CREATE TABLE billing_transaction_items ( id serial NOT NULL, billing_transaction_id integer NOT NULL, pricing_term_id integer, product_id integer NOT NULL, is_free boolean NOT NULL, is_visible boolean NOT NULL, created_at timestamp without time zone NOT NULL, updated_at timestamp without time zone NOT NULL, cost numeric(15,4) DEFAULT 0.0 NOT NULL, tax numeric(15,4) DEFAULT 0.0 NOT NULL, quoted_cost numeric(15,4) DEFAULT 0.0 NOT NULL, quoted_tax numeric(15,4) DEFAULT 0.0 NOT NULL ); server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. The connection to the server was lost. Attempting reset: Failed. !> \q postgres@mcsql1 ~ $ gdb /usr/bin/postgres /var/lib/postgresql/data/core GNU gdb 6.6 Copyright (C) 2006 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "x86_64-pc-linux-gnu"... Using host libthread_db library "/lib/libthread_db.so.1". Reading symbols from /lib64/libpam.so.0...done. Loaded symbols for /lib/libpam.so.0 Reading symbols from /usr/lib64/libssl.so.0.9.8...done. Loaded symbols for /usr/lib64/libssl.so.0.9.8 Reading symbols from /usr/lib64/libcrypto.so.0.9.8...done. Loaded symbols for /usr/lib64/libcrypto.so.0.9.8 Reading symbols from /lib64/libcrypt.so.1...done. Loaded symbols for /lib/libcrypt.so.1 Reading symbols from /lib64/libdl.so.2...done. Loaded symbols for /lib/libdl.so.2 Reading symbols from /lib64/libm.so.6...done. Loaded symbols for /lib/libm.so.6 Reading symbols from /lib64/libc.so.6...done. Loaded symbols for /lib/libc.so.6 Reading symbols from /lib64/ld-linux-x86-64.so.2...done. Loaded symbols for /lib64/ld-linux-x86-64.so.2 Reading symbols from /lib64/libnss_compat.so.2...done. Loaded symbols for /lib/libnss_compat.so.2 Reading symbols from /lib64/libnsl.so.1...done. Loaded symbols for /lib/libnsl.so.1 Reading symbols from /lib64/libnss_nis.so.2...done. Loaded symbols for /lib/libnss_nis.so.2 Reading symbols from /lib64/libnss_files.so.2...done. Loaded symbols for /lib/libnss_files.so.2 Core was generated by `postgres: postgres postgres [local] CR'. Program terminated with signal 11, Segmentation fault. #0 0x00000000006292bf in pfree (pointer=0x9bba00) at mcxt.c:585 585 mcxt.c: No such file or directory. in mcxt.c (gdb) (gdb) (gdb) bt #0 0x00000000006292bf in pfree (pointer=0x9bba00) at mcxt.c:585 #1 0x00000000006115b4 in EmitErrorReport () at elog.c:1746 #2 0x0000000000610b3e in errfinish (dummy=0) at elog.c:390 #3 0x00000000004a4ddc in transformColumnDefinition (pstate=0xa75978, cxt=0x7fff06ee5b50, column=0xa4d190) at analyze.c:1098 #4 0x00000000004a6180 in transformStmt (pstate=0xa757d8, parseTree=0xa75738, extras_before=0x7fff06ee5c00, extras_after=0x7fff06ee5c08)at analyze.c:970 #5 0x00000000004a7a7a in do_parse_analyze (parseTree=0x0, pstate=0xa757d8) at analyze.c:251 #6 0x00000000004a8544 in parse_analyze (parseTree=0xa75738, sourceText=0xa4c6d0 "CREATE TABLE billing_transaction_items (\n id serial NOT NULL,\n billing_transaction_id integerNOT NULL,\n pricing_term_id integer,\n product_id integer NOT NULL,\n is_free boolean NOT NULL"..., paramTypes=0x0,numParams=0) at analyze.c:173 #7 0x0000000000592f2b in pg_analyze_and_rewrite (parsetree=0xa75738, query_string=0xa4c6d0 "CREATE TABLE billing_transaction_items (\n id serial NOT NULL,\n billing_transaction_idinteger NOT NULL,\n pricing_term_id integer,\n product_id integer NOT NULL,\n is_free booleanNOT NULL"..., paramTypes=0x0, numParams=0) at postgres.c:567 #8 0x000000000059322d in exec_simple_query ( query_string=0xa4c6d0 "CREATE TABLE billing_transaction_items (\n id serial NOT NULL,\n billing_transaction_idinteger NOT NULL,\n pricing_term_id integer,\n product_id integer NOT NULL,\n is_free booleanNOT NULL"...) at postgres.c:875 #9 0x0000000000594bec in PostgresMain (argc=4, argv=0x9bb0c0, username=0x9bb080 "postgres") at postgres.c:3424 #10 0x000000000056c058 in ServerLoop () at postmaster.c:2932 #11 0x000000000056dc26 in PostmasterMain (argc=3, argv=0x9b69e0) at postmaster.c:966 #12 0x0000000000529133 in main (argc=3, argv=0x9b69e0) at main.c:188 (gdb) Means nothing to me... -- Mike Williams System Administration Manager - Comodo Office Tel Europe: +44 (0) 161 8747070 Fax Europe: +44 (0) 161 8771767
On Tuesday 05 February 2008 11:04:07 Mike Williams wrote: > Compiling postgresql non-hardened allows me to create any table I like, and > throw a couple hundred meg of data and schema at it in one go. Seems to be a problem specific to stack smashing protection. If I compile with -fno-stack-protector I can't make it crash, but psql loses it's command history (both recall and saving). # find /etc/portage/env/ -type f -print -exec cat "{}" \; /etc/portage/env/dev-db/postgresql source /etc/make.conf CFLAGS="${CFLAGS} -fno-stack-protector" -- Mike Williams System Administration Manager - Comodo Office Tel Europe: +44 (0) 161 8747070 Fax Europe: +44 (0) 161 8771767
Mike Williams <mike.williams@comodo.com> writes: > On Tuesday 05 February 2008 11:04:07 Mike Williams wrote: >> Compiling postgresql non-hardened allows me to create any table I like, and >> throw a couple hundred meg of data and schema at it in one go. > Seems to be a problem specific to stack smashing protection. At this point I think you're dealing with a Gentoo-specific compiler bug. You need to report it to the appropriate channels in that project. You probably don't really want to hear this, but a lot of people around here consider Gentoo a poor choice for running servers. That distro's emphasis is on bleeding edge features/performance, which is an admirable goal, but you pay a price in stability. regards, tom lane