Discussion: int2 unique index malfunction (btree corrupt)
Your name                                      : Christof Petig
Your email address                             : christof.petig@wtal.de

System Configuration
--------------------
Architecture (example: Intel Pentium)          : Intel dual-Pentium II
Operating System (example: Linux 2.0.26 ELF)   : Linux 2.2.11 ELF
PostgreSQL version (example: PostgreSQL-6.5.1) : PostgreSQL-6.5, 6.5.1, 6.6
Compiler used (example: gcc 2.8.0)             : egcs-1.1.1

Dear PostgreSQL wizards,

During development of a CIM program I frequently updated a table by its primary key (int2 or numeric(3)). A lot of strange messages

    NOTICE: (transaction aborted): queries ignored until END

alerted me that something was going wrong.

    update foo set val=val+1 where key=:n

generated 'Cannot insert a duplicate key into a unique index', which is definitely wrong.

Included is a very small (C++) program (Linux, egcs-1.1.1) which generates the mentioned errors. If it helps debugging, I could port it to plain C (contact me). It occurred with V6.5 as well as with August 2nd's CVS tree. Sometimes even the initial 'insert's fail; one time (with August 2nd's CVS tree) I got a

    FATAL 1: btree: items are out of order (leftmost 0, stack 10, update 1)

error. Now with 6.5.1, I don't get the NOTICE messages but go straight to

    FATAL 1: btree: items are out of order (leftmost 0, stack 3, update 1)
    FATAL 1: btree: items are out of order (leftmost 0, stack 2, update 1)

It seems there is something wrong with PostgreSQL's internals. Perhaps this program helps track it down.

Best Regards,
Christof

PS: vacuum after the FATAL (6.6):

    DEBUG: --Relation test--
    DEBUG: Pages 13: Changed 2, Reapped 13, Empty 0, New 0; Tup 128: Vac 1823, Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 48, MaxLen 48; Re-using: Free/Avail. Space 92444/86784; EndEmpty/Avail. Pages 0/12. Elapsed 0/0 sec.
    DEBUG: Index test_pkey: Pages 13; Tuples 128: Deleted 1819. Elapsed 0/0 sec.
    DEBUG: Rel test: Pages: 13 --> 1; Tuple(s) moved: 128. Elapsed 0/0 sec.
    DEBUG: Index test_pkey: Pages 13; Tuples 128: Deleted 128. Elapsed 0/0 sec.
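[The failing access pattern described above can be sketched in SQL. The actual attached test program is C++ and is not reproduced here; the column names below are assumptions, with the table name taken from the vacuum output:]

```sql
-- Hypothetical reconstruction of the pattern the report describes.
-- Table name 'test' is from the vacuum output; 'key'/'val' are assumed.
CREATE TABLE test (key int2 PRIMARY KEY, val int4);

INSERT INTO test VALUES (1, 0);  -- seed rows 1..n

-- Repeatedly updating a row by its primary key eventually triggered
-- 'Cannot insert a duplicate key into a unique index' on 6.5/6.5.1,
-- even though the key column itself is never changed:
UPDATE test SET val = val + 1 WHERE key = 1;
```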
I haven't seen the corruption on 6.5.
Attachments
Christof Petig <christof.petig@wtal.de> writes:
> During development of a CIM program I frequently updated a table by its
> primary key (int2 or numeric(3)). A lot of strange messages
> 'NOTICE: (transaction aborted): queries ignored until END' alerted me
> that something is going wrong.
> [ details snipped ]

FWIW, the test program you supplied seems to run without errors for me. I'm using current CVS sources on an HPUX box.

There was a fix applied on 8/8 to clean up a problem with btrees not recovering from an aborted transaction properly, but I'm not sure whether that has anything to do with your example...

			regards, tom lane
Tom Lane wrote:
> Christof Petig <christof.petig@wtal.de> writes:
> > During development of a CIM program I frequently updated a table by its
> > primary key (int2 or numeric(3)). A lot of strange messages
> > 'NOTICE: (transaction aborted): queries ignored until END' alerted me
> > that something is going wrong.
> > [ details snipped ]
>
> FWIW, the test program you supplied seems to run without errors for me.
> I'm using current CVS sources on an HPUX box.
>
> There was a fix applied on 8/8 to clean up a problem with btrees not
> recovering from an aborted transaction properly, but I'm not sure
> whether that has anything to do with your example...

My example fails desperately (within one to two seconds) on 6.5.1; however, I tested it with today's CVS sources and it runs cleanly (disable the debugging output to test at full speed). So the bugfix seems to cover my problem. However ...

- If I vacuum the database while my test program runs, all sorts of strange things happen:

  -- All goes well (90% chance, better if the database had recently shrunk).

  -- The vacuum backend crashes:

     /home/christof> vacuumdb test
     pqReadData() -- backend closed the channel unexpectedly.
             This probably means the backend terminated abnormally
             before or while processing the request.
     We have lost the connection to the backend, so further processing is impossible.
     vacuumdb: database vacuum failed on test.

  -- See yourself:

     /home/christof> vacuumdb test
     ERROR: Cannot insert a duplicate key into a unique index
     vacuumdb: database vacuum failed on test.

  -- The postmaster goes into an endless loop; you can kill neither test nor vacuumdb. (Happened once after a long run, when test (the table's file) had reached about 4MB.) Killing the postmaster helps ...

- vacuum never shrinks primary indices, and the index file continues to grow (even at 7MB+). It seems the only choice for long-running databases is either (drop index/create index) or (dump/delete/restore).
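[The drop index/create index workaround mentioned above might look like this. The table and index names are taken from the vacuum output in the original report; the key column name is an assumption:]

```sql
-- Rebuild the primary-key index to reclaim the space vacuum never frees.
-- 'test' and 'test_pkey' are from the report; the column 'key' is assumed.
DROP INDEX test_pkey;
CREATE UNIQUE INDEX test_pkey ON test (key);
```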
Regards
Christof

PS: Besides these issues Postgres works rather well! I like datetime_part('epoch', ...) and timespan_part('epoch', ...), which cover functionality not available in our closed source (aka commercial) database. Calculating the speed of a running machine in SQL is nearly trivial (start_time, current_time, produced_amount).

PPS: I modified the test program to not drop and recreate the table on start. This allows many runs (even concurrent) on the same tables. Simply invoke it as ./test something_which_doesnt_matter
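[The speed calculation alluded to in the PS could be sketched roughly as follows. The table and column names are hypothetical; datetime_part('epoch', ...) is the 6.5-era spelling of what later became date_part/extract:]

```sql
-- Hypothetical machine_status table with the columns named in the PS.
-- Produced amount divided by elapsed seconds since start gives the rate.
SELECT produced_amount
       / (datetime_part('epoch', 'now'::datetime)
          - datetime_part('epoch', start_time)) AS units_per_second
FROM machine_status;
```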
Attachments
As a side note: once test_pkey is beyond 13MB in size, a failure of vacuumdb is nearly certain. The size of a clean (!) test increases steadily (due to a bigger pkey file?).

Regards
Christof