Обсуждение: Duplicate primary keys/rows
This is weird. I set up a table with a serial id field and created a primary key on it. Then I imported data. Running an app against it, I got periodic errors stating "duplicate key violates unique constraint "pkey_table1." Looking through the table (with phppgadmin), there are duplicate rows: id|f1|f2|f3|f4 585|c|a|e|f 586|a|b|c|d 586|a|b|c|d 587|g|e|r|z However: select * from table1 where id=586; 586|a|b|c|d Yet: select * from table1 where id>=585 and id<=587; 585|c|a|e|f 586|a|b|c|d 586|a|b|c|d 587|g|e|r|z Wow, how is this possible? I'm using PG 8.0.3 on Windows XP. This computer has been crashing repeatedly lately, if that could be blamed (bad memory? hard disk? I haven't quite figured out why.) Using phppgadmin, I was able to delete one of the duplicate rows (there are several) - don't know how it does that - maybe using OIDs? CSN __________________________________ Yahoo! Music Unlimited Access over 1 million songs. Try it free. http://music.yahoo.com/unlimited/
On Sun, Oct 09, 2005 at 12:46:51PM -0700, CSN wrote: > select * from table1 where id=586; > 586|a|b|c|d Do you get different results from the following queries? SET enable_seqscan TO on; SET enable_indexscan TO off; SELECT * FROM table1 WHERE id = 586; SET enable_seqscan TO off; SET enable_indexscan TO on; SELECT * FROM table1 WHERE id = 586; > Yet: > select * from table1 where id>=585 and id<=587; > 585|c|a|e|f > 586|a|b|c|d > 586|a|b|c|d > 587|g|e|r|z What's the output of the following query? RESET enable_seqscan; RESET enable_indexscan; SELECT oid, ctid, xmin, cmin, xmax, cmax, * FROM table1 WHERE id >= 585 AND id <= 587; If you get the error 'column "oid" does not exist' then you've created the table without oids, so just omit oid from the select list: SELECT ctid, xmin, cmin, xmax, cmax, * FROM table1 WHERE id >= 585 AND id <= 587; > Wow, how is this possible? I'm using PG 8.0.3 on > Windows XP. This computer has been crashing repeatedly > lately, if that could be blamed (bad memory? hard > disk? I haven't quite figured out why.) Faulty hardware is one possibile explanation. -- Michael Fuhr
--- Michael Fuhr <mike@fuhr.org> wrote: > On Sun, Oct 09, 2005 at 12:46:51PM -0700, CSN wrote: > > select * from table1 where id=586; > > 586|a|b|c|d > > Do you get different results from the following > queries? > > SET enable_seqscan TO on; > SET enable_indexscan TO off; > SELECT * FROM table1 WHERE id = 586; This returns 2 rows. > SET enable_seqscan TO off; > SET enable_indexscan TO on; > SELECT * FROM table1 WHERE id = 586; This returns 1 row. > > Yet: > > select * from table1 where id>=585 and id<=587; > > 585|c|a|e|f > > 586|a|b|c|d > > 586|a|b|c|d > > 587|g|e|r|z > > What's the output of the following query? > > RESET enable_seqscan; > RESET enable_indexscan; > > SELECT oid, ctid, xmin, cmin, xmax, cmax, * > FROM table1 > WHERE id >= 585 AND id <= 587; oid | ctid | xmin | cmin | xmax | cmax | id --------+-----------+---------+------+---------+------+----- 125465 | (3143,78) | 1664385 | 0 | 1664386 | 2 | 984 125466 | (2745,50) | 1481020 | 0 | 1682425 | 2 | 985 125466 | (2672,11) | 1445346 | 0 | 1481020 | 0 | 985 125467 | (3159,28) | 1671875 | 0 | 1671876 | 2 | 986 (I'm using a different duplicate row - 985. I deleted 586's duplicate.) Is this a problem with the index? Would rebuilding them fix this problem? I'm still curious why this happened, and somewhat troubled that something like this can happen. Thanks for your help, CSN > If you get the error 'column "oid" does not exist' > then you've > created the table without oids, so just omit oid > from the select > list: > > SELECT ctid, xmin, cmin, xmax, cmax, * > FROM table1 > WHERE id >= 585 AND id <= 587; > > > Wow, how is this possible? I'm using PG 8.0.3 on > > Windows XP. This computer has been crashing > repeatedly > > lately, if that could be blamed (bad memory? hard > > disk? I haven't quite figured out why.) > > Faulty hardware is one possibile explanation. > > -- > Michael Fuhr > __________________________________ Yahoo! Mail - PC Magazine Editors' Choice 2005 http://mail.yahoo.com
CSN <cool_screen_name90001@yahoo.com> writes: > oid | ctid | xmin | cmin | xmax | cmax | id > --------+-----------+---------+------+---------+------+----- > 125466 | (2672,11) | 1445346 | 0 | 1481020 | 0 | 985 > 125466 | (2745,50) | 1481020 | 0 | 1682425 | 2 | 985 Hmm. The fact that the dup rows have the same OID indicates pretty strongly that they are actually two versions of the same row, and not two independently inserted rows. Furthermore we can see that xact 1481020 deleted the first version and inserted the second (note I took the liberty of rearranging your output to make the rows appear in chronological order). So the index hasn't screwed up, exactly; the problem is that both rows appear as good at the same time. But why? It's really highly annoying that we can't see the contents of the infomasks for the rows. Would you be willing to grab a copy of pg_filedump and dump out these two data pages so we can see the complete tuple headers? (If you don't have a compiler then you'd need to find a precompiled copy of pg_filedump for Windows. I don't know if anyone's made one available.) Given that you say the machine has been crashing, my bet is that a crash caused the loss of pg_clog status for xid 1481020 at a time when 2745,50's xmin had been marked committed good, but 2672,11's xmax had not been similarly marked. We have sufficient defenses against this sort of thing *if the disk drive does not lie about write complete*. (Unfortunately the vast majority of el-cheapo PCs are configured to lie with abandon, which means that we can't guarantee data consistency across power failures on such hardware.) It'd be nice to get direct confirmation of that theory though. regards, tom lane
I don't have a compiler on this machine. If somebody can point me to a copy of pg_filedump for Windows (I didn't see any using Google) I'd be happy to use it. Or perhaps I could compile it under cygwin. The hard drive is a Western Digital 200GB JD (SATA), if that can be used to determine how badly it lies. ;) Thanks, CSN --- Tom Lane <tgl@sss.pgh.pa.us> wrote: > CSN <cool_screen_name90001@yahoo.com> writes: > > oid | ctid | xmin | cmin | xmax | > cmax | id > > > --------+-----------+---------+------+---------+------+----- > > 125466 | (2672,11) | 1445346 | 0 | 1481020 | > 0 | 985 > > 125466 | (2745,50) | 1481020 | 0 | 1682425 | > 2 | 985 > > Hmm. The fact that the dup rows have the same OID > indicates pretty > strongly that they are actually two versions of the > same row, and > not two independently inserted rows. Furthermore we > can see that xact > 1481020 deleted the first version and inserted the > second (note I took > the liberty of rearranging your output to make the > rows appear in > chronological order). > > So the index hasn't screwed up, exactly; the problem > is that both rows > appear as good at the same time. But why? > > It's really highly annoying that we can't see the > contents of the > infomasks for the rows. Would you be willing to > grab a copy of > pg_filedump and dump out these two data pages so we > can see the > complete tuple headers? > > (If you don't have a compiler then you'd need to > find a precompiled > copy of pg_filedump for Windows. I don't know if > anyone's made one > available.) > > Given that you say the machine has been crashing, my > bet is that a crash > caused the loss of pg_clog status for xid 1481020 at > a time when > 2745,50's xmin had been marked committed good, but > 2672,11's xmax had > not been similarly marked. We have sufficient > defenses against this > sort of thing *if the disk drive does not lie about > write complete*. > (Unfortunately the vast majority of el-cheapo PCs > are configured to lie > with abandon, which means that we can't guarantee > data consistency > across power failures on such hardware.) It'd be > nice to get direct > confirmation of that theory though. > > regards, tom lane > __________________________________ Yahoo! Music Unlimited Access over 1 million songs. Try it free. http://music.yahoo.com/unlimited/
On Mon, Oct 10, 2005 at 04:28:57PM -0400, Tom Lane wrote: > It's really highly annoying that we can't see the contents of the > infomasks for the rows. Any particular reason there isn't an infomask system column? -- Michael Fuhr
On Mon, 2005-10-10 at 15:51, CSN wrote: > I don't have a compiler on this machine. If somebody > can point me to a copy of pg_filedump for Windows (I > didn't see any using Google) I'd be happy to use it. > Or perhaps I could compile it under cygwin. > > The hard drive is a Western Digital 200GB JD (SATA), > if that can be used to determine how badly it lies. ;) The general rule is that ATA (parallel or serial) drives lie. You can turn off the write cache, which will make it slower, but then it should be reliable during a power loss. Or, put it all on a big UPS and hope the power supply never goes out.
Michael Fuhr <mike@fuhr.org> writes: > On Mon, Oct 10, 2005 at 04:28:57PM -0400, Tom Lane wrote: >> It's really highly annoying that we can't see the contents of the >> infomasks for the rows. > Any particular reason there isn't an infomask system column? (a) inertia (b) lack of desire to add an additional pg_attribute row per table. I recall having proposed that we stop storing explicit pg_attribute entries for system columns, which would make this sort of change easier to make, and would save a pretty considerable amount of space in pg_attribute too. (In the present regression database, about 45% of the rows in pg_attribute are for system columns; that might be overly high for real-world DBs though.) But people were a bit worried about what might break. regards, tom lane
On Mon, Oct 10, 2005 at 05:28:17PM -0400, Tom Lane wrote: > I recall having proposed that we stop storing explicit pg_attribute > entries for system columns, which would make this sort of change easier > to make, and would save a pretty considerable amount of space in > pg_attribute too. (In the present regression database, about 45% of the > rows in pg_attribute are for system columns; that might be overly high > for real-world DBs though.) But people were a bit worried about what > might break. In catalog/heap.c there already is a SystemAttributeByName() to do the legwork. Seems to me all you'd need to do is check just as you're about to fail on "attribute not found". If you actually look at scanRTEForColumn() in parser/parse_relation.c it actually checks to see if a column name could be a system column name, *before* looking it up in the catalog. Remove the catalog test (except for OID obviously) and it'd sail right through. It'd be interesting to see what happened... Given that internally, they're referred to by number, it might not be so bad. As usual, external clients might get confused if they're not there... -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them.
Вложения
Martijn van Oosterhout <kleptog@svana.org> writes: > On Mon, Oct 10, 2005 at 05:28:17PM -0400, Tom Lane wrote: >> But people were a bit worried about what >> might break. > In catalog/heap.c there already is a SystemAttributeByName() to do the > legwork. Seems to me all you'd need to do is check just as you're about > to fail on "attribute not found". We could certainly make it work as far as the backend is concerned. The issue is whether there is any client code out there that will fail if these entries are no longer present in pg_attribute. regards, tom lane
I don't know if I'm going to get a copy of pg_filedump. What's the best way to fix this - dump then restore? CSN --- Tom Lane <tgl@sss.pgh.pa.us> wrote: > CSN <cool_screen_name90001@yahoo.com> writes: > > oid | ctid | xmin | cmin | xmax | > cmax | id > > > --------+-----------+---------+------+---------+------+----- > > 125466 | (2672,11) | 1445346 | 0 | 1481020 | > 0 | 985 > > 125466 | (2745,50) | 1481020 | 0 | 1682425 | > 2 | 985 > > Hmm. The fact that the dup rows have the same OID > indicates pretty > strongly that they are actually two versions of the > same row, and > not two independently inserted rows. Furthermore we > can see that xact > 1481020 deleted the first version and inserted the > second (note I took > the liberty of rearranging your output to make the > rows appear in > chronological order). > > So the index hasn't screwed up, exactly; the problem > is that both rows > appear as good at the same time. But why? > > It's really highly annoying that we can't see the > contents of the > infomasks for the rows. Would you be willing to > grab a copy of > pg_filedump and dump out these two data pages so we > can see the > complete tuple headers? > > (If you don't have a compiler then you'd need to > find a precompiled > copy of pg_filedump for Windows. I don't know if > anyone's made one > available.) > > Given that you say the machine has been crashing, my > bet is that a crash > caused the loss of pg_clog status for xid 1481020 at > a time when > 2745,50's xmin had been marked committed good, but > 2672,11's xmax had > not been similarly marked. We have sufficient > defenses against this > sort of thing *if the disk drive does not lie about > write complete*. > (Unfortunately the vast majority of el-cheapo PCs > are configured to lie > with abandon, which means that we can't guarantee > data consistency > across power failures on such hardware.) It'd be > nice to get direct > confirmation of that theory though. > > regards, tom lane > __________________________________ Yahoo! Mail - PC Magazine Editors' Choice 2005 http://mail.yahoo.com