Discussion: how could duplicate pkey exist in psql?
recently I have found that several tables have rows with exactly the same pkey; here is the definition:
"diggcontent_data_account_pkey" PRIMARY KEY, btree (thing_id, key)
the data is like this:
159292 | funnypics_link_point | 41 | num
159292 | funnypics_link_point | 40 | num
I could not even update this record.
really confused about how this could happen... thanks!
Yan Chunlu wrote:
> recently I have found that several tables have rows with exactly the same pkey; here is the definition:
> "diggcontent_data_account_pkey" PRIMARY KEY, btree (thing_id, key)
>
> the data is like this:
>
> 159292 | funnypics_link_point | 41 | num
> 159292 | funnypics_link_point | 40 | num
>
> I could not even update this record.

It could be a software bug or something else, but did you ever restore the database after a storage problem? This has happened to me once:
http://archives.postgresql.org/pgsql-general/2010-02/msg00971.php

Yours,
Laurenz Albe
On 17 November 2011 06:19, Yan Chunlu <springrider@gmail.com> wrote:
> recently I have found that several tables have rows with exactly the same pkey; here is the definition:
> "diggcontent_data_account_pkey" PRIMARY KEY, btree (thing_id, key)
>
> the data is like this:
>
> 159292 | funnypics_link_point | 41 | num
> 159292 | funnypics_link_point | 40 | num
>
> I could not even update this record.
>
> really confused about how this could happen... thanks!
Hi,
could you send us result of the query:
select thing_id, '|'||key||'|' from table?
Maybe there are extra spaces in the key column that were hidden by table alignment in the client?
regards
Szymon
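[A possible follow-up to Szymon's check, sketched here rather than taken from the thread; the table name diggcontent_data_account is assumed from the pkey definition. Comparing character and byte lengths exposes trailing spaces or non-printing characters that even the '|' delimiters can hide.]

```sql
-- Hypothetical diagnostic: two visually identical keys with different
-- length() or octet_length() values contain hidden whitespace or
-- non-ASCII bytes.
SELECT thing_id, key, length(key) AS chars, octet_length(key) AS bytes
FROM diggcontent_data_account
WHERE thing_id = 159292;
```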
On 17-11-2011 03:19, Yan Chunlu wrote:
> recently I have found that several tables have rows with exactly the same pkey; here is the definition:
> "diggcontent_data_account_pkey" PRIMARY KEY, btree (thing_id, key)
>
> the data is like this:
>
> 159292 | funnypics_link_point | 41 | num
> 159292 | funnypics_link_point | 40 | num
>
> I could not even update this record.
>
> really confused about how this could happen... thanks!

I know one scenario where this can happen on Linux. In my case, it was caused by an "rsync"... instead of copying to a different location, the script was copying pg_xlog over its own pg_xlog.

I did this stupidity once, and learned for a lifetime. Lost two hours of work to recover everything (from backup; at least I had one).

Be careful with rsync and cp, since Linux does not block files from being overwritten even when they are in use.

Regards,
Edson.
seems they are identical:
159292 | |funnypicscn_link_karma|
159292 | |funnypicscn_link_karma|
On Thu, Nov 17, 2011 at 4:07 PM, Szymon Guz <mabewlun@gmail.com> wrote:
> Hi,
>
> could you send us result of the query:
>
> select thing_id, '|'||key||'|' from table?
>
> Maybe there are extra spaces in the key column that were hidden by table alignment in the client?
>
> regards
> Szymon
I am using pgpool's replication feature; it does copy pg_xlog from one server to another. Could that have been the cause of the problem?
thanks for the help!
On Thu, Nov 17, 2011 at 5:38 PM, Edson Richter <richter@simkorp.com.br> wrote:
> On 17-11-2011 03:19, Yan Chunlu wrote:
>> recently I have found that several tables have rows with exactly the same pkey; here is the definition:
>> "diggcontent_data_account_pkey" PRIMARY KEY, btree (thing_id, key)
>>
>> the data is like this:
>>
>> 159292 | funnypics_link_point | 41 | num
>> 159292 | funnypics_link_point | 40 | num
>>
>> I could not even update this record.
>>
>> really confused about how this could happen... thanks!
>
> I know one scenario where this can happen on Linux. In my case, it was caused by an "rsync"... instead of copying to a different location, the script was copying pg_xlog over its own pg_xlog.
>
> I did this stupidity once, and learned for a lifetime. Lost two hours of work to recover everything (from backup; at least I had one).
>
> Be careful with rsync and cp, since Linux does not block files from being overwritten even when they are in use.
>
> Regards,
> Edson.
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
On Thu, Nov 17, 2011 at 01:19:30PM +0800, Yan Chunlu wrote:
> recently I have found several tables has exactly the same pkey, here is
> the definition:
> "diggcontent_data_account_pkey" PRIMARY KEY, btree (thing_id, key)

please check:

select thing_id, key, count(*) from diggcontent_data_account group by 1, 2 having count(*) > 1;

this will show if you have really duplicated values. if you have - the index (diggcontent_data_account_pkey) is broken. Exact reason can vary; any chance this database is hot-backup restored from a different system?

Best regards,

depesz

--
The best thing about modern society is how easy it is to avoid contact with it.
http://depesz.com/
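[If depesz's query does report duplicates, one possible cleanup is sketched below. This is an assumption-laden sketch, not advice from the thread: it presumes the extra rows are unwanted exact-key duplicates, uses the physical row id (ctid) to pick an arbitrary survivor, and should only be run after a backup.]

```sql
-- Sketch only: remove all but one copy of each (thing_id, key) pair,
-- then rebuild the possibly-corrupt primary key index.
BEGIN;
DELETE FROM diggcontent_data_account a
USING diggcontent_data_account b
WHERE a.thing_id = b.thing_id
  AND a.key = b.key
  AND a.ctid > b.ctid;          -- keeps the row with the lowest ctid
REINDEX INDEX diggcontent_data_account_pkey;
COMMIT;
```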
On 17-11-2011 09:21, Yan Chunlu wrote:
> I am using pgpool's replication feature; it does copy pg_xlog from one server to another. Could that have been the cause of the problem?

I did not mean that this IS your problem; I just gave you a tip regarding a problem I had in the past that happened to have the same symptom.

This scenario only happens when your script copies data over its own data... like in "rsync -ar root@127.0.0.1:/var/lib/pgsql/9.0/data/* /var/lib/pgsql/9.0/data/"

The command above is highly dangerous because it copies data over the network link onto its own data... if you have transactions running during the command above, you will get a crash (and, in my case, I had duplicate primary keys).

It would be better to check whether this could be happening to you... some script overwriting data using rsync, cp, etc... I had no other situation where PostgreSQL allowed duplicate keys.

Hope this helps,

Edson.
got it. thank you very much for your help. I found out about this problem too late, and there is no backup.
luckily there was not too much data in this table, and my app keeps running without error.
I am not sure if they are related, but I could not use pg_restore to import data dumped by "pg_dump -Fc";
pg_restore will print some error message about "duplicate primary key", and the table is empty. no data has been imported.
pg_restore is supposed to import the data and ignore the errors. does anyone have a similar problem?
On Thu, Nov 17, 2011 at 11:08 PM, Edson Richter <richter@simkorp.com.br> wrote:
> On 17-11-2011 09:21, Yan Chunlu wrote:
>> I am using pgpool's replication feature; it does copy pg_xlog from one server to another. Could that have been the cause of the problem?
>
> I did not mean that this IS your problem; I just gave you a tip regarding a problem I had in the past that happened to have the same symptom.
>
> This scenario only happens when your script copies data over its own data... like in "rsync -ar root@127.0.0.1:/var/lib/pgsql/9.0/data/* /var/lib/pgsql/9.0/data/"
>
> The command above is highly dangerous because it copies data over the network link onto its own data... if you have transactions running during the command above, you will get a crash (and, in my case, I had duplicate primary keys).
>
> It would be better to check whether this could be happening to you... some script overwriting data using rsync, cp, etc... I had no other situation where PostgreSQL allowed duplicate keys.
>
> Hope this helps,
> Edson.
On Monday, November 21, 2011 6:39:55 am Yan Chunlu wrote:
> got it. thank you very much for your help. I found out about this problem too
> late, and there is no backup.
>
> luckily there was not too much data in this table, and my app keeps running
> without error.
>
> I am not sure if they are related, but I could not use pg_restore to import
> data dumped by "pg_dump -Fc";
>
> pg_restore will print some error message about "duplicate primary key", and
> the table is empty. no data has been imported.
>
> pg_restore is supposed to import the data and ignore the errors. does
> anyone have a similar problem?

pg_restore may ignore the error and keep on going, but the database will not. In other words, when pg_restore receives the error it will continue on to the next item (unless you have the -e switch on). As far as the server (database) is concerned, a duplicate primary key is still an error and the data will not be loaded.

--
Adrian Klaver
adrian.klaver@gmail.com
and the database will stop accepting the following data after it detects an error?
that means that while using pg_restore, no error is allowed to happen; otherwise the database will stop receiving data and the import will fail.
I found only one record in the PostgreSQL log:
duplicate key value violates unique constraint "account_pkey"
does that mean one duplicate record will prevent all other records from importing?
On Mon, Nov 21, 2011 at 10:55 PM, Adrian Klaver <adrian.klaver@gmail.com> wrote:
> On Monday, November 21, 2011 6:39:55 am Yan Chunlu wrote:
>> got it. thank you very much for your help. I found out about this problem too
>> late, and there is no backup.
>>
>> luckily there was not too much data in this table, and my app keeps running
>> without error.
>>
>> I am not sure if they are related, but I could not use pg_restore to import
>> data dumped by "pg_dump -Fc";
>>
>> pg_restore will print some error message about "duplicate primary key", and
>> the table is empty. no data has been imported.
>>
>> pg_restore is supposed to import the data and ignore the errors. does
>> anyone have a similar problem?
>
> pg_restore may ignore the error and keep on going, but the database will not. In other words, when pg_restore receives the error it will continue on to the next item (unless you have the -e switch on). As far as the server (database) is concerned, a duplicate primary key is still an error and the data will not be loaded.
>
> --
> Adrian Klaver
> adrian.klaver@gmail.com
On Monday, November 21, 2011 4:53:21 pm Yan Chunlu wrote:
> and the database will stop accepting the following data after it detects an
> error?
> that means that while using pg_restore, no error is allowed to happen; otherwise
> the database will stop receiving data and the import will fail.
>
> I found only one record in the PostgreSQL log:
>
> duplicate key value violates unique constraint "account_pkey"
>
> does that mean one duplicate record will prevent all other records from
> importing?

For that table, yes. Though if that table is the parent in FK relationships with other tables, those tables will fail to import also, because the keys they refer to do not exist. To get around this you have several options:

1) Find the duplicate entry(s) in the original table and eliminate them before dumping.

2) Dump the table by itself to a plain text format and eliminate the duplicate(s) in the plain text file before restoring.

3) By default pg_dump uses COPY to load data into tables. As you have found out, that runs as a single transaction and rolls back if there is an error. You can specify --inserts on the pg_dump command line to get it to output an INSERT for each row. The upside is that each INSERT is a separate transaction. The downside is that if there is a lot of data it will take a long time to load, because each INSERT is a separate transaction.

4) Use pgloader (http://pgfoundry.org/projects/pgloader/). It is a Python program that 'manages' COPY. It will kick out bad rows and keep loading data.

--
Adrian Klaver
adrian.klaver@gmail.com
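[Option 2 above can be sketched in SQL. The staging-table name and the use of DISTINCT ON are assumptions added here for illustration, not part of Adrian's message: load the plain-text dump into a key-less staging table, then copy a single row per key into the real table.]

```sql
-- Sketch of option 2, assuming a staging table with the same columns
-- but no primary key. DISTINCT ON keeps one arbitrary row per
-- (thing_id, key) pair; extend the ORDER BY to choose deterministically.
BEGIN;
CREATE TABLE diggcontent_data_account_staging
    (LIKE diggcontent_data_account);
-- \copy diggcontent_data_account_staging FROM 'table_dump.txt'
INSERT INTO diggcontent_data_account
SELECT DISTINCT ON (thing_id, key) *
FROM diggcontent_data_account_staging
ORDER BY thing_id, key;
DROP TABLE diggcontent_data_account_staging;
COMMIT;
```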