Обсуждение: Table with seemingly duplicated primary key values

Поиск
Список
Период
Сортировка

Table with seemingly duplicated primary key values

От
Aleksander Łukasz
Дата:
Hi,

a table in our database with about 3 million rows ended up in a state
where its seems to have duplicated entries (duplicated primary key values):

# \d some_table;
                                   Table "public.some_table"
 Column |            Type             |                        Modifiers
--------+-----------------------------+---------------------------------------------------------
 id     | integer                     | not null default nextval('some_table_id_seq'::regclass)
 field1 | character varying(40)       |
 field2 | character varying(128)      |
 ts     | timestamp without time zone |

Indexes:
    "some_table_pkey" PRIMARY KEY, btree (id)
    "ix_some_table_field1" btree (field1)
    "ix_some_table_field2" btree (field2)
    "ix_some_table_ts" btree (ts)


# select id, field1, field2 from some_table where field1 is null and field2 is not null;
   id    | field1 |            field2
---------+--------+----------------------------------
 2141750 |        | some_value2
(1 row)


# select id, field1, field2 from some_table where id = 2141750;
   id   | field1       |    field2
---------+-------------+----------------------------------
2141750 | some_value1   | some_value2
(1 row)

Another way this manifests itself it that running this:

# update some_table
   set field2 = field1
where
   id = 2141750;

works perfectly fine (but doesn't change the result of the first two queries above),
but this results in an error:

# update some_table
   set field2 = field1
where
   field1 is not null
   and field2 is null
   and ts between '2015-12-01' and '2015-12-02';

ERROR:  duplicate key value violates unique constraint "some_table_pkey"
DETAIL:  Key (id)=(2141750) already exists.

Do you have any idea what could be happening and what measures should be
undertaken to fix this issue? Thanks.

Re: Table with seemingly duplicated primary key values

От
Tom Lane
Дата:
=?UTF-8?Q?Aleksander_=C5=81ukasz?= <allllllx@gmail.com> writes:
> a table in our database with about 3 million rows ended up in a state
> where its seems to have duplicated entries (duplicated primary key values):
> ...
> Do you have any idea what could be happening and what measures should be
> undertaken to fix this issue? Thanks.

Looks like corruption of the primary key index.  Try REINDEXing that
index.  This might fail if there actually are duplicate rows in the table
and not just incorrect pointers to them in the index, in which case you'll
have to clean up the duplicates somehow and try again till the REINDEX
succeeds.

            regards, tom lane


Re: Table with seemingly duplicated primary key values

От
Melvin Davidson
Дата:
Please. Always, ALWAYS, give the PostgreSQL version and O/S when reporting a problem.

First, WHAT IS THE POSTGRESQL VERSION?????
       WHAT IS THE O/S?????

       Then try this:
select a.ctid, a.id, a.field1,
          b.ctid, b.id, b.field1
  from some_table a,
         some_table b
 WHERE a.ctid <> b.ctid
   AND a.id = b.id;


On Tue, Dec 22, 2015 at 4:03 AM, Aleksander Łukasz <allllllx@gmail.com> wrote:
Hi,

a table in our database with about 3 million rows ended up in a state
where its seems to have duplicated entries (duplicated primary key values):

# \d some_table;
                                   Table "public.some_table"
 Column |            Type             |                        Modifiers
--------+-----------------------------+---------------------------------------------------------
 id     | integer                     | not null default nextval('some_table_id_seq'::regclass)
 field1 | character varying(40)       |
 field2 | character varying(128)      |
 ts     | timestamp without time zone |

Indexes:
    "some_table_pkey" PRIMARY KEY, btree (id)
    "ix_some_table_field1" btree (field1)
    "ix_some_table_field2" btree (field2)
    "ix_some_table_ts" btree (ts)


# select id, field1, field2 from some_table where field1 is null and field2 is not null;
   id    | field1 |            field2
---------+--------+----------------------------------
 2141750 |        | some_value2
(1 row)


# select id, field1, field2 from some_table where id = 2141750;
   id   | field1       |    field2
---------+-------------+----------------------------------
2141750 | some_value1   | some_value2
(1 row)

Another way this manifests itself it that running this:

# update some_table
   set field2 = field1
where
   id = 2141750;

works perfectly fine (but doesn't change the result of the first two queries above),
but this results in an error:

# update some_table
   set field2 = field1
where
   field1 is not null
   and field2 is null
   and ts between '2015-12-01' and '2015-12-02';

ERROR:  duplicate key value violates unique constraint "some_table_pkey"
DETAIL:  Key (id)=(2141750) already exists.

Do you have any idea what could be happening and what measures should be
undertaken to fix this issue? Thanks.




--
Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.

Re: Table with seemingly duplicated primary key values

От
Aleksander Łukasz
Дата:
Hi, thanks for your reply.

2015-12-22 16:34 GMT+01:00 Melvin Davidson <melvin6925@gmail.com>:
Please. Always, ALWAYS, give the PostgreSQL version and O/S when reporting a problem.

First, WHAT IS THE POSTGRESQL VERSION?????
       WHAT IS THE O/S?????

# select version();
                                               version                                                
------------------------------------------------------------------------------------------------------
 PostgreSQL 9.4.5 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu 4.8.2-19ubuntu1) 4.8.2, 64-bit
(1 row)

 
       Then try this:
select a.ctid, a.id, a.field1,
          b.ctid, b.id, b.field1
  from some_table a,
         some_table b
 WHERE a.ctid <> b.ctid
   AND a.id = b.id;
 
    ctid    |   id    |               field1             |    ctid    |   id    |               field1                
------------+---------+----------------------------------+------------+---------+----------------------------------
 (79664,59) | 2141750 |                                  | (79772,23) | 2141750 | some_value1
 (79772,23) | 2141750 | some_value1                      | (79664,59) | 2141750 | 
(2 rows)
 

Re: Table with seemingly duplicated primary key values

От
Melvin Davidson
Дата:
The fact that you have rows with an identical id of 2141750 verifies a corrupted primary index.
To correct it, you need to decide which row to keep.

So review the results of
SELECT * FROM some_table WHERE ctid =  (79664,59) OR ctid = (79772,23)

DELETE FROM some_table
WHERE id = 2147150
    AND field1 = {value for field1 of row to delete}
    AND field2 = {value for field2 of row to delete}
...
...
   AND fieldn = {value for fieldn of row to delete};

Then
 REINDEX TABLE some_table;

Re: Table with seemingly duplicated primary key values

От
Jim Nasby
Дата:
On 12/22/15 1:31 PM, Melvin Davidson wrote:
> The fact that you have rows with an identical id of 2141750 verifies a
> corrupted primary index.
> To correct it, you need to decide which row to keep.
>
> So review the results of
> SELECT * FROM some_table WHERE ctid =  (79664,59) OR ctid = (79772,23)
>
> DELETE FROM some_table
> WHERE id = 2147150
>      AND field1 = {value for field1 of row to delete}
>      AND field2 = {value for field2 of row to delete}
> ...
> ...
>     AND fieldn = {value for fieldn of row to delete};
>
> Then
>   REINDEX TABLE some_table;

And most importantly: review your logs for hardware errors and your
Postgres and filesystem settings for anything dangerous. Index
corruption is not normal and indicates the underlying hardware or OS is
faulty (or maybe a bug in Postgres, but that's very unlikely).

You should also consider turning on page checksums if you haven't already.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


Re: Table with seemingly duplicated primary key values

От
Melvin Davidson
Дата:
BTW, Jim is referring to the O/S logs for hardware errors, not the PostgreSQL logs.

Also, another way of deleting the bad row would be
DELETE FROM some_table where ctid = '(79664,59)';

or

DELETE FROM some_table where ctid = '(79772,23)';

On Tue, Dec 22, 2015 at 9:44 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
On 12/22/15 1:31 PM, Melvin Davidson wrote:
The fact that you have rows with an identical id of 2141750 verifies a
corrupted primary index.
To correct it, you need to decide which row to keep.

So review the results of
SELECT * FROM some_table WHERE ctid =  (79664,59) OR ctid = (79772,23)

DELETE FROM some_table
WHERE id = 2147150
     AND field1 = {value for field1 of row to delete}
     AND field2 = {value for field2 of row to delete}
...
...
    AND fieldn = {value for fieldn of row to delete};

Then
  REINDEX TABLE some_table;

And most importantly: review your logs for hardware errors and your Postgres and filesystem settings for anything dangerous. Index corruption is not normal and indicates the underlying hardware or OS is faulty (or maybe a bug in Postgres, but that's very unlikely).

You should also consider turning on page checksums if you haven't already.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com



--
Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.