Обсуждение: pg_restore creates duplicate records when used with --roleparameter

Поиск
Список
Период
Сортировка

pg_restore creates duplicate records when used with --roleparameter

От
Ivan Pantić
Дата:
  Hello,

I've created a backup using pg_restore like this:

x sudo -u postgres pg_dump source_db --format=custom' > /tmp/source_db.db

I then copied the backup file to the destination server and tried to restore it like this:

sudo -u postgres psql destination_db -c 'DROP SCHEMA public CASCADE; CREATE SCHEMA public AUTHORIZATION destination_user;'
sudo -u postgres pg_restore -d destination_db --no-owner --role=destination_user -n public /tmp/source_db.db

I've recieved this error:

pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 2276; 2606 42862 CONSTRAINT price_history price_history_pkey destination_db
pg_restore: [archiver (db)] could not execute query: ERROR:  could not create unique index "price_history_pkey"
DETAIL:  Key (id)=(64) is duplicated.
    Command was: ALTER TABLE ONLY public.price_history
    ADD CONSTRAINT price_history_pkey PRIMARY KEY (id);



WARNING: errors ignored on restore: 1

Here is the DDL of the table in question:

CREATE TABLE price_history (
  id SERIAL NOT NULL
    CONSTRAINT price_history_pkey PRIMARY KEY,
  pair VARCHAR(21) NOT NULL,
  open NUMERIC(32, 16) NOT NULL,
  high NUMERIC(32, 16) NOT NULL,
  low NUMERIC(32, 16) NOT NULL,
  close NUMERIC(32, 16) NOT NULL,
  count INTEGER NOT NULL,
  volume NUMERIC(32, 16) NOT NULL,
  timestamp TIMESTAMP WITH TIME ZONE NOT NULL,
  CONSTRAINT price_history_timestamp_pair_unique UNIQUE (timestamp, pair)
);

This is what I saw in the destination table:

id pair open high low close count volume timestamp
60 ETH_RSD 0.0000000000000000 0.0000000000000000 0.0000000000000000 0.0000000000000000 0 0.0000000000000000 2019-05-28 08:51:00.000000
61 SFX_BTC 0.0000000000000000 0.0000000000000000 0.0000000000000000 0.0000000000000000 0 0.0000000000000000 2019-10-16 11:45:00.000000
61 SFX_BTC 0.0000000000000000 0.0000000000000000 0.0000000000000000 0.0000000000000000 0 0.0000000000000000 2019-05-28 08:51:00.000000
62 SFT_BTC 0.0000000000000000 0.0000000000000000 0.0000000000000000 0.0000000000000000 0 0.0000000000000000 2019-10-16 11:45:00.000000
62 SFT_BTC 0.0000000000000000 0.0000000000000000 0.0000000000000000 0.0000000000000000 0 0.0000000000000000 2019-05-28 08:51:00.000000
63 ETH_BTC 0.0000000000000000 0.0000000000000000 0.0000000000000000 0.0000000000000000 0 0.0000000000000000 2019-10-16 11:45:00.000000
63 ETH_BTC 0.0000000000000000 0.0000000000000000 0.0000000000000000 0.0000000000000000 0 0.0000000000000000 2019-05-28 08:51:00.000000
64 SFT_SFX 0.0000000000000000 0.0000000000000000 0.0000000000000000 0.0000000000000000 0 0.0000000000000000 2019-10-16 11:45:00.000000
64 SFT_SFX 0.0000000000000000 0.0000000000000000 0.0000000000000000 0.0000000000000000 0 0.0000000000000000 2019-05-28 08:51:00.000000
65 BTC_RSD 0.0000000000000000 0.0000000000000000 0.0000000000000000 0.0000000000000000 0 0.0000000000000000 2019-05-28 08:52:00.000000
66 SFX_RSD 0.0000000000000000 0.0000000000000000 0.0000000000000000 0.0000000000000000 0 0.0000000000000000 2019-05-28 08:52:00.000000
67 SFT_RSD 0.0000000000000000 0.0000000000000000 0.0000000000000000 0.0000000000000000 0 0.0000000000000000 2019-05-28 08:52:00.000000
68 ETH_RSD 0.0000000000000000 0.0000000000000000 0.0000000000000000 0.0000000000000000 0 0.0000000000000000 2019-05-28 08:52:00.000000
69 SFX_BTC 0.0000000000000000 0.0000000000000000 0.0000000000000000 0.0000000000000000 0 0.0000000000000000 2019-05-28 08:52:00.000000
70 SFT_BTC 0.0000000000000000 0.0000000000000000 0.0000000000000000 0.0000000000000000 0 0.0000000000000000 2019-05-28 08:52:00.000000

Excerpts from source table:

id pair open high low close count volume timestamp
57 BTC_RSD 0.0000000000000000 0.0000000000000000 0.0000000000000000 0.0000000000000000 0 0.0000000000000000 2019-05-28 08:51:00.000000
63 ETH_BTC 0.0000000000000000 0.0000000000000000 0.0000000000000000 0.0000000000000000 0 0.0000000000000000 2019-05-28 08:51:00.000000
60 ETH_RSD 0.0000000000000000 0.0000000000000000 0.0000000000000000 0.0000000000000000 0 0.0000000000000000 2019-05-28 08:51:00.000000
62 SFT_BTC 0.0000000000000000 0.0000000000000000 0.0000000000000000 0.0000000000000000 0 0.0000000000000000 2019-05-28 08:51:00.000000
59 SFT_RSD 0.0000000000000000 0.0000000000000000 0.0000000000000000 0.0000000000000000 0 0.0000000000000000 2019-05-28 08:51:00.000000
64 SFT_SFX 0.0000000000000000 0.0000000000000000 0.0000000000000000 0.0000000000000000 0 0.0000000000000000 2019-05-28 08:51:00.000000
61 SFX_BTC 0.0000000000000000 0.0000000000000000 0.0000000000000000 0.0000000000000000 0 0.0000000000000000 2019-05-28 08:51:00.000000
58 SFX_RSD 0.0000000000000000 0.0000000000000000 0.0000000000000000 0.0000000000000000 0 0.0000000000000000 2019-05-28 08:51:00.000000

id pair open high low close count volume timestamp
1625769 BTC_RSD 959304.0618596000000000 959304.0618596000000000 959304.0618596000000000 959304.0618596000000000 0 0.0000000000000000 2019-10-16 11:45:00.000000
1625775 ETH_BTC 0.0000000000000000 0.0000000000000000 0.0000000000000000 0.0000000000000000 0 0.0000000000000000 2019-10-16 11:45:00.000000
1625772 ETH_RSD 0.0000000000000000 0.0000000000000000 0.0000000000000000 0.0000000000000000 0 0.0000000000000000 2019-10-16 11:45:00.000000
1625774 SFT_BTC 0.0000000000000000 0.0000000000000000 0.0000000000000000 0.0000000000000000 0 0.0000000000000000 2019-10-16 11:45:00.000000
1625771 SFT_RSD 55.0000000000000000 55.0000000000000000 55.0000000000000000 55.0000000000000000 0 0.0000000000000000 2019-10-16 11:45:00.000000
1625776 SFT_SFX 0.0000000000000000 0.0000000000000000 0.0000000000000000 0.0000000000000000 0 0.0000000000000000 2019-10-16 11:45:00.000000
1625773 SFX_BTC 0.0000000000000000 0.0000000000000000 0.0000000000000000 0.0000000000000000 0 0.0000000000000000 2019-10-16 11:45:00.000000
1625770 SFX_RSD 60.0000000000000000 60.0000000000000000 60.0000000000000000 60.0000000000000000 0 0.0000000000000000 2019-10-16 11:45:00.000000

So it seems like touple 1625776 got inserted with an auto-generated id, which is duplicate of a different tuple's id.

I ran this import several times, all from the same backup file. It generates duplicate rows about 50% of times, usually from rows near the end of table (there is a total of 1627664 records).
The same happens when restoring from a different backup and from a backup in tar format, so I suspect the problem is in restore utility.

However, when I remove the "--role=destination_user" argument, the problem disappears. I wrote a script which tried delete/restore 10 times in a loop, and it worked each time.
So it seems there is some kind of race condition or something when using the "role" argument.

Systrem info:
pg_restore (PostgreSQL) 9.6.15
Linux vm-dev 4.9.0-6-amd64 #1 SMP Debian 4.9.88-1+deb9u1 (2018-05-07) x86_64 GNU/Linux

Regards,
Panta

Re: pg_restore creates duplicate records when used with --role parameter

От
Tom Lane
Дата:
=?utf-8?Q?Ivan_Panti=C4=87?= <ivanpantic82@gmail.com> writes:
> I've created a backup using pg_restore like this:
> x sudo -u postgres pg_dump source_db --format=custom' > /tmp/source_db.db
> I then copied the backup file to the destination server and tried to restore it like this:
> sudo -u postgres psql destination_db -c 'DROP SCHEMA public CASCADE; CREATE SCHEMA public AUTHORIZATION
destination_user;'
> sudo -u postgres pg_restore -d destination_db --no-owner --role=destination_user -n public /tmp/source_db.db

> I've recieved this error:
> pg_restore: [archiver (db)] Error while PROCESSING TOC:
> pg_restore: [archiver (db)] Error from TOC entry 2276; 2606 42862 CONSTRAINT price_history price_history_pkey
destination_db
> pg_restore: [archiver (db)] could not execute query: ERROR: could not create unique index "price_history_pkey"
> DETAIL: Key (id)=(64) is duplicated.
> Command was: ALTER TABLE ONLY public.price_history
> ADD CONSTRAINT price_history_pkey PRIMARY KEY (id);

FWIW, I could not duplicate this report in several tries.

Usually, the expectation when restoring a dump is that you're restoring
into a totally empty database.  Clearing out just the public schema
isn't that, so I wonder if you left behind something that could affect
this table.  It's not very clear what that could be though :-(.

Anyway, if you can provide a self-contained reproduction case for this,
we'd surely look further.

            regards, tom lane



Re: pg_restore creates duplicate records when used with --roleparameter

От
Ivan Pantić
Дата:
Hi tom,

After playing with this a bit more, it seems something was wrong with a particular database I was trying to restore into.

When I created a fresh database, the restore worked, including after deleting the public schema and trying again. When I did a db dump of new and old database, the outputs were the same, nothing special in the old DB DDL. After deleting destination_db and creating it from scratch, the restore started working so it's not like its name is cursed or something.

Overall, it seems there was some kind corruption of the old database that causes this restore to fail.

Very strange and, I suspect, impossible to debug (especially given I no longer have the old database).

Thank you for your help.

Regards,
Panta

On Oct 18 2019, at 11:13 am, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Ivan Pantić <ivanpantic82@gmail.com> writes:
I've created a backup using pg_restore like this:
x sudo -u postgres pg_dump source_db --format=custom' > /tmp/source_db.db
I then copied the backup file to the destination server and tried to restore it like this:
sudo -u postgres psql destination_db -c 'DROP SCHEMA public CASCADE; CREATE SCHEMA public AUTHORIZATION destination_user;'
sudo -u postgres pg_restore -d destination_db --no-owner --role=destination_user -n public /tmp/source_db.db

I've recieved this error:
pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 2276; 2606 42862 CONSTRAINT price_history price_history_pkey destination_db
pg_restore: [archiver (db)] could not execute query: ERROR: could not create unique index "price_history_pkey"
DETAIL: Key (id)=(64) is duplicated.
Command was: ALTER TABLE ONLY public.price_history
ADD CONSTRAINT price_history_pkey PRIMARY KEY (id);

FWIW, I could not duplicate this report in several tries.

Usually, the expectation when restoring a dump is that you're restoring
into a totally empty database. Clearing out just the public schema
isn't that, so I wonder if you left behind something that could affect
this table. It's not very clear what that could be though :-(.

Anyway, if you can provide a self-contained reproduction case for this,
we'd surely look further.

regards, tom lane