Обсуждение: Postgres restore help

Поиск

Список

Период

Сортировка

Postgres restore help

От

Piyush Lenka

Дата:

28 февраля 2012 г., 11:40:39

Hi,

I m trying to restore only data from a backup file.but i hav some primary key that already exists in the table.

I want to restore non duplicate data only.

Please Help.

Thanks And Regards

Piyush

Re: Postgres restore help

От

Alessandro Gagliardi

Дата:

28 февраля 2012 г., 14:30:36

I would recommend restoring the table to a temporary table and then using something like:

INSERT INTO main_table SELECT temp_table.* FROM temp_table JOIN main_table USING (pk) WHERE main_table.pk IS NULL

Someone else here might have a more efficient way of phrasing that insert statement (I'm still fairly new to the concept of anti-joins). But I think something like this should work for you.

-Alessandro

On Tue, Feb 28, 2012 at 7:34 AM, Piyush Lenka <lenka.piyush@gmail.com> wrote:

Hi,
I m trying to restore only data from a backup file.but i hav some primary key that already exists in the table.
I want to restore non duplicate data only.
Please Help.

Thanks And Regards
Piyush

Re: Postgres restore help

От

Bartosz Dmytrak

Дата:

28 февраля 2012 г., 16:14:09

This is almost perfect :)

but won't work in that case in the way You expect, because of join type. This should be LEFT JOIN to select all from temp_table (left) and matching from main_table (right), then WHERE clausule will filter rows not existing in main_table. Without USING, which is shorthand (http://www.postgresql.org/docs/9.1/static/sql-select.html#SQL-FROM) it should look like this:

INSERT INTO main_table

SELECT temp_table.*

FROM temp_table

LEFT JOIN main_table ON (temp_table.pk = main_table.pk) /* instead of "ON (...)" you can use Alessandro's shorthand: "USING (pk)"*/

WHERE main_table.pk IS NULL;

Regards,

Bartek

2012/2/28 Alessandro Gagliardi <alessandro@path.com>

I would recommend restoring the table to a temporary table and then using something like:
INSERT INTO main_table SELECT temp_table.* FROM temp_table JOIN main_table USING (pk) WHERE main_table.pk IS NULL
Someone else here might have a more efficient way of phrasing that insert statement (I'm still fairly new to the concept of anti-joins). But I think something like this should work for you.

-Alessandro

On Tue, Feb 28, 2012 at 7:34 AM, Piyush Lenka <lenka.piyush@gmail.com> wrote:
Hi,
I m trying to restore only data from a backup file.but i hav some primary key that already exists in the table.
I want to restore non duplicate data only.
Please Help.

Thanks And Regards
Piyush

Re: Postgres restore help

От

Alessandro Gagliardi

Дата:

28 февраля 2012 г., 17:35:11

Argh! I can't believe I forgot the "LEFT". That's embarrassing. Anyway, I was wondering if a WHERE NOT EXISTS clause would be better. I'm still new to those, so didn't want to try to offer that as a solution, but I gather it can be more efficient than a JOIN in some cases.

On Tue, Feb 28, 2012 at 12:13 PM, Bartosz Dmytrak <bdmytrak@eranet.pl> wrote:

This is almost perfect :)
but won't work in that case in the way You expect, because of join type. This should be LEFT JOIN to select all from temp_table (left) and matching from main_table (right), then WHERE clausule will filter rows not existing in main_table. Without USING, which is shorthand (http://www.postgresql.org/docs/9.1/static/sql-select.html#SQL-FROM) it should look like this:

INSERT INTO main_table
SELECT temp_table.*
FROM temp_table
LEFT JOIN main_table ON (temp_table.pk = main_table.pk) /* instead of "ON (...)" you can use Alessandro's shorthand: "USING (pk)"*/
WHERE main_table.pk IS NULL;

Regards,
Bartek

2012/2/28 Alessandro Gagliardi <alessandro@path.com>
I would recommend restoring the table to a temporary table and then using something like:
INSERT INTO main_table SELECT temp_table.* FROM temp_table JOIN main_table USING (pk) WHERE main_table.pk IS NULL
Someone else here might have a more efficient way of phrasing that insert statement (I'm still fairly new to the concept of anti-joins). But I think something like this should work for you.

-Alessandro

On Tue, Feb 28, 2012 at 7:34 AM, Piyush Lenka <lenka.piyush@gmail.com> wrote:
Hi,
I m trying to restore only data from a backup file.but i hav some primary key that already exists in the table.
I want to restore non duplicate data only.
Please Help.

Thanks And Regards
Piyush

Re: Postgres restore help

От

Tom Lane

Дата:

28 февраля 2012 г., 17:50:17

Alessandro Gagliardi <alessandro@path.com> writes:
> Argh! I can't believe I forgot the "LEFT". That's embarrassing. Anyway, I
> was wondering if a WHERE NOT EXISTS clause would be better. I'm still new
> to those, so didn't want to try to offer that as a solution, but I gather
> it can be more efficient than a JOIN in some cases.

Given a reasonably recent PG version, it should be exactly the same ---
the planner will translate both of those into anti-joins.

            regards, tom lane

Re: Postgres restore help

От

Bartosz Dmytrak

Дата:

28 февраля 2012 г., 18:26:50

Let's have the tables like this:

CREATE TABLE "tblTest"

(

"RowId" serial NOT NULL,

"Value1" text,

"Value2" text,

"Value3" text,

"Value4" text,

"Value5" text,

"Value6" text,

CONSTRAINT "tblTest_pkey" PRIMARY KEY ("RowId" )

)

WITH (

OIDS=FALSE

);

and

CREATE TABLE "tblTest2"

(

"RowId" serial NOT NULL,

"Value1" text,

"Value2" text,

"Value3" text,

"Value4" text,

"Value5" text,

"Value6" text,

CONSTRAINT "tblTest2_pkey" PRIMARY KEY ("RowId" )

)

WITH (

OIDS=FALSE

);

I mean the same strcture, but with different number of rows and

tblTest - 999000

tblTest2 - 1000

let's try first

EXPLAIN

SELECT a.* FROM "tblTest" a

LEFT JOIN "tblTest2" b

ON (a."RowId" = b."RowId")

WHERE b."RowId" IS NULL;

'Hash Anti Join  (cost=70.50..69551.31 rows=998058 width=339)'
'  Hash Cond: (a."RowId" = b."RowId")'
'  ->  Seq Scan on "tblTest" a  (cost=0.00..56876.58 rows=999058 width=339)'
'  ->  Hash  (cost=58.00..58.00 rows=1000 width=4)'
'        ->  Seq Scan on "tblTest2" b  (cost=0.00..58.00 rows=1000 width=4)'

and:

EXPLAIN

SELECT a.* FROM "tblTest" a

WHERE NOT EXISTS (SELECT "RowId" FROM "tblTest2" b WHERE b."RowId" = a."RowId")

'Hash Anti Join (cost=70.50..69551.31 rows=998058 width=339)'

' Hash Cond: (a."RowId" = b."RowId")'

' -> Seq Scan on "tblTest" a (cost=0.00..56876.58 rows=999058 width=339)'

' -> Hash (cost=58.00..58.00 rows=1000 width=4)'

' -> Seq Scan on "tblTest2" b (cost=0.00..58.00 rows=1000 width=4)'

for me looks the same :) but I personally prefer first.

There is also "...WHERE "RowId" NOT IN..." solution, but in that case it is a little faster when You are looking for rows from tblTest

EXPLAIN

SELECT * FROM "tblTest"

WHERE "RowId" NOT IN (SELECT "RowId" FROM "tblTest2")

'Seq Scan on "tblTest" (cost=60.50..59434.72 rows=499529 width=339)'

' Filter: (NOT (hashed SubPlan 1))'

' SubPlan 1'

' -> Seq Scan on "tblTest2" (cost=0.00..58.00 rows=1000 width=4)'

but it is much, much slower in opposite way

EXPLAIN

SELECT * FROM "tblTest2"

WHERE "RowId" NOT IN (SELECT "RowId" FROM "tblTest")

'Seq Scan on "tblTest2" (cost=0.00..34136318.00 rows=500 width=344)'

' Filter: (NOT (SubPlan 1))'

' SubPlan 1'

' -> Materialize (cost=0.00..65774.87 rows=999058 width=4)'

' -> Seq Scan on "tblTest" (cost=0.00..56876.58 rows=999058 width=4)'

Regards,
Bartek

2012/2/28 Alessandro Gagliardi <alessandro@path.com>

Argh! I can't believe I forgot the "LEFT". That's embarrassing. Anyway, I was wondering if a WHERE NOT EXISTS clause would be better. I'm still new to those, so didn't want to try to offer that as a solution, but I gather it can be more efficient than a JOIN in some cases.

On Tue, Feb 28, 2012 at 12:13 PM, Bartosz Dmytrak <bdmytrak@eranet.pl> wrote:
This is almost perfect :)
but won't work in that case in the way You expect, because of join type. This should be LEFT JOIN to select all from temp_table (left) and matching from main_table (right), then WHERE clausule will filter rows not existing in main_table. Without USING, which is shorthand (http://www.postgresql.org/docs/9.1/static/sql-select.html#SQL-FROM) it should look like this:

INSERT INTO main_table
SELECT temp_table.*
FROM temp_table
LEFT JOIN main_table ON (temp_table.pk = main_table.pk) /* instead of "ON (...)" you can use Alessandro's shorthand: "USING (pk)"*/
WHERE main_table.pk IS NULL;

Regards,
Bartek

2012/2/28 Alessandro Gagliardi <alessandro@path.com>
I would recommend restoring the table to a temporary table and then using something like:
INSERT INTO main_table SELECT temp_table.* FROM temp_table JOIN main_table USING (pk) WHERE main_table.pk IS NULL
Someone else here might have a more efficient way of phrasing that insert statement (I'm still fairly new to the concept of anti-joins). But I think something like this should work for you.

-Alessandro

On Tue, Feb 28, 2012 at 7:34 AM, Piyush Lenka <lenka.piyush@gmail.com> wrote:
Hi,
I m trying to restore only data from a backup file.but i hav some primary key that already exists in the table.
I want to restore non duplicate data only.
Please Help.

Thanks And Regards
Piyush

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: Postgres restore help

Postgres restore help

Re: Postgres restore help

Re: Postgres restore help

Re: Postgres restore help

Re: Postgres restore help

Re: Postgres restore help