Thread: Merging two database dumps

Merging two database dumps

From
"Alex O'Ree"
Date:
I have a situation with multiple Postgres servers, all running the same databases and table structure. I need to periodically export the data from each of them and then merge it all into a single server. On occasion, it's possible for the same record (primary key) to be stored on two or more servers.

I was using pg_dump without the --inserts option; however, I just noticed that pg_restore will stop inserting into a table when a conflict occurs, leaving me with an incomplete set.

The question is: what are my other options for skipping over the conflicting records when merging?

From the docs, it appears that making dumps with the --inserts option may be the only way to go; however, performance is an issue. In this case, would dropping all indexes help?

Re: Merging two database dumps

From
Pavan Teja
Date:
Hi Alex,
To store duplicate rows, dropping the primary key and unique indexes is the only way.

One alternative is to create a timestamp column that is updated on every insert/update, so that the timestamp becomes the primary key. Hope it helps.

Regards,
Pavan

Re: [External] Merging two database dumps

From
Vijaykumar Jain
Date:

You can try one option, although it is just a thought in the air 😊

Use the Postgres foreign data wrapper (FDW), e.g. https://robots.thoughtbot.com/postgres-foreign-data-wrapper

Create foreign tables in the relevant server schema, and then UNION / UNION ALL them, or rely on your custom constraint on the destination table where you dump the rows.

For example, you have server1, server2 and server3, and you have server4 as your new single server.

You create an FDW for server1, server2 and server3 on server4 and then import the table into the respective server schema:

server1.table1, server2.table1, server3.table1

and then

insert into server4.table1
select * from (
    select * from server1.table1
    union
    select * from server2.table1
    union
    select * from server3.table1
) a;

or something like that 😊

Thanks,
Vijay
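
A note on the UNION idea above: UNION removes only rows that are identical in every column, so two servers holding the same primary key with different contents would still collide on insert. The sketch below is an illustration, not part of the original post. It uses Python's built-in sqlite3 as a stand-in for PostgreSQL so that it runs without a server, local tables named `server1_table1` and so on in place of the foreign tables, and an assumed two-column layout `(id, payload)`. In PostgreSQL 9.5 or later the same `ON CONFLICT DO NOTHING` clause is available on INSERT.

```python
import sqlite3


def merge_with_union(sources):
    """Merge rows from several sources into one table keyed on id.

    sources: dict of hypothetical server name -> list of (id, payload) rows.
    Returns (number of rows the UNION produced, merged rows ordered by id).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY, payload TEXT)")

    selects = []
    for name, rows in sources.items():
        # Local tables play the role of the foreign tables serverN.table1.
        conn.execute(f"CREATE TABLE {name}_table1 (id INTEGER, payload TEXT)")
        conn.executemany(f"INSERT INTO {name}_table1 VALUES (?, ?)", rows)
        selects.append(f"SELECT id, payload FROM {name}_table1")
    union = " UNION ".join(selects)

    # UNION drops exact duplicates only; rows that share an id but differ
    # in payload are both still present at this point.
    union_rows = conn.execute(f"SELECT count(*) FROM ({union})").fetchone()[0]

    # ON CONFLICT DO NOTHING skips the remaining key collisions.
    # "WHERE true" is only needed by SQLite's parser for INSERT ... SELECT.
    conn.execute(
        f"INSERT INTO table1 SELECT id, payload FROM ({union}) AS a "
        "WHERE true ON CONFLICT (id) DO NOTHING"
    )
    merged = conn.execute("SELECT id, payload FROM table1 ORDER BY id").fetchall()
    conn.close()
    return union_rows, merged
```

Which of two conflicting rows survives depends on the order in which the UNION returns them, so this is only suitable when either copy is acceptable.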

 

Re: Merging two database dumps

From
Andreas Kretschmer
Date:

On 13.06.2018 at 13:17, Alex O'Ree wrote:
> I have a situation with multiple Postgres servers, all running the
> same databases and table structure. I need to periodically export the
> data from each of them and then merge it all into a single server. On
> occasion, it's possible for the same record (primary key) to be stored
> on two or more servers.

What should happen in this case?

>
> I was using pg_dump without the --inserts option; however, I just noticed
> that pg_restore will stop inserting into a table when a conflict
> occurs, leaving me with an incomplete set.
>

Another solution:

* create the tables on the destination server without the PK, or with
  another, new PK (maybe SERIAL)
* use logical replication to replicate the table from all your source
  databases to the destination table; see more here:
  https://www.2ndquadrant.com/en/resources/pglogical/

Your problem looks like a typical task for logical replication to me. You
need at least PostgreSQL 9.4.


Regards, Andreas

-- 
2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com
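
The first bullet above (a destination table with a new surrogate key instead of the original PK) can be sketched as follows. This is an illustration under assumptions, not Andreas's code: it uses Python's built-in sqlite3 as a stand-in for PostgreSQL, `AUTOINCREMENT` in place of SERIAL, an assumed `(id, payload)` layout, and a hypothetical `source_server` column that records where each row came from. It does not cover the logical replication part.

```python
import sqlite3


def load_keeping_duplicates(sources):
    """Load all rows from all sources, keeping rows that share an original id.

    sources: dict of hypothetical server name -> list of (id, payload) rows.
    Returns (total rows stored, sorted list of ids that occur more than once).
    """
    conn = sqlite3.connect(":memory:")
    # merged_id is the new surrogate key; the original id becomes an
    # ordinary, non-unique column, so no insert can fail on it.
    conn.execute(
        "CREATE TABLE table1_merged ("
        " merged_id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " source_server TEXT NOT NULL,"
        " id INTEGER NOT NULL,"
        " payload TEXT)"
    )
    for server, rows in sources.items():
        conn.executemany(
            "INSERT INTO table1_merged (source_server, id, payload) "
            "VALUES (?, ?, ?)",
            [(server, row_id, payload) for row_id, payload in rows],
        )

    total = conn.execute("SELECT count(*) FROM table1_merged").fetchone()[0]
    # Duplicated original keys can be inspected and resolved after loading.
    duplicate_ids = [
        row[0]
        for row in conn.execute(
            "SELECT id FROM table1_merged "
            "GROUP BY id HAVING count(*) > 1 ORDER BY id"
        )
    ]
    conn.close()
    return total, duplicate_ids
```

The trade-off is that nothing is lost during the load, but uniqueness of the original key has to be restored (or deliberately given up) afterwards.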



Re: [External] Merging two database dumps

From
pavan95
Date:
Yes Vijay,

It might work, but I think it will carry a performance overhead with
complex data.

Regards,
Pavan




Re: Merging two database dumps

From
"Alex O'Ree"
Date:
The desired behavior is to just log the error and continue the import using pg_dump-based COPY commands.

The servers are not on the same network. Sneakernet is the only way.

Re: Merging two database dumps

From
Adrian Klaver
Date:
On 06/13/2018 06:21 AM, Alex O'Ree wrote:
> The desired behavior is to just log the error and continue the import using
> pg_dump-based COPY commands.

Each COPY is atomic: if any part of it fails, the whole thing fails, so
you will not be able to achieve what you want that way.

-- 
Adrian Klaver
adrian.klaver@aklaver.com
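
One common way around the all-or-nothing behaviour described above is to load each dump into a staging table that has no constraints, record the rows whose keys already exist, and then insert only the new ones. The sketch below is an illustration under assumptions, not something proposed in the thread: it uses Python's built-in sqlite3 as a stand-in for PostgreSQL, a hypothetical `staging` table, and an assumed `(id, payload)` layout. In PostgreSQL the staging table would be filled by COPY, and `INSERT ... ON CONFLICT DO NOTHING` requires 9.5 or later.

```python
import sqlite3


def merge_via_staging(existing, incoming):
    """Merge incoming rows into table1, skipping and reporting key conflicts.

    existing, incoming: lists of (id, payload) rows.
    Returns (whether the plain bulk insert failed, skipped rows, merged rows).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY, payload TEXT)")
    # The staging table has no constraints, so loading it cannot fail on keys.
    conn.execute("CREATE TABLE staging (id INTEGER, payload TEXT)")
    conn.executemany("INSERT INTO table1 VALUES (?, ?)", existing)
    conn.executemany("INSERT INTO staging VALUES (?, ?)", incoming)

    # A single bulk statement is all-or-nothing, like COPY: one duplicate
    # key fails the whole statement and none of its rows are kept.
    try:
        conn.execute("INSERT INTO table1 SELECT id, payload FROM staging")
        bulk_failed = False
    except sqlite3.IntegrityError:
        bulk_failed = True

    # "Log" the conflicting rows first ...
    skipped = conn.execute(
        "SELECT s.id, s.payload FROM staging AS s "
        "JOIN table1 AS t ON t.id = s.id ORDER BY s.id"
    ).fetchall()
    # ... then insert only the rows whose key is not present yet.
    # "WHERE true" is only needed by SQLite's parser for INSERT ... SELECT.
    conn.execute(
        "INSERT INTO table1 SELECT id, payload FROM staging "
        "WHERE true ON CONFLICT (id) DO NOTHING"
    )
    merged = conn.execute("SELECT id, payload FROM table1 ORDER BY id").fetchall()
    conn.close()
    return bulk_failed, skipped, merged
```

Because the bulk load itself never touches the constrained table, this keeps the speed of a COPY-style dump while still skipping conflicting records.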


Re: Merging two database dumps

From
"Alex O'Ree"
Date:
Thanks for the clarification.
