Обсуждение: How to copy rows into same table efficiently

Поиск
Список
Период
Сортировка

How to copy rows into same table efficiently

От
Arun Suresh
Дата:

Dear PG experts,

We have a tenant discriminator column in our tables to enable storage of data from multiple
tenants. This column is also part of the composite primary key.
The customers may request creation of a tenant copy, which means if they currently have a
tenant id "ABCD", they would like to copy all data "where tenant_id = 'ABCD'" to a new copy tenant "XYZ".
The copy must also be done on the same table.

Current approach taken is to build a query like below:
INSERT INTO mytable (col1, col2, col3, col4) SELECT col1, 'XYZ', col3, col4 FROM mytable WHERE col2 = 'ABCD'

Is there a better way to do this?
There could be other tables with foreign key reference, would a simple ordering of the copy based on table relationship suffice?
Also if the table has millions of records, what are the things to take care of?


Regards,
Arun Suresh

Re: How to copy rows into same table efficiently

От
Ron
Дата:
On 10/26/21 1:04 AM, Arun Suresh wrote:
>
> Dear PG experts,
>
> We have a tenant discriminator column in our tables to enable storage of 
> data from multiple
> tenants. This column is also part of the composite primary key.
> The customers may request creation of a tenant copy, which means if they 
> currently have a
> tenant id "ABCD", they would like to copy all data "where tenant_id = 
> 'ABCD'" to a new copy tenant "XYZ".
> The copy must also be done on the same table.
>
> Current approach taken is to build a query like below:
> INSERT INTO mytable (col1, col2, col3, col4) SELECT col1, 'XYZ', col3, 
> col4 FROM mytable WHERE col2 = 'ABCD'
>
> Is there a better way to do this?
> There could be other tables with foreign key reference, would a simple 
> ordering of the copy based on table relationship suffice?

That would be my first thought.  Of course, FK dependencies quickly become a 
rat's nest, so beware.

> Also if the table has millions of records, what are the things to take 
> care of?

Take care of?

Anyway, for millions of rows, I might use COPY instead of INSERT (depending 
on how many millions, how many indices, how large the rows, how fast the 
machine, etc.

-- 
Angular momentum makes the world go 'round.



Re: How to copy rows into same table efficiently

От
"David G. Johnston"
Дата:
On Tue, Oct 26, 2021 at 2:06 AM Ron <ronljohnsonjr@gmail.com> wrote:
Anyway, for millions of rows, I might use COPY instead of INSERT (depending
on how many millions, how many indices, how large the rows, how fast the
machine, etc.


I don't imagine using COPY TO to write the data to a file and then COPY FROM to import it is going to be an improvement over INSERT-SELECT.

Now, if you can perform the COPY TO on a replica and then only run the COPY FROM on the primary that might be worth it.  Avoiding the I/O for the read on the primary would be a big win.

David J.

Re: How to copy rows into same table efficiently

От
Michael Lewis
Дата:
On Tue, Oct 26, 2021 at 1:07 AM Arun Suresh <arun.suresh.303@gmail.com> wrote:
Current approach taken is to build a query like below:
INSERT INTO mytable (col1, col2, col3, col4) SELECT col1, 'XYZ', col3, col4 FROM mytable WHERE col2 = 'ABCD'

Is there a better way to do this?
There could be other tables with foreign key reference, would a simple ordering of the copy based on table relationship suffice?

The other concern I would have is having a rolling view of the data in default read committed mode. If you copy data from a primary table (with other tables having fkey to that table coming later in the process), then you may need to use some created_on < script_start_time, else when you try to insert dependent rows where the record in the primary table did not exist yet when you started your process and your rows in dependent table want to reference that primary table row... error.

Or, use REPEATABLE READ, but WAL buildup may be a concern.