> On 19 Sep 2017, at 15:47, Job <Job@colliniconsulting.it> wrote:
>
> Hi guys,
>
> we need to insert a large amount of data (about 10-20 million rows) from one table into another (PostgreSQL 9.6.1)
> without locking the destination table.
> Pg_bulkload is the fastest way but it locks the table.
>
> Are there other ways?
> Classic "COPY" from?
We do something like that using a staging table that we load into initially (although not in bulk; data arrives in our
staging table in batches of 5k to 100k rows), and then we transfer the data using insert/select with "on conflict do".
That data transfer within PG takes a couple of minutes on our rather limited VM for a wide 37M-row table (~37GB on
disk). It only locks the staging table (during the initial bulk load) and the rows in the master table that are
currently being altered (during the insert/select).
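A minimal sketch of that transfer step, with hypothetical table and column names:

    -- Upsert everything from the staging table into the master table.
    -- Only the master rows that actually conflict get row-level locks.
    INSERT INTO master (id, payload)
    SELECT id, payload
    FROM   staging
    ON CONFLICT (id) DO UPDATE
       SET payload = EXCLUDED.payload;

    -- Afterwards the staging table can be emptied for the next batch.
    TRUNCATE staging;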
If your data source is a file in a format supported by COPY, then you can use COPY to do the initial bulk load into
the staging table.
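For example, assuming a CSV file (the path and options are illustrative):

    -- Server-side bulk load into the staging table; the master
    -- table is not touched at this point.
    COPY staging (id, payload)
    FROM '/path/to/data.csv'
    WITH (FORMAT csv, HEADER true);

(psql's \copy works the same way if the file lives on the client side.)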
Some benefits of this 2-stage approach are that it leaves room to manipulate the data (type conversions, for example)
and that it can handle the scenario where a matching target record already exists in the master table. In our case, we
convert character fields to varchar (which saves a lot of space(s)).
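That conversion can simply be folded into the insert/select; for instance (again with made-up column names):

    -- Casting char(n) to varchar drops the blank padding,
    -- which is where the space savings come from.
    INSERT INTO master (id, name)
    SELECT id, name::varchar
    FROM   staging
    ON CONFLICT (id) DO NOTHING;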
Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.
--