Discussion: Re: Is there any limit on the number of rows to import using copy command
Re: Is there any limit on the number of rows to import using copy command
From: "sivapostgres@yahoo.com"
Date:
Tried on PostgreSQL 11.11 and PostgreSQL 15.2, on Windows 10.
Here we try to transfer data from one database to another (remote) database.
The tables have record counts ranging from 85,000 to 3,600,000, along with smaller tables.
No issues while transferring the smaller tables.
I take here one particular table [table1], which has 85,000 records.
The table has a primary key, foreign key(s), and triggers. A trigger updates another table [table2].
Table2 has two triggers: one to compute a closing value, the other to delete the row if the closing value is zero.
1. Transfer the data from the source database to a CSV file. 85,000 records transferred. No issues.
2. Transfer the file to the remote location. No issues.
3. Transfer the contents of the file to the table using the COPY FROM command. - Fails when we try to transfer all 85,000 records at once.
The COPY FROM command is:
Copy public.table1 From 'E:\temp\file1.csv' (FORMAT CSV, DELIMITER ',', HEADER TRUE)
The above command succeeds when:
1. The trigger on table1 is disabled, with all other constraints on.
2. The number of rows is 16,000 or less, with the trigger enabled. We haven't tried higher row counts.
The above command appears to hang when:
1. We try to transfer all 85,000 rows at once, with the trigger and the other constraints on table1 enabled. We waited 1.5 hrs the first time and 2.5 hrs the second time before cancelling the operation.
I read in the documentation that the fastest way to transfer data is the COPY command, and I couldn't find any limit on the amount of data it can transfer. One should easily be able to transfer millions of rows with it.
Here are the triggers.
Trigger function, which is called from table1 on AFTER INSERT, UPDATE, DELETE:
Declare
    variety_code      character varying(30);
    company_code      character varying(10);
    branch_code       character varying(10);
    location_fk       character varying(32);
    opening_quantity  numeric(17,3);
    modified_user     character varying(50);
    modified_date     timestamp without time zone;
    modified_computer character varying(50);
BEGIN
If (TG_OP = 'INSERT') Then
company_code = NEW.companycode ;
branch_code = NEW.branchcode ;
location_fk = NEW.locationfk ;
variety_code = NEW.barcode ;
opening_quantity = Coalesce(NEW.openingquantity,0);
End If ;
If (TG_OP = 'UPDATE') Then
company_code = NEW.companycode ;
branch_code = NEW.branchcode ;
location_fk = NEW.locationfk ;
variety_code = NEW.barcode ;
opening_quantity = Coalesce(NEW.openingquantity,0) - OLD.openingquantity ;
modified_user = NEW.modifieduser ;
modified_date = NEW.modifieddate ;
modified_computer = NEW.modifiedcomputer ;
End If ;
If (TG_OP = 'DELETE') Then
company_code = OLD.companycode ;
branch_code = OLD.branchcode ;
location_fk = OLD.locationfk ;
variety_code = OLD.barcode ;
opening_quantity = OLD.openingquantity * -1 ;
modified_user = OLD.modifieduser ;
modified_date = OLD.modifieddate ;
modified_computer = OLD.modifiedcomputer ;
End If ;
If (Select Count(*)
From table2
WHERE companycode = company_code
AND branchcode = branch_code
AND locationfk = location_fk
AND barcode = variety_code ) > 0 Then
BEGIN
UPDATE table2
SET openingquantity = Coalesce(openingquantity,0) + opening_quantity,
modifieduser = modified_user, modifieddate = modified_date, modifiedcomputer = modified_computer
WHERE companycode = company_code
AND branchcode = branch_code
AND locationfk = location_fk
AND barcode = variety_code ;
END ;
Else
BEGIN
INSERT INTO public.table2(
barcodestockpk, companycode, branchcode, locationfk, barcode, baleopenheaderfk, lrentryheaderfk, lrentrydetailfk,
baleopendetailfk, lrentrydetailsequencenumber, baleopendetailsequencenumber, barcodeopeningstockfk, sequencenumber,
varietyfk, brandfk, modelfk, patternfk, shadefk, materialfk, finishfk, sizefk,
meterbreakup, unit, barcodesinglebulk, barcodeitem, effectiverate,
openingquantity, barcodedquantity, purchasereturnquantity, salesquantity, salescancellationquantity,
salesreturnquantity, salesreturncancellationquantity, stockissuequantity, stockreceiptquantity, locationissuequantity, locationreceiptquantity,
branchissuequantity, branchreceiptquantity, closingstock, salesrate, mrprate, labelrate, ratecode, discountpercent, discountrate,
defectiveitem, locked, insertuser, insertdate, insertcomputer,
modifieduser, modifieddate, modifiedcomputer, wsrate, reversecalculation, hsnnumber)
VALUES ( replace(uuid_generate_v4()::text, '-', ''), company_code , branch_code, location_fk, variety_code, Null, Null,Null,
Null, Null, Null, NEW.barcodeopeningstockpk, NEW.Sequencenumber,
NEW.varietyfk, NEW.brandfk, NEW.modelfk, NEW.patternfk, NEW.shadefk, NEW.materialfk, NEW.finishfk, NEW.sizefk,
NEW.meterbreakup, NEW.unit, NEW.barcodesinglebulk, NEW.barcodeitem, NEW.effectiverate,
opening_quantity, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, NEW.salesrate, NEW.mrprate, NEW.labelrate, NEW.ratecode, 0, 0,
'N', NEW.locked, NEW.insertuser, NEW.insertdate, NEW.insertcomputer,
NEW.modifieduser, NEW.modifieddate, NEW.modifiedcomputer, NEW.wsrate, 'N', NEW.hsnnumber);
END ;
End IF ;
RETURN NEW ;
END ;
Trigger functions on Table2:
First trigger, which calculates the closing stock, fires before insert and update.
BEGIN
If (TG_OP = 'INSERT') Then
NEW.closingstock = coalesce(NEW.barcodedquantity,0) +
coalesce(NEW.openingquantity,0) -
coalesce(NEW.salesquantity,0) +
coalesce(NEW.salesreturnquantity,0) +
coalesce(NEW.salescancellationquantity,0) -
coalesce(NEW.salesreturncancellationquantity,0) -
coalesce(NEW.purchasereturnquantity,0) -
coalesce(NEW.stockissuequantity,0) +
coalesce(NEW.stockreceiptquantity,0) -
coalesce(NEW.locationissuequantity,0) +
coalesce(NEW.locationreceiptquantity,0) -
coalesce(NEW.branchissuequantity,0) +
coalesce(NEW.branchreceiptquantity,0) ;
Return New ;
End If ;
If (TG_OP = 'UPDATE') Then
NEW.closingstock = coalesce(NEW.barcodedquantity,0) +
coalesce(NEW.openingquantity,0) -
coalesce(NEW.salesquantity,0) +
coalesce(NEW.salesreturnquantity,0) +
coalesce(NEW.salescancellationquantity,0) -
coalesce(NEW.salesreturncancellationquantity,0) -
coalesce(NEW.purchasereturnquantity,0) -
coalesce(NEW.stockissuequantity,0) +
coalesce(NEW.stockreceiptquantity,0) -
coalesce(NEW.locationissuequantity,0) +
coalesce(NEW.locationreceiptquantity,0) -
coalesce(NEW.branchissuequantity,0) +
coalesce(NEW.branchreceiptquantity,0) ;
Return New ;
End If ;
END
Second trigger, which deletes the row if the closing value is zero, fires next.
Begin
If Coalesce(NEW.openingquantity,0) = 0 and
Coalesce(NEW.barcodedquantity,0) = 0 and
Coalesce(NEW.salesquantity,0) = 0 and
Coalesce(NEW.salescancellationquantity,0) = 0 and
Coalesce(NEW.salesreturnquantity,0) = 0 and
Coalesce(NEW.salesreturncancellationquantity,0) = 0 and
Coalesce(NEW.purchasereturnquantity,0) = 0 and
Coalesce(NEW.stockissuequantity,0) = 0 and
Coalesce(NEW.stockreceiptquantity,0) = 0 and
Coalesce(NEW.locationissuequantity,0) = 0 and
Coalesce(NEW.locationreceiptquantity,0) = 0 and
Coalesce(NEW.branchissuequantity,0) = 0 and
Coalesce(NEW.branchreceiptquantity,0) = 0 Then
Delete
From tx_barcode_stock
Where barcodestockpk = new.barcodestockpk ;
End If ;
Return New ;
END
Any (other) suggestion to transfer the data successfully is really appreciated.
Happiness Always
BKR Sivaprakash
On Wed, 2025-07-23 at 08:50 +0000, sivapostgres@yahoo.com wrote:
> Tried in PostgreSQL 11.11 , PostgreSQL 15.2 in Windows 10
Both of these choices are unsavory. Don't use the unsupported v11,
and use 15.13 with v15.
> Here we try to transfer data from one database to another (remote) database.
>
> Tables do have records ranging from 85000 to 3600000 along with smaller sized tables.
> No issues while transferring smaller sized tables.
>
> I here take one particular table [table1] which has 85000 records.
> The table got Primary Key, Foreign Key(s), Triggers. Trigger updates another table [table2]
> Table2 have 2 triggers, one to arrive a closing value and other to delete, if the closing value is zero.
>
> 1. Transfer the data from source database to a csv file. 85000 records transferred. No issues.
> 2. Transfer the file to the remote location. No issues.
> 3. Transfer the contents of the file to the table using Copy From command. - Fails when try to transfer all the 85000 records at once.
>
> Copy from command is
>
> Copy public.table1 From 'E:\temp\file1.csv' (FORMAT CSV, DELIMITER ',', HEADER TRUE)
>
> The above command succeeds, when
> 1. The trigger in Table1 is disabled with all other constraints on.
> 2. The no. of rows is within 16000 or less, with Trigger enabled. We haven't tried with higher no of rows.
>
> The above command goes on infinite loop, when
> 1. We try to transfer all 85000 rows at once, with Trigger and other constraints in table1 enabled.
> We waited for 1.5 hrs first time and 2.5 hrs second time before cancelling the operation.
>
> I read in the documentation that the fastest way to transfer data is to use Copy command.
> And I couldn't find any limit in transferring data using that command.
> One could easily transfer millions of rows using this command.
There is no limit for the number of rows that get created by a single COPY.
You should research why processing fails for higher row counts:
- Are there any messages on the client or the server side?
- Is the backend process on the server busy (consuming CPU) when processing hangs?
- Do you see locks or other wait events in "pg_stat_activity"?
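To answer those questions from a second session while the COPY appears stuck, a query along these lines shows each active backend's state and wait event (a sketch; the column set is standard in "pg_stat_activity" on supported versions):

```sql
-- Run from a second session while the COPY appears to hang.
SELECT pid,
       state,
       wait_event_type,              -- e.g. Lock, LWLock, IO; NULL means running on CPU
       wait_event,
       now() - query_start AS running_for,
       left(query, 60)     AS query
FROM pg_stat_activity
WHERE state <> 'idle'
  AND pid <> pg_backend_pid();      -- exclude this monitoring session
```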
> Here are the triggers.
>
> Trigger function, which is called from Table1 on After Insert, Update, Delete
One thing you could try is a BEFORE trigger. That should work the same, unless
there are foreign key constraints. Do you see high memory usage or paging for
the backend process when the COPY hangs?
> [...]
> If (Select Count(*)
> From table2
> WHERE companycode = company_code
> AND branchcode = branch_code
> AND locationfk = location_fk
> AND barcode = variety_code ) > 0 Then
> [...]
That may well be slow, particularly without a matching index.
A better way to write that would be
IF EXISTS (SELECT 1 FROM table2
WHERE ...)
because that can stop processing after the first match.
It still needs an index for fast processing.
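Filled in with the column names from the posted trigger, the rewrite could look like this (a sketch, not tested against the original schema; the index name is illustrative):

```sql
-- EXISTS can stop at the first matching row; COUNT(*) must count them all.
IF EXISTS (SELECT 1
           FROM table2
           WHERE companycode = company_code
             AND branchcode  = branch_code
             AND locationfk  = location_fk
             AND barcode     = variety_code) THEN
    UPDATE table2
       SET openingquantity  = coalesce(openingquantity, 0) + opening_quantity,
           modifieduser     = modified_user,
           modifieddate     = modified_date,
           modifiedcomputer = modified_computer
     WHERE companycode = company_code
       AND branchcode  = branch_code
       AND locationfk  = location_fk
       AND barcode     = variety_code;
END IF;

-- The supporting index for that lookup:
CREATE INDEX IF NOT EXISTS table2_lookup_idx
    ON table2 (companycode, branchcode, locationfk, barcode);
```

An alternative with the same effect is to run the UPDATE unconditionally and check the plpgsql FOUND variable afterwards, which avoids looking the row up twice.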
Yours,
Laurenz Albe
Re: Is there any limit on the number of rows to import using copy command
From: "sivapostgres@yahoo.com"
Date:
Thanks Laurenz Albe,
1. I tried running the COPY FROM command from pgAdmin.
2. I also queried pg_stat_activity in another tab [pgAdmin].
What I observed:
1. In about 2 min, in the pgAdmin Dashboard, the colour changed to orange for that particular pid.
2. After a few seconds, the colour changed again to red.
3. The state column in both the Dashboard and pg_stat_activity shows 'active'.
4. No messages relevant to this pid in the postgres log files.
5. CPU usage is normal, hovering around 36% overall and 15-18% for the PostgreSQL server, from the beginning and even after 3 min.
6. We didn't run any other application on that machine.
7. There is a unique index on table2, so that Select Count(*) query will match at most one row.
8. There were no records in the target table when the transfer started. This transfer is the first batch for that particular table.
9. Once the colour turned red, I could cancel the query execution in pgAdmin, which immediately stops the execution.
10. I could not see any locks or blocking pids.
Happiness Always
BKR Sivaprakash
On Wednesday 23 July, 2025 at 03:46:40 pm IST, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> [...]
You might also consider backup/restore. It appears you're not concerned with data being inserted into the source table after the backup is complete? If so, you can then easily sync the two post-restore.
On Wed, 2025-07-23 at 11:55 +0000, sivapostgres@yahoo.com wrote:
> 1. I tried running Copy From command from PGAdmin.
> 2. I ran pg_stat_activity also in another tab [ PGAdmin ].
> [...]
> 9. Once color turned into Red, I could cancel the query execution in PGAdmin, which immediately stops the execution.
> 10. I could not see any locks or blocking pids.
As far as I know, red in pgAdmin (I don't use it) *is* the indication for a wait event.
So look again at what the columns "wait_event_type" and "wait_event" in "pg_stat_activity" show.
Yours,
Laurenz Albe
On 7/23/25 01:50, sivapostgres@yahoo.com wrote:
> Tried in PostgreSQL 11.11 , PostgreSQL 15.2 in Windows 10
> [...]
> The above command succeeds, when
> 1. The trigger in Table1 is disabled with all other constraints on.
> 2. The no. of rows is within 16000 or less, with Trigger enabled. We haven't tried with higher no of rows.
Why not?
> The above command goes on infinite loop, when
> 1. We try to transfer all 85000 rows at once, with Trigger and other constraints in table1 enabled. We waited for 1.5 hrs first time and 2.5 hrs second time before cancelling the operation.
Try with the triggers disabled.
> I read in the documentation that the fastest way to transfer data is to use Copy command. And I couldn't find any limit in transferring data using that command. One could easily transfer millions of rows using this command.
It is. I have used it for much larger datasets than 85000 rows, and it completed in less time. As an example, using DuckDB it took the NYC taxi data set yellow_tripdata_2023-09.parquet, transformed it, and loaded it using COPY in 5.4 secs for ~2.8 million rows.
FYI, BEGIN in plpgsql is not the same as in SQL. In plpgsql it represents a block. I don't think you need the BEGIN/END around the UPDATE and INSERT queries. See https://www.postgresql.org/docs/current/plpgsql-structure.html for more information.
> Any (other) suggestion to transfer successfully is really appreciated.
--
Adrian Klaver
adrian.klaver@aklaver.com
On Wed, Jul 23, 2025 at 2:51 AM sivapostgres@yahoo.com <sivapostgres@yahoo.com> wrote:
> Tried in PostgreSQL 11.11 , PostgreSQL 15.2 in Windows 10
> [...]
> I read in the documentation that the fastest way to transfer data is to use Copy command. And I couldn't find any limit in transferring data using that command. One could easily transfer millions of rows using this command.
Most likely, you are getting yourself into trouble with the trigger dependencies. Triggers are powerful, but they can also be dangerous, and this could be a 'wrong tool for the job' situation.
Here are some general tips:
* pg_trigger_depth(): can tell you if trigger A calls trigger B and back to trigger A, etc. You can use it with RAISE NOTICE, and also use it to guard execution in CREATE TRIGGER.
* Reconfiguring your logic to statement-level triggers can be a good idea. This can take some thinking, but it can be much more efficient for bulk processing, since trigger execution can be deferred until the load completes. (One trick is to use now() to check for records inserted, since it is stable through the transaction.)
* Reconfiguring your logic to a procedure can be a better idea; COPY your data into some staging tables (perhaps temp, and indexed), then write to the various tables with joins, upserts, etc.
merlin
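A minimal sketch of the staging-table variant of that last tip, reusing the table and file names from this thread (the staging table name is illustrative and the details are untested):

```sql
-- 1. COPY into a trigger-free temp staging table first: no per-row trigger work.
CREATE TEMP TABLE staging_table1 (LIKE public.table1 INCLUDING DEFAULTS);

COPY staging_table1 FROM 'E:\temp\file1.csv' (FORMAT CSV, DELIMITER ',', HEADER TRUE);

-- 2. Move the rows into the real table in one set-based statement.
INSERT INTO public.table1
SELECT * FROM staging_table1;

-- 3. With the table1 trigger disabled for the load, maintain table2 in one
--    pass instead of once per row, e.g. by aggregating opening quantities
--    per key and upserting them (the exact statements depend on the
--    table2 logic shown earlier in the thread).
```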
Re: Is there any limit on the number of rows to import using copy command
From: "sivapostgres@yahoo.com"
Date:
Thanks Merlin, Adrian, Laurenz,
As a test case, I split the trigger function into three, one each for insert, update, and delete, each called from a separate trigger.
IT WORKS!
Shouldn't we have one trigger function for all three trigger events? Is it prohibited for bulk inserts like this?
I tried this in pgAdmin only; I will complete the testing from the program we are developing after my return from holiday.
Happiness Always
BKR Sivaprakash
On Wednesday 23 July, 2025 at 11:20:03 pm IST, Merlin Moncure <mmoncure@gmail.com> wrote:
> [...]
<sivapostgres@yahoo.com> wrote:
>
> Tried in PostgreSQL 11.11 , PostgreSQL 15.2 in Windows 10
>
> Here we try to transfer data from one database to another (remote) database.
>
> Tables do have records ranging from 85000 to 3600000 along with smaller sized tables.
> No issues while transferring smaller sized tables.
>
> I here take one particular table [table1] which has 85000 records.
> The table got Primary Key, Foreign Key(s), Triggers. Trigger updates another table [table2]
> Table2 have 2 triggers, one to arrive a closing value and other to delete, if the closing value is zero.
>
> 1. Transfer the data from source database to a csv file. 85000 records transferred. No issues.
> 2. Transfer the file to the remote location. No issues.
> 3. Transfer the contents of the file to the table using Copy From command. - Fails when try to transfer all the 85000 records at once.
>
> Copy from command is
>
> Copy public.table1 From 'E:\temp\file1.csv' (FORMAT CSV, DELIMITER ',', HEADER TRUE)
>
> The above command succeeds, when
> 1. The trigger in Table1 is disabled with all other constraints on.
> 2. The no. of rows is within 16000 or less, with Trigger enabled. We haven't tried with higher no of rows.
>
> The above command goes on infinite loop, when
> 1. We try to transfer all 85000 rows at once, with Trigger and other constraints in table1 enabled. We waited for 1.5 hrs first time and 2.5 hrs second time before cancelling the operation.
>
> I read in the documentation that the fastest way to transfer data is to use Copy command. And I couldn't find any limit in transferring data using that command. One could easily transfer millions of rows using this command.
Most likely, you are getting yourself into trouble with the trigger
dependencies. Triggers are powerful, but they can also be dangerous, and
this could be a 'wrong tool for the job' situation.
Here are some general tips:
* pg_trigger_depth(): can tell you if trigger A calls trigger B and
back to trigger A, etc. You can use it with RAISE NOTICE, and also
use it to guard execution in the WHEN clause of CREATE TRIGGER
* reconfiguring your logic to statement-level triggers can be a good
idea. This can take some thinking, but can be much more efficient for
bulk processing, since trigger execution can be deferred until the load
completes. (One trick is to use now() to check for records inserted,
since it is stable throughout the transaction.)
* reconfiguring your logic to a procedure can be a better idea; COPY
your data into some staging tables (perhaps temp, and indexed), then
write to various tables with joins, upserts, etc.
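A minimal sketch of the staging-table approach from the last tip. All table and column names here are hypothetical: location_fk and a closing value are guessed from the trigger variable declarations earlier in the thread, qty and the unique constraint on table2(location_fk) are assumptions.

```sql
BEGIN;

-- Load the CSV into an unconstrained staging table first.
CREATE TEMP TABLE staging (LIKE public.table1 INCLUDING DEFAULTS);

COPY staging FROM 'E:\temp\file1.csv' (FORMAT CSV, HEADER TRUE);

-- Move the rows into the real table in one pass
-- (the row trigger could be disabled during this step).
INSERT INTO public.table1 SELECT * FROM staging;

-- Do the table2 maintenance set-based instead of per row:
-- an upsert that accumulates the closing value, assuming a
-- unique constraint on table2(location_fk).
INSERT INTO public.table2 (location_fk, closing_value)
SELECT location_fk, sum(qty)
FROM staging
GROUP BY location_fk
ON CONFLICT (location_fk)
DO UPDATE SET closing_value = table2.closing_value + EXCLUDED.closing_value;

-- Equivalent of the "delete when closing value is zero" trigger.
DELETE FROM public.table2 WHERE closing_value = 0;

COMMIT;
```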
merlin
Re: Is there any limit on the number of rows to import using copy command
From
"sivapostgres@yahoo.com"
Date:
Those two columns have null values when the color turns into Red.
And the status column shows 'active' only. All those three column values are the same (null, null, active) even after 2 hours!
It works when I split the trigger function into three, one each for insert, delete and update, called from three different triggers. I've replied to that separately.
Happiness Always
BKR Sivaprakash
On Wednesday 23 July, 2025 at 07:41:06 pm IST, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Wed, 2025-07-23 at 11:55 +0000, sivapostgres@yahoo.com wrote:
> 1. I tried running Copy From command from PGAdmin.
> 2. I ran pg_stat_activity also in another tab [ PGAdmin ].
>
> What I observed,
> 1. In about 2 min, in the Dashboard of PGAdmin, the colour changed to Orange for that particular pid.
> 2. After few seconds, the colour again changed to Red.
> 3. The state column in both the Dashboard and pg_stat_activity.state shows 'active'.
> 4. No messages relevant to this pid in postgres' log files
> 5. CPU usage is normal, hovering around 36% overall, 15-18% for postgresql server, from the beginning and even after 3 min.
> 6. We didn't run any other application in that machine.
> 7. Unique Index is there in table2 which will return only one row for that Select Count(*) query.
> 8. No record is there in the target table when transfer started. This transfer is the first batch for that particular table.
> 9. Once the colour turned Red, I could cancel the query execution in PGAdmin, which immediately stops the execution.
> 10. I could not see any locks or blocking pids.
As far as I know, red in pgAdmin (I don't use it) *is* the indication for a wait event.
So look again at what the columns "wait_event_type" and "wait_event" in "pg_stat_activity" show.
Yours,
Laurenz Albe
Re: Is there any limit on the number of rows to import using copy command
From
"sivapostgres@yahoo.com"
Date:
1. Testcase. Created a new database, modified the triggers (split into three), populated the required master data and lookup tables. Then transferred 86420 records. Checked whether all the 86420 records were inserted in table1 and also whether the trigger created the required records in table2. Yes, it did.
2. In the test case above, the total time taken to insert 86420 records was 1.15 min only. Earlier (before splitting the triggers) we waited for more than 1.5 hrs the first time and 2.5 hrs the second time with no records inserted.
3. Regarding moving the logic to a procedure. Won't the trigger work? Will it be a burden for 86420 records? It's working if we insert a few thousand records. After the split of the trigger function, it's working for 86420 records. Are triggers overhead for handling even 100000 records? In the production system, the same (single) trigger is working with 3 million records. There might be better alternatives to triggers, but triggers should also work. IMHO.
4. Staging tables. Yes, I have done that in another case, where there was a need to add / transform data for a few more columns. It worked like a charm. In this case, since there was no need for any other calculations (transformation), and with just column-to-column matching, I thought the copy command would do.
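For reference, the trigger split described above (one three-event trigger replaced by three per-event triggers) might look roughly like this. The trigger and function names are hypothetical; the original function presumably branched on TG_OP, which the split functions no longer need to do.

```sql
-- Hypothetical sketch: replace a single AFTER INSERT OR UPDATE OR DELETE
-- trigger with three per-event triggers, each calling its own function.
DROP TRIGGER IF EXISTS trg_table1_all ON public.table1;

CREATE TRIGGER trg_table1_ins AFTER INSERT ON public.table1
    FOR EACH ROW EXECUTE FUNCTION table1_after_insert();

CREATE TRIGGER trg_table1_upd AFTER UPDATE ON public.table1
    FOR EACH ROW EXECUTE FUNCTION table1_after_update();

CREATE TRIGGER trg_table1_del AFTER DELETE ON public.table1
    FOR EACH ROW EXECUTE FUNCTION table1_after_delete();
```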
Before splitting the trigger into three, we tried
1. Transferring data using DataWindow / PowerBuilder (that's the tool we use to develop our front end). With the same single trigger, it took a few hours (more than 4 hours, exact time not noted down) to transfer the same 86420 records. (DataWindow fires an insert statement for every row.) Works, but the time taken is not acceptable.
2. Next, we split the larger csv file into 8, with each file containing 10,000 records and the last one 16420 records. The Copy command worked, but the time taken to split the file is not acceptable. We wrote a batch file to split the larger csv file. We felt a batch file is easier to automate for the whole process using PowerBuilder.
3. What we observed here is that the insert statement succeeds and the copy command fails if the records exceed a certain number. We haven't arrived at the exact number of rows at which the copy command fails.
Will do further works after my return from a holiday.
Happiness Always
BKR Sivaprakash
On Thursday 24 July, 2025 at 08:18:07 pm IST, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 7/24/25 05:18, sivapostgres@yahoo.com wrote:
> Thanks Merlin, adrain, Laurenz
>
> As a testcase, I split the trigger function into three, one each for
> insert, update, delete, each called from a separate trigger.
>
> IT WORKS!.
It worked before; it just slowed down as your cases got bigger. You need
to provide more information on what test case you used and how you
define 'worked'.
>
> Shouldn't we have one trigger function for all the three trigger
> events? Is it prohibited for bulk insert like this?
No. Triggers are overhead and they add to the processing that needs to be
done for moving the data into the table. Whether that is an issue is a
case-by-case determination.
>
> I tried this in PGAdmin only, will complete the testing from the program
> which we are developing, after my return from holiday.
From Merlin Moncure's post:
"* reconfiguring your logic to a procedure can be a better idea; COPY
your data into some staging tables (perhaps temp, and indexed), then
write to various tables with joins, upserts, etc."
I would suggest looking into implementing the above.
>
> Happiness Always
> BKR Sivaprakash
>
--
Adrian Klaver
adrian.klaver@aklaver.com
On Thu, Jul 24, 2025 at 8:00 PM sivapostgres@yahoo.com <sivapostgres@yahoo.com> wrote:
[snip]
3. Regarding moving the logic to procedure. Won't the trigger work? Will it be a burden for 86420 records? It's working, if we insert few thousand records. After split of trigger function, it's working for 86420 records. Are triggers overhead for handling even 100000 records?
What causes the overhead when a trigger fires is all the work that it has to do.
Thus, if the SELECT or UPDATE statements in the triggers do not have good index support, then _that_ is what will slow down the trigger as the number of records gets bigger. Of course, the triggers themselves are not slow; it's the statements in the trigger that are slow.
Likewise, if there are many many indices on the tables, that will slow things down too: more indices means more work!
Thus, drop as many indices as possible before loading data.
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!
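Ron's "drop as many indices as possible before loading" suggestion could look like the sketch below. The index name and column are hypothetical; note that indexes backing a primary key or unique constraint cannot be dropped this way without dropping the constraint itself.

```sql
-- Hypothetical sketch: drop secondary indexes before a bulk COPY and
-- rebuild them once afterwards, which is usually cheaper than
-- maintaining them row by row during the load.
DROP INDEX IF EXISTS table1_some_idx;

COPY public.table1 FROM 'E:\temp\file1.csv' (FORMAT CSV, HEADER TRUE);

CREATE INDEX table1_some_idx ON public.table1 (some_column);

-- Refresh planner statistics after the bulk load.
ANALYZE public.table1;
```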
On 7/24/25 16:59, sivapostgres@yahoo.com wrote:
> 1. Testcase. Created a new database, modified the triggers (split into
> three), populated required master data, lookup tables. Then transferred
> 86420 records. Checked whether all the 86420 records inserted in table1
> and also whether the trigger created the required records in table2.
> Yes, it created.
>
> 2. In the test case above, the total time taken to insert 86420 records
> is 1.15 min only. Earlier (before splitting the triggers) we waited
> for more than 1.5 hrs first time and 2.5 hrs second time with no records
> inserted.
>
> 3. Regarding moving the logic to procedure. Won't the trigger work?
> Will it be a burden for 86420 records? Are triggers overhead for
> handling even 100000 records? In production system, the same (single)
> trigger is working with 3 millions of records. There might be better
> alternatives to triggers, but triggers should also work. IMHO.

Reread this post, in the thread, from Laurenz Albe:

https://www.postgresql.org/message-id/de08fd016dd9c630f65c52b80292550e0bcdea4c.camel%40cybertec.at

> 4. Staging tables. Yes, I have done that in another case, where there
> was a need to add data / transform for few more columns. It worked like
> a charm. In this case, since there was no need for any other
> calculations (transformation), and with just column to column matching,
> I thought copy command will do.

There is a transformation: you are moving data to another table. That is
overhead, especially if the triggers are not optimized.

> Before splitting the trigger into three, we tried
> 1. Transferring data using DataWindow / PowerBuilder (that's the tool
> we use to develop our front end). With the same single trigger, it took
> few hours (more than 4 hours, exact time not noted down) to transfer the
> same 86420 records. (Datawindow fires insert statements for every
> row). Works, but the time taken is not acceptable.

INSERTs by row are going to be slow, especially if the tool is doing a
commit for each, which I suspect it is. Check the Postgres logs.

> 2. Next, we split the larger csv file into 8, with each file containing
> 10,000 records and the last one with 16420 records. Copy command
> worked. Works, but the time taken to split the file not acceptable. We
> wrote a batch file to split the larger csv file. We felt batch file is
> easier to automate the whole process using PowerBuilder.

I find most GUI tools create extra steps and overhead. My preference is
simpler tools, e.g. using the Python csv module to batch/stream rows that
the Python psycopg Postgres driver can insert or copy into the database.
See:

https://www.psycopg.org/psycopg3/docs/basic/copy.html

--
Adrian Klaver
adrian.klaver@aklaver.com