Re: Is there any limit on the number of rows to import using copy command

From: Adrian Klaver
Subject: Re: Is there any limit on the number of rows to import using copy command
Date:
Msg-id: e8fb9b99-40a2-41eb-8932-0959db8356c6@aklaver.com
In reply to: Re: Is there any limit on the number of rows to import using copy command  ("sivapostgres@yahoo.com" <sivapostgres@yahoo.com>)
List: pgsql-general
On 7/24/25 16:59, sivapostgres@yahoo.com wrote:
> 1.  Testcase.  Created a new database, modified the triggers (split into
> three), populated the required master data and lookup tables.  Then
> transferred 86420 records.  Checked whether all 86420 records were
> inserted into table1 and whether the trigger created the required
> records in table2.  Yes, it did.
> 
> 2.  In the test case above, the total time taken to insert 86420 records
> is only 1.15 min.  Earlier (before splitting the triggers) we waited
> more than 1.5 hrs the first time and 2.5 hrs the second time, with no
> records inserted.
> 
> 3.  Regarding moving the logic to a procedure: won't the trigger work?
> Will it be a burden for 86420 records?  It works if we insert a few
> thousand records.  After splitting the trigger function, it works for
> 86420 records.  Are triggers overhead even for 100000 records?  In the
> production system, the same (single) trigger works with 3 million
> records.  There might be better alternatives to triggers, but triggers
> should also work, IMHO.

Reread this post in the thread, from Laurenz Albe:

https://www.postgresql.org/message-id/de08fd016dd9c630f65c52b80292550e0bcdea4c.camel%40cybertec.at
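
As a minimal sketch of that kind of per-event split, with hypothetical
names (an insert-only path from table1 into table2, instead of one
function branching on TG_OP for all three events):

-- Hypothetical schema; the point is one small function per event.
CREATE FUNCTION table1_insert_fn() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    INSERT INTO table2 (id, payload) VALUES (NEW.id, NEW.payload);
    RETURN NEW;
END;
$$;

CREATE TRIGGER table1_insert_trg
    AFTER INSERT ON table1
    FOR EACH ROW EXECUTE FUNCTION table1_insert_fn();

-- Separate, equally small functions and triggers handle UPDATE and
-- DELETE, so the bulk-insert path never evaluates the other branches.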

> 
> 4.  Staging tables.  Yes, I have done that in another case, where there
> was a need to add / transform data for a few more columns.  It worked
> like a charm.  In this case, since there was no need for any other
> calculation (transformation), just column-to-column matching, I thought
> the COPY command would do.

There is a transformation: you are moving data to another table. That is
overhead, especially if the triggers are not optimized.
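
A sketch of the staging approach (table and column names and the
conflict target are hypothetical; adjust to your schema):

-- Load into a temp staging table with no triggers or indexes, then
-- move the rows set-wise instead of firing a row trigger 86420 times.
CREATE TEMP TABLE staging (LIKE table1 INCLUDING DEFAULTS);

-- Server-side COPY shown; use psql's \copy for a client-side file.
COPY staging FROM '/path/to/data.csv' WITH (FORMAT csv, HEADER);

INSERT INTO table1
SELECT * FROM staging
ON CONFLICT (id) DO NOTHING;      -- assumes a unique key on table1(id)

INSERT INTO table2 (id, payload)  -- the work the trigger was doing
SELECT id, payload FROM staging;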

> 
> Before splitting the trigger into three, we tried
> 1.  Transferring data using DataWindow / PowerBuilder (that's the tool
> we use to develop our front end).  With the same single trigger, it
> took a few hours (more than 4 hours; exact time not noted down) to
> transfer the same 86420 records.  (DataWindow fires an INSERT statement
> for every row.)  It works, but the time taken is not acceptable.

Row-by-row INSERTs are going to be slow, especially if the tool is doing
a commit for each one, which I suspect it is. Check the Postgres logs.
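
Roughly the difference, as a sketch with hypothetical columns (the first
form is what a tool in autocommit mode effectively issues):

-- Autocommit: every statement is its own transaction, so every row
-- pays a full commit, including a WAL flush.
INSERT INTO table1 (id, payload) VALUES (1, 'a');
INSERT INTO table1 (id, payload) VALUES (2, 'b');

-- One explicit transaction: a single commit for the whole batch.
BEGIN;
INSERT INTO table1 (id, payload) VALUES (1, 'a');
INSERT INTO table1 (id, payload) VALUES (2, 'b');
COMMIT;

-- Multi-row VALUES lists or COPY cut the statement count down further.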

> 
> 2.  Next, we split the larger CSV file into 8, the first seven files
> containing 10,000 records each and the last one 16420 records.  The
> COPY command worked.  It works, but the time taken to split the file is
> not acceptable.  We wrote a batch file to split the larger CSV file; we
> felt a batch file was easier to automate the whole process from
> PowerBuilder.

I find most GUI tools create extra steps and overhead. My preference is
simpler tools, e.g. using the Python csv module to batch/stream rows
that the psycopg Postgres driver can insert or copy into the database.
See:

https://www.psycopg.org/psycopg3/docs/basic/copy.html
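
Under the hood, that driver-level copy support wraps a plain COPY ...
FROM STDIN statement and streams the rows over the wire; as a sketch,
with hypothetical column names:

-- What the driver issues; the row data then follows over the COPY
-- protocol, without an individual INSERT per row.
COPY table1 (id, payload) FROM STDIN WITH (FORMAT csv);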

> 
> 3.  What we observed here is that the INSERT statement succeeds and
> the COPY command fails if the records exceed a certain number.  We
> haven't arrived at the exact number of rows at which the COPY command
> fails.
> 
> Will do further work after my return from holiday.
> 
> Happiness Always
> BKR Sivaprakash
> 
> 
> 
> On Thursday 24 July, 2025 at 08:18:07 pm IST, Adrian Klaver 
> <adrian.klaver@aklaver.com> wrote:
> 
> 
> On 7/24/25 05:18, sivapostgres@yahoo.com <mailto:sivapostgres@yahoo.com> 
> wrote:
>  > Thanks Merlin, Adrian, Laurenz
>  >
>  > As a testcase, I split the trigger function into three, one each for
>  > insert, update, delete, each called from a separate trigger.
>  >
>  > IT WORKS!
> 
> It worked before, it just slowed down as your cases got bigger. You
> need to provide more information on what test case you used and how
> you define "worked".
> 
>  >
>  > Shouldn't we have one trigger function for all three trigger
>  > events?  Is it prohibited for bulk inserts like this?
> 
> No. Triggers are overhead and they add to the processing that needs to
> be done to move the data into the table. Whether that is an issue is a
> case-by-case determination.
> 
>  >
>  > I tried this in pgAdmin only; will complete the testing from the
>  > program we are developing after my return from holiday.
> 
>  From Merlin Moncure's post:
> 
> "* reconfiguring your logic to a procedure can be a better idea; COPY
> your data into some staging tables (perhaps temp, and indexed), then
> write to various tables with joins, upserts, etc."
> 
> I would suggest looking into implementing the above.
> 
> 
>  >
>  > Happiness Always
>  > BKR Sivaprakash
> 
>  >
> 
> 
> 
> -- 
> Adrian Klaver
> adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com


