Better way to bulk-load millions of CSV records into postgres?
From | Ron Johnson |
---|---|
Subject | Better way to bulk-load millions of CSV records into postgres? |
Date | |
Msg-id | 1022013600.16609.61.camel@rebel |
Responses | Re: Better way to bulk-load millions of CSV records into postgres?, Re: Better way to bulk-load millions of CSV records into postgres? |
List | pgsql-novice |
Hi,

Currently, I've got a Python script using pyPgSQL that parses each CSV record, builds a big "INSERT INTO ... VALUES (...)" string, and then execute()s it.

top shows that this method uses postmaster with ~70% CPU utilization and python with ~15% utilization. Still, it's only inserting ~190 recs/second. Is there a better way to do this, or am I constrained by the hardware?

Instead of python and postmaster having to do a ton of data transfer over sockets, I'm wondering if there's a way to send a large number of CSV records (4000, for example) in one big chunk to a stored procedure and have the engine process it all.

Linux 2.4.18
PostgreSQL 7.2.1
python 2.1.3
csv file on /dev/hda
table on /dev/hde (ATA/100)

--
+---------------------------------------------------------+
| Ron Johnson, Jr.        Home: ron.l.johnson@cox.net      |
| Jefferson, LA  USA      http://ronandheather.dhs.org:81  |
|                                                          |
| "I have created a government of whirled peas..."         |
|    Maharishi Mahesh Yogi, 12-May-2002,                   |
!    CNN, Larry King Live                                  |
+---------------------------------------------------------+
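One way to avoid the per-row round trips described above is PostgreSQL's COPY command, which streams all rows to the server in a single statement and lets the backend do the parsing. Below is a minimal sketch of that approach, not the poster's setup: it assumes the psycopg2 driver (a different driver than the pyPgSQL mentioned in the post), a hypothetical table `mytable`, file `data.csv`, and DSN, and CSV-mode COPY, which only exists in PostgreSQL releases newer than the 7.2.1 listed here.

```python
# Sketch: bulk-load a CSV with COPY instead of one INSERT per row.
# Assumptions (not from the original post): psycopg2 driver,
# PostgreSQL with CSV-mode COPY (8.0+), hypothetical table/file/DSN.
import psycopg2

conn = psycopg2.connect("dbname=mydb")      # hypothetical connection string
cur = conn.cursor()

with open("data.csv") as f:                 # hypothetical CSV file
    # Stream the whole file through a single COPY command; the server
    # parses the CSV, so the client only shuttles bytes over the socket.
    cur.copy_expert("COPY mytable FROM STDIN WITH CSV", f)

conn.commit()
cur.close()
conn.close()
```

Even without COPY, grouping a few thousand of the existing INSERTs inside one explicit transaction (a single BEGIN/COMMIT) usually helps substantially, since each standalone statement is otherwise committed separately.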