Thread: Best way to import data in postgresql (not "COPY")
Hello,

I have a system that must import lots of data from another one each day. Our system is in postgresql and we connect to the other via ODBC.

Currently we do something like:

SELECT ... FROM ODBC source
foreach row {
    INSERT INTO postgresql
}

The problem is that this method is very slow... Does anyone have a better suggestion?

Thanks a lot in advance!

Denis
Denis BUCHER wrote:
> Currently we do something like :
>
> SELECT ... FROM ODBC source
> foreach row {
>     INSERT INTO postgresql
> }
>
> The problem is that this method is very slow...

If you can prepare your statement it will run a lot faster; no idea whether ODBC supports such things, though. So:

select ... from odbc ...;
$q = prepare('insert into pg ...');
foreach row {
    $q.params[0] = ...;
    $q.params[1] = ...;
    $q.execute;
}
commit;

(If possible, make sure you are not committing each insert statement; do them all, then commit once at the end.)

If you can't prepare, you should try to build multi-value insert statements:

insert into pgtable (col1, col2, col3) values
    ('a', 'b', 'c'),
    ('d', 'e', 'f'),
    ('g', 'h', 'i'), ...;

Or you could look into dblink; I don't know whether it would be faster.

-Andy
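Andy's multi-value INSERT idea can be sketched as follows. This is a minimal Python sketch that only builds the SQL string, so the batching and escaping can be seen without a live database; the table and column names are made up for illustration, and a real driver's parameter binding would be safer than manual quoting:

```python
def quote_literal(value):
    # Minimal SQL string-literal quoting: double any embedded single quotes.
    return "'" + str(value).replace("'", "''") + "'"

def multi_value_insert(table, columns, rows):
    # Build ONE statement covering every row, instead of one INSERT per row.
    values = ", ".join(
        "(" + ", ".join(quote_literal(v) for v in row) + ")"
        for row in rows
    )
    return "INSERT INTO %s (%s) VALUES %s;" % (table, ", ".join(columns), values)

sql = multi_value_insert("pgtable", ["col1", "col2", "col3"],
                         [("a", "b", "c"), ("d", "e", "f"), ("g", "h", "i")])
print(sql)
```

Sending one such statement per few thousand rows cuts the per-statement round trips that make the row-by-row loop slow.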
On Wed, Jul 22, 2009 at 08:24:22PM +0200, Denis BUCHER wrote:
> SELECT ... FROM ODBC source
> foreach row {
>     INSERT INTO postgresql
> }
>
> The problem is that this method is very slow...
>
> Does someone has a better suggestion ?

Using COPY [1] is normally the preferred solution for getting data into PG fast. Some languages make this easier than others; if you can generate SQL that looks like:

COPY table (col1,col2) FROM STDIN WITH CSV;
13,hello
42,"text with,comma"
\.

then you should be in luck: just send this off to the ODBC driver as is and all should be good. If you need to copy more than will fit in a string, arrange to put a few thousand rows in each batch, and generate and insert the batches one at a time inside a transaction.

Using tab-delimited mode (drop the WITH CSV) is possible, but most languages provide library code for generating CSV files, so that will probably be easier to get correct.

-- Sam  http://samason.me.uk/

[1] http://www.postgresql.org/docs/current/static/sql-copy.html
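Sam's batched COPY ... FROM STDIN WITH CSV payload can be sketched in Python, with the standard csv module doing the quoting. The table name, columns, and batch size below are illustrative assumptions, and the output is the textual command stream only; how you hand it to the driver depends on your ODBC layer:

```python
import csv
import io

def copy_batches(table, columns, rows, batch_size=5000):
    # Yield one complete "COPY ... FROM STDIN WITH CSV" command per batch,
    # with the CSV payload inline and the \. end-of-data marker at the end.
    for start in range(0, len(rows), batch_size):
        buf = io.StringIO()
        writer = csv.writer(buf, lineterminator="\n")
        writer.writerows(rows[start:start + batch_size])
        yield ("COPY %s (%s) FROM STDIN WITH CSV;\n" % (table, ", ".join(columns))
               + buf.getvalue()
               + "\\.")

for cmd in copy_batches("table", ["col1", "col2"],
                        [(13, "hello"), (42, "text with,comma")]):
    print(cmd)
```

The csv writer quotes only the fields that need it (the comma-containing value gets double quotes), matching the example payload in the mail above.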
Hello everyone,

Denis BUCHER a écrit :
> Currently we do something like :
>
> SELECT ... FROM ODBC source
> foreach row {
>     INSERT INTO postgresql
> }
>
> The problem is that this method is very slow...
> Does someone has a better suggestion ?

Thanks a lot for everyone's help! Here are the first results of my tests; it's very interesting!

a) ON THE DESTINATION (PHP/PostgreSQL)

1. Preparing the INSERT statements (to Postgres) was already a better idea.
2. Then wrapping everything in BEGIN WORK ... COMMIT improved things even more.
3. At first I didn't realise that I could remove quote escaping thanks to prepare; this improved things a little more.
4. Then I found something very interesting: pg_send_execute (asynchronous)!

Inserted lines: 134297
Required time: 292 seconds ([0] without prepare)
Required time: 253 seconds ([1] with prepare; 13% better)
Required time: 224 seconds ([2] with prepare and BEGIN/COMMIT; 12% better)
Required time: 221 seconds ([3] removed escaping)
Required time: 214 seconds ([4] asynchronous; 4% better)

b) ON THE SOURCE (PHP/ODBC)

5. Believe it or not, changing from PHP ODBC to PHP PDO ODBC
From: http://us2.php.net/manual/en/ref.uodbc.php
To:   http://fr.php.net/manual/en/class.pdostatement.php
...helped a LOT:

Inserted lines: 134297
Required time: 25 seconds ([1]+[2]+[3]+[4]+[5] with PDO)

Hope it will help other people!

Thanks a lot again to everyone who helped me :-)

Denis
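The control flow Denis converged on (one transaction, a prepared statement executed per row, a single commit at the end) can be sketched language-neutrally in Python. The `execute` callable and the `insert_row` prepared-statement name are hypothetical stand-ins for the real driver call (pg_send_execute on the PHP side):

```python
def import_rows(rows, execute):
    # Denis's recipe: open ONE transaction, run the prepared statement for
    # each row, and commit ONCE at the end instead of per insert.
    execute("BEGIN WORK")
    count = 0
    for row in rows:
        execute("EXECUTE insert_row", row)  # hypothetical prepared statement
        count += 1
    execute("COMMIT")
    return count
```

The single surrounding transaction is what removed the per-row commit overhead in measurement [2] above; the prepared statement avoids re-parsing the INSERT each time.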