Обсуждение: postgresql multiple insert slow
Hello,
I've got a table in an oracle database with approx. 100000 records, that
I'd like to put into a table in a postgresql database. (This should be
done a couple of times per week)
I have written a short perl script, on a server that has remote access
to both the oracle database as well as the postgresql database. I am
running postgresql 7.4.1 on FreeBSD.
My perl script looks something like this:
[...]
my $sth2 = $cnx2->prepare('SELECT * FROM oracle_table');
my $res2 = $sth2->execute();
while(my($field2,$field5,$field6) = ($sth2->fetchrow_array)) {if(defined($field2)) { my $sth = $cnx->prepare('INSERT
INTO
the_pg_table(field1, field2) VALUES(?,?)'); my $result = $sth->execute($field2,$field5); $sth->finish;
}
}
[...]
I runs fine - and I get no errors - but it takes almost 25 minutes to
complete.. I tried running the script while just grabbing the rows from
the oracle database and writing to a text file - and then it only takes
a couple of minutes .. So it must be the INSERT command that chokes - is
there a better way to do it ?
Any advise much appreciated.
/mich
--
Best Regards,Michael L. Hostbaek
*/ PGP-key available upon request /*
"Michael L. Hostbaek" <mich@freebsdcluster.org> writes: > I runs fine - and I get no errors - but it takes almost 25 minutes to > complete.. I tried running the script while just grabbing the rows from > the oracle database and writing to a text file - and then it only takes > a couple of minutes .. So it must be the INSERT command that chokes - is > there a better way to do it ? You probably want to add BEGIN/COMMIT operations around the loop. See the documentation's tips on bulk data loading: http://www.postgresql.org/docs/7.4/static/populate.html regards, tom lane
Centuries ago, Nostradamus foresaw when mich@freebsdcluster.org ("Michael L. Hostbaek") would write:
> I runs fine - and I get no errors - but it takes almost 25 minutes to
> complete.. I tried running the script while just grabbing the rows from
> the oracle database and writing to a text file - and then it only takes
> a couple of minutes .. So it must be the INSERT command that chokes - is
> there a better way to do it ?
>
> Any advise much appreciated.
The problem here is that each insert implicitly sets up a transaction
BEGIN and COMMIT.
If you were to wrap the loop with a BEGIN at the start and a COMMIT at
the end, you would likely find it would complete much, much faster.
--
output = ("cbbrowne" "@" "acm.org")
http://www3.sympatico.ca/cbbrowne/wp.html
"Java and C++ make you think that the new ideas are like the old ones.
Java is the most distressing thing to hit computing since MS-DOS."
-- Alan Kay
In article <20040219163238.GD10913@mich2.itxmarket.com>,
"Michael L. Hostbaek" <mich@freebsdcluster.org> writes:
> Hello,
> I've got a table in an oracle database with approx. 100000 records, that
> I'd like to put into a table in a postgresql database. (This should be
> done a couple of times per week)
> I have written a short perl script, on a server that has remote access
> to both the oracle database as well as the postgresql database. I am
> running postgresql 7.4.1 on FreeBSD.
> My perl script looks something like this:
> [...]
> my $sth2 = $cnx2->prepare('SELECT * FROM oracle_table');
> my $res2 = $sth2->execute();
> while(my($field2,$field5,$field6) = ($sth2->fetchrow_array)) {
> if(defined($field2)) {
> my $sth = $cnx->prepare('INSERT INTO
> the_pg_table(field1, field2) VALUES(?,?)');
> my $result = $sth->execute($field2,$field5);
> $sth->finish;
> }
> }
> [...]
> I runs fine - and I get no errors - but it takes almost 25 minutes to
> complete.. I tried running the script while just grabbing the rows from
> the oracle database and writing to a text file - and then it only takes
> a couple of minutes .. So it must be the INSERT command that chokes - is
> there a better way to do it ?
First of all, you should prepare the insert statement only once,
outside of the loop. Then you could use fetchrow_arrarref instead of
fetchrow_array; this should eliminate a copy operation.
But the biggest win would be not to use INSERT at all. Instruct
Oracle to dump the rows into a CSV file, and then do just
$cnx->do ("COPY the_pg_table FROM 'csv.file'")