Thread: Insert slow down on empty database


Insert slow down on empty database

From
"Morgan"
Date:
Hi,

I am having a problem with inserting a large amount of data with my libpqxx
program into an initially empty database. It appears to be the EXACT same
problem discussed here:

http://archives.postgresql.org/pgsql-bugs/2005-03/msg00183.php

In fact my situation is nearly identical, with roughly 5 major tables, with
foreign keys between each other. All the tables are being loaded
simultaneously, with about 2-3 million rows each. It seems that the problem
is caused by the fact that I am using prepared statements, which cause the
query planner to choose sequential scans for the foreign key checks due to
the table being initially empty. As with the post above, if I drop my
connection after about 4000 inserts and reestablish it, the inserts speed up
by a couple of orders of magnitude and remain relatively constant through
the whole insertion.
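
A minimal sketch of the reconnect workaround described above, written against
a hypothetical `Session` interface rather than libpqxx's actual API. The
names `Session`, `SessionFactory`, `RecordingSession`, `bulk_load` and the
`warmup_rows` parameter are all assumptions made for illustration. The idea
is only that statements prepared on the second session are planned after the
tables already hold some rows:

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a database session that owns its prepared
// statements. In a real program a libpqxx connection would play this role;
// nothing here is libpqxx's API.
struct Session {
    virtual ~Session() = default;
    virtual void insert_row(const std::string& row) = 0;
};

using SessionFactory = std::function<std::unique_ptr<Session>()>;

// Demo stub (assumption): records which session number handled each row.
struct RecordingSession : Session {
    RecordingSession(int id, std::vector<int>& log) : id_(id), log_(log) {}
    void insert_row(const std::string&) override { log_.push_back(id_); }
    int id_;
    std::vector<int>& log_;
};

// Inserts every row, replacing the session once after `warmup_rows` inserts
// so that statements prepared on the new session are planned against
// non-empty tables. Returns the number of sessions opened.
std::size_t bulk_load(const std::vector<std::string>& rows,
                      const SessionFactory& open_session,
                      std::size_t warmup_rows = 4000) {
    if (rows.empty()) return 0;

    std::size_t sessions_opened = 1;
    std::unique_ptr<Session> session = open_session();

    for (std::size_t i = 0; i < rows.size(); ++i) {
        if (warmup_rows > 0 && i == warmup_rows) {
            session.reset();            // close the session planned on empty tables
            session = open_session();   // fresh session, fresh prepared statements
            ++sessions_opened;
        }
        session->insert_row(rows[i]);
    }
    return sessions_opened;
}
```

With a real driver, the factory would open a connection and prepare the
insert statements; the value of 4000 is simply the figure from my own runs,
not a tuned constant.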

At first I was using straight insert statements, and although they were a bit
slower than the prepared statements (after the reestablished connection) they
never ran into this problem with the database being initially empty. I only
changed to the prepared statements because it was suggested in the
documentation for advice on bulk data loads =).

I can work around this problem, and I am sure somebody is working on fixing
this, but I thought it might be good to reaffirm the problem.

Thanks,
Morgan Kita



Re: Insert slow down on empty database

From
Christopher Browne
Date:
In an attempt to throw the authorities off his trail, "Morgan" <mkita@verseon.com> transmitted:
> At first I was using straight insert statements, and although they
> were a bit slower than the prepared statements (after the reestablished
> connection) they never ran into this problem with the database being
> initially empty. I only changed to the prepared statements because
> it was suggested in the documentation for advice on bulk data loads
> =).

I remember encountering this with Oracle, and the answer being "do
some loading 'til it slows down, then update statistics and restart."

I don't know that there's an obvious alternative outside of perhaps
some variation on pg_autovacuum...
--
If this was helpful, <http://svcs.affero.net/rm.php?r=cbbrowne> rate me
http://linuxdatabases.info/info/spreadsheets.html
So long and thanks for all the fish.