Discussion: Fast load

Fast load

From:
"Ted Rolle"
Date:
We have 73 databases, two dozen with hundreds of thousands to millions of
records, with lengths in the 500-byte range.  I'm planning to convert them
from Btrieve to PostgreSQL.

Of course, I want the highest reasonable speed so that the conversion can be
completed - say - in a week-end.

My initial take is to create a tab-delimited file and use this as input to
the \copy command in psql.

Another option might be to feed them directly to the back end.

I'm a C programmer, so writing the necessary programs for the conversion and
load is not a problem.

Any pointers?

Not-wanting-to-go-down-a-garden-pathingly,

Ted


Re: Fast load

From:
Jeremy Buchmann
Date:
> We have 73 databases, two dozen with hundreds of thousands to millions of
> records, with lengths in the 500-byte range.  I'm planning to convert them
> from Btrieve to PostgreSQL.
>
> Of course, I want the highest reasonable speed so that the conversion can be
> completed - say - in a week-end.
>
> My initial take is to create a tab-delimited file and use this as input to
> the \copy command in psql.
>
> Another option might be to feed them directly to the back end.
>
> I'm a C programmer, so writing the necessary programs for the conversion and
> load is not a problem.
>
> Any pointers?

Use \copy.  It's very fast at importing tab-delimited text files.  Just make
sure that what you export from the old database is very "clean", meaning no
quotes around text fields, no tabs within a field, that sort of thing.  73
is a lot of databases, but it will probably take you more time to actually
type the \copy commands than it will for postgres to copy the data.
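
As a minimal sketch (the table and file names here are hypothetical, not
from the original mail), the whole load for one table can look like this in
psql:

    -- example table and file names; create the bare target table,
    -- then load it client-side with \copy
    CREATE TABLE customers (
        id      integer,
        name    text,
        balance numeric
    );
    \copy customers from 'customers.tab'

\copy reads the file on the client machine and expects tab-separated
columns by default, so a clean export like the one described above should
load as-is.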

-- Jeremy  [jeremy@wellsgaming.com]



RE: Fast load

From:
"Ted Rolle"
Date:
Jeremy: Thanks for the quick reply.

That's REALLY encouraging!  We're converting them from Btrieve 6.0 to
Btrieve 7.0, and it's a really slow process for the big ones.

One of the larger Btrieve files blew up the conversion program (from the
vendor, no less).  The file has 1,296,418 446-byte records.
It's been running over two hours and it's at 916,200 records.  Every so
often it stops to take a big, long rest.  It stops for about 10 minutes and
then merrily goes on its way for a while longer.  I'm surmising that this is
kswapd doing its thing.



Re: Fast load

From:
Peter Eisentraut
Date:
Ted Rolle writes:

> We have 73 databases, two dozen with hundreds of thousands to millions of
> records, with lengths in the 500-byte range.  I'm planning to convert them
> from Btrieve to PostgreSQL.
>
> Of course, I want the highest reasonable speed so that the conversion can be
> completed - say - in a week-end.

The fastest possible way to get data loaded into PostgreSQL is to create a
tab-delimited file and feed it directly to the backend with the COPY
command.  To speed things up even more, turn off fsync (-F), create the
indexes after loading, and the same with triggers, if you have any.  I'd
like to think that all of this should take significantly less than a
weekend. ;-)
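
A rough sketch of that sequence, with hypothetical table and file names
(the -F switch Peter mentions is passed to the backends at postmaster
startup and is not shown here):

    -- example names; load the bare table first, with no indexes or
    -- triggers defined yet
    CREATE TABLE customers (
        id      integer,
        name    text,
        balance numeric
    );
    COPY customers FROM '/tmp/customers.tab';

    -- build indexes only after the bulk load has finished
    CREATE INDEX customers_name_idx ON customers (name);

Note that this server-side form of COPY reads the file on the database
server and has to be run as a database superuser; the psql \copy form reads
the file on the client instead and avoids that restriction.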

Formatting the data into the right format for COPY can be done with your
favourite text mashing tools.
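
For reference, each line of COPY input is simply the column values
separated by literal tab characters and terminated by a newline, with \N
standing for a NULL, e.g. (made-up values, <TAB> denoting a tab character):

    42<TAB>Acme Ltd<TAB>1997-05-01<TAB>\N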

--
Peter Eisentraut   peter_e@gmx.net   http://funkturm.homeip.net/~peter