Обсуждение: equivalent of sqlload?

Поиск
Список
Период
Сортировка

equivalent of sqlload?

От
John Holland
Дата:
Oracle has a facility for loading in massive data files, sqlload I think
is the name. I'd like to load some big data into PostgreSQL 6.4 for
testing purposes. Is there a way built in or would a perl or c hack be the
way to go?

john holland


Re: [GENERAL] equivalent of sqlload?

От
Herouth Maoz
Дата:
At 16:45 +0200 on 25/11/98, John Holland wrote:


> Oracle has a facility for loading in massive data files, sqlload I think
> is the name. I'd like to load some big data into PostgreSQL 6.4 for
> testing purposes. Is there a way built in or would a perl or c hack be the
> way to go?

Use the COPY command. It is preferable not to define indices (or to drop
the indices) on the table, then do the massive load with COPY, and then
define the indices.

Herouth

--
Herouth Maoz, Internet developer.
Open University of Israel - Telem project
http://telem.openu.ac.il/~herutma



Re: [GENERAL] equivalent of sqlload?

От
Sferacarta Software
Дата:
Hello John,

mercoledì, 25 novembre 98, you wrote:

JH> Oracle has a facility for loading in massive data files, sqlload I think
JH> is the name. I'd like to load some big data into PostgreSQL 6.4 for
JH> testing purposes. Is there a way built in or would a perl or c hack be the
JH> way to go?

JH> john holland

use:
       copy [binary] classname [with oids]
            to|from 'filename'|stdin|stdout
            [using delimiters 'delim']


or:
            \copy classname from filename


-Jose'-



Re: [GENERAL] equivalent of sqlload?

От
"Michael A. Koerber"
Дата:
I am running v6.3.2 under Linux and have found that the "copy" command
works only for small amounts of data.  When trying to "copy" several
thousand records I notice that system RAM and swap space continue to get
eaten until there is no further memory available.  "psql" then fails.
What remains is a .../pgdata/base/XYZ file system with the table being
copied into.  That table may be several (tens, hundreds) of Meg in size,
but a "psql -d XYS -c 'select count(*) table'" will only return a zero
count.

Now...I have broken the "copy" process down into smaller chunks and make
multiple calls to "copy".  I have a total of about 5.4 million records and
the job isn't done yet...my Pentium 433 has been working on this copy for
over 24 hours.

I don't know if there are any changes that can be made to speed this type
of process up, but this is definitely a black-mark.

mike

Dr Michael A. Koerber
MIT/Lincoln Laboratory
781-981-3250

On Wed, 25 Nov 1998, Herouth Maoz wrote:

> At 16:45 +0200 on 25/11/98, John Holland wrote:
>
>
> > Oracle has a facility for loading in massive data files, sqlload I think
> > is the name. I'd like to load some big data into PostgreSQL 6.4 for
> > testing purposes. Is there a way built in or would a perl or c hack be the
> > way to go?
>
> Use the COPY command. It is preferable not to define indices (or to drop
> the indices) on the table, then do the massive load with COPY, and then
> define the indices.
>
> Herouth
>
> --
> Herouth Maoz, Internet developer.
> Open University of Israel - Telem project
> http://telem.openu.ac.il/~herutma
>
>
>


Re: [GENERAL] equivalent of sqlload?

От
Herouth Maoz
Дата:
At 21:01 +0200 on 25/11/98, Michael A. Koerber wrote:


>
> Now...I have broken the "copy" process down into smaller chunks and make
> multiple calls to "copy".  I have a total of about 5.4 million records and
> the job isn't done yet...my Pentium 433 has been working on this copy for
> over 24 hours.

Try putting all the separate COPYs in one transaction.

BEGIN TRANSACTION;
COPY...
COPY...
COPY...
END;

Without the surrounding transaction, each copy is a transaction and
requires transaction overhead. I don't know how much time this will save,
though. Are you sure you dropped all the indices? When you declare a
primary key it declares a unique index, so watch out for that as well.

Herouth

--
Herouth Maoz, Internet developer.
Open University of Israel - Telem project
http://telem.openu.ac.il/~herutma



Re: [GENERAL] equivalent of sqlload?

От
Anton de Wet
Дата:
> > Now...I have broken the "copy" process down into smaller chunks and make
> > multiple calls to "copy".  I have a total of about 5.4 million records and
> > the job isn't done yet...my Pentium 433 has been working on this copy for
> > over 24 hours.


I have seen pretty much the same effect, to test this I set up script that
copied in 10000 records at a time. The time it took for each copy
increased linearly (starting at about 2.0 seconds on my PII-233) by approx
0.5 seconds per 10000. That means that by the time you have 5 million
records it takes more than 4 minutes to insert 10000 records.

Yes, I checked, there were NO indices defined on the table.

Anton