Re: Importing Large Amounts of Data

From: Curt Sampson
Subject: Re: Importing Large Amounts of Data
Date: Tue, 16 Apr 2002
Msg-id: Pine.NEB.4.43.0204161017210.498-100000@angelic.cynic.net
In reply to: Re: Importing Large Amounts of Data  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: Importing Large Amounts of Data  (Bruce Momjian <pgman@candle.pha.pa.us>)
           Re: Importing Large Amounts of Data  (Gavin Sherry <swm@linuxworld.com.au>)
List: pgsql-hackers
On Mon, 15 Apr 2002, Tom Lane wrote:

> > I'm not looking for "runs a bit faster;" five percent either way
> > makes little difference to me. I'm looking for a five-fold performance
> > increase.
>
> You are not going to get it from this; where in the world did you get
> the notion that data integrity costs that much?

Um...the fact that MySQL imports the same data five times as fast? :-)

Note that this is *only* about bulk-importing huge amounts of
data. Postgres seems a little bit slower than MySQL at building
the indexes afterwards, but that's to be expected, since (probably
due to higher tuple overhead) the data, once in Postgres, is about
75% larger than in MySQL: 742 MB vs. 420 MB. I've not done any
serious testing of query speed, but the bit of toying I've done
with it shows no major difference.
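Incidentally, a rough way to check a table's on-disk size from SQL,
if you want to reproduce this sort of comparison (the table name is
made up; relpages assumes the default 8 KB page size and is only as
fresh as the last VACUUM or ANALYZE):

    SELECT relname,
           relpages,                    -- number of 8 KB pages
           relpages * 8192 AS bytes     -- approximate on-disk size
    FROM pg_class
    WHERE relname = 'big_table';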

> Have you tried all the usual speedup hacks?  Turn off fsync, if you
> really think you do not care about crash integrity; use COPY FROM STDIN
> to bulk-load data, not retail INSERTs; possibly drop and recreate
> indexes rather than updating them piecemeal; etc.  You should also
> consider not declaring foreign keys, as the runtime checks for reference
> validity are pretty expensive.

Yes, I did all of the above. (This was all mentioned in my initial
message, except for turning off foreign key constraints--but the
table has no foreign keys.)
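For concreteness, the load boiled down to something like this (the
table, column, and file names here are made up, and fsync was turned
off in postgresql.conf for the duration of the run):

    -- Schema first; all names are placeholders.
    CREATE TABLE big_table (
        id      integer,
        payload text
    );

    -- Bulk-load with a single COPY rather than row-at-a-time INSERTs:
    COPY big_table FROM '/tmp/big_table.dat';

    -- Build indexes only after the data is in, rather than updating
    -- them piecemeal during the load:
    CREATE INDEX big_table_id_idx ON big_table (id);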

What I think would be really cool would be an "offline" way of
creating tables using a stand-alone program that would write the
files at, one hopes, near disk speed. Maybe it could work by
creating the tables in a detached tablespace, which you'd then
attach when you're done. It might even be extended to do foreign
key checks, create indexes, and so on. (Foreign key checks would
be useful; I'm not sure that creating indexes would be any faster
than just doing it after the tablespace is attached.)
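To make the idea a bit more concrete, the workflow might look
roughly like this. None of the syntax below exists; the tool name
and the ATTACH statement are pure invention:

    -- Step 1 (offline, no postmaster running): a hypothetical
    -- stand-alone program writes finished heap files straight into
    -- a detached tablespace at, one hopes, near disk speed:
    --
    --     pg_offline_build --tablespace /staging/ts1 \
    --                      --schema big_table.sql --data big_table.dat
    --
    -- Step 2 (online): attach the finished tablespace so its tables
    -- become visible to the database (again, invented syntax):
    ATTACH TABLESPACE staging_ts;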

This would be particularly useful for fast restores of backups.
Downtime while doing a restore is always a huge pain for large
databases.

cjs
-- 
Curt Sampson  <cjs@cynic.net>   +81 90 7737 2974   http://www.netbsd.org
    Don't you know, in this new Dark Age, we're all light.  --XTC