Re: Improve COPY performance for large data sets

Поиск
Список
Период
Сортировка
От Bill Moran
Тема Re: Improve COPY performance for large data sets
Дата
Msg-id 20080910131606.cc18bed5.wmoran@collaborativefusion.com
обсуждение исходный текст
Ответ на Improve COPY performance for large data sets  (Ryan Hansen <ryan.hansen@brightbuilders.com>)
Ответы Re: Improve COPY performance for large data sets  (Dimitri Fontaine <dfontaine@hi-media.com>)
Re: Improve COPY performance for large data sets  ("Scott Marlowe" <scott.marlowe@gmail.com>)
Список pgsql-performance
In response to Ryan Hansen <ryan.hansen@brightbuilders.com>:
>
> I'm relatively new to PostgreSQL but I've been in the IT applications
> industry for a long time, mostly in the LAMP world.
>
> One thing I'm experiencing some trouble with is running a COPY of a
> large file (20+ million records) into a table in a reasonable amount of
> time.  Currently it's taking about 12 hours to complete on a 64 bit
> server with 3 GB memory allocated (shared_buffer), single SATA 320 GB
> drive.  I don't seem to get any improvement running the same operation
> on a dual opteron dual-core, 16 GB server.
>
> I'm not asking for someone to solve my problem, just some direction in
> the best ways to tune for faster bulk loading, since this will be a
> fairly regular operation for our application (assuming it can work this
> way).  I've toyed with the maintenance_work_mem and some of the other
> params, but it's still way slower than it seems like it should be.
> So any contributions are much appreciated.

There's a program called pgloader which supposedly is faster than copy.
I've not used it so I can't say definitively how much faster it is.

A single 320G drive isn't going to get you much on speed.  How many
RPM?  Watch iostat on your platform to see if you're saturating the
drive, if you are, the only way you're going to get it faster is to
add more disks in a RAID-10 or similar, or somehow get a faster disk.

You always have the option to turn off fsync, but be sure you understand
the consequences of doing that and have an appropriate failure plan
before doing so.

--
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

wmoran@collaborativefusion.com
Phone: 412-422-3463x4023

В списке pgsql-performance по дате отправления:

Предыдущее
От: Alan Hodgson
Дата:
Сообщение: Re: Improve COPY performance for large data sets
Следующее
От: Dimitri Fontaine
Дата:
Сообщение: Re: Improve COPY performance for large data sets