Re: performance issues for processing more then 150000

Поиск
Список
Период
Сортировка
От Josh Berkus
Тема Re: performance issues for processing more then 150000
Дата
Msg-id 200302231244.06242.josh@agliodbs.com
обсуждение исходный текст
Ответ на Re: performance issues for processing more then 150000  ("Jakab Laszlo" <jakablaszlo@sofasoft.ro>)
Список pgsql-performance
Jakab,

Some simple tips, which is what I thing you were really asking for:

Are you adding the records in a single overnight or downtime load batch?  If
so, the fastest way by far is to:
1) disable all triggers and constraints on the table temporarily, and some or
all of the indexes
2) put all the data into a COPY file (tab-delimited text; see COPY docs)
3) load the data through a COPY statement
4) re-enable the trigger and constraints and re-build the indexes
The degree to which you need to do 1) and 4) depends on how much punishment
your system can take; start out by dropping and rebuilding just the triggers
and move up from there until the load finishes in a satisfactory time.

If the records are being added on a continuous basis and not in a big batch,
then take the following precautions:
1) put as many inserts as possible into transaction batches
2) minimize your use of constraints, triggers, and indexes on the tables being
loaded
3) consdier using a "buffer table" to hold records about to be loaded while
data integrity checks and similar are performed.

> Unfortunatelly the hardware budget should be keept as low as possible.
> I was thinking is there could be reliable solution based on dual processor
> and ATA 133 raid mirroring normally with some gigs of memory.

1 gig of RAM may be plenty.  Your main bottleneck will be your disk channel.
If I were setting up your server, I might do something like:

1) buy a motherboard with 4 ATA controllers.
2) put disks like:
    channel 0: 1 matched pair disks
    channel 1 + 2: 1 matched quartet of disks
    channel 3: single ATA disk
    (for Postgresql, more, smaller disks is almost always better than a few big
ones.) (alternately, set up everythin in one big RAID 5 array with at least 6
disks.  There is argument about which is better)
3) Format the above as a RAID 1 pair on channel 0 and a RAID 1+0 double pair
on channel 1 using Linux software RAID
4) Put Linux OS + swap on channel 0.  Put the database on channel 1+2.  Put
the pg_xlog (the transaction log) on channel 3.  Make sure to use a version
of Linux with kernel 2.4.19 or greater!

That's just one configuration of several possible, of course, but may serve
your purposes admirably and relatively cheaply.

--
Josh Berkus
Aglio Database Solutions
San Francisco

В списке pgsql-performance по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Really bad insert performance: what did I do wrong?
Следующее
От: Josh Berkus
Дата:
Сообщение: Re: slow query