Re: How can I make use of both CPUs in a dual processor

From: Alex
Subject: Re: How can I make use of both CPUs in a dual processor
Date:
Msg-id: 420B5F22.2070407@meerkatsoft.com
In reply to: Re: How can I make use of both CPUs in a dual processor  (John A Meinel <john@arbash-meinel.com>)
List: pgsql-performance
Thanks for all the suggestions. It seems that creating the indexes, or even
importing the data using COPY, is easy to implement. I also have some jobs
that create reports, and I want to see whether I gain anything by running
the reports in parallel.

I will give it a try next week and let you know the results.
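On the question of waiting for the child processes: the shell's `wait` builtin with no arguments blocks until every background job has exited, so the whole script can be timed as one unit. A quick self-contained sketch of the mechanism (the sleeps stand in for the psql calls, since a live database isn't needed to show it):

```shell
#!/bin/sh
# Demonstration of the shell's `wait` builtin, which blocks until every
# background job has exited. Each `sleep 1` stands in for one of the
# "psql -c 'CREATE INDEX ...'" calls in the script quoted below.
start=$(date +%s)
sleep 1 &
sleep 1 &
sleep 1 &
wait                      # returns only after all three jobs finish
end=$(date +%s)
elapsed=$((end - start))
echo "elapsed: ${elapsed}s"   # roughly 1s, not 3s: the jobs ran in parallel
```

With a final `wait` at the end of loadscript.sh, `time loadscript.sh` would measure the full wall-clock cost including the background CREATE INDEX sessions.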

Alex

John A Meinel wrote:

> Alex wrote:
>
>> Thanks John.
>>
>> Well as I mentioned. I have a  Dual AMD Opteron 64 2.4ghz, 15k rpm
>> SCSI Disks, 4GB of memory.
>> Disks are pretty fast and memory should be more than enough.
>> Currently we dont have many concurrent connections.
>>
> Well, you didn't mention Opteron before (it makes a difference against
> Xeons).
> How many disks and in what configuration?
> Do you have pg_xlog on a separate set of disks?
> Are your drives in RAID 10 (0+1) or RAID 5?
>
> If you have enough disks the recommended configuration is at least a
> RAID1 for the OS, RAID 10 for pg_xlog (4 drives), and RAID 10 (the rest
> of the drives) for the actual data.
>
> If your dataset is read heavy, or you have more than 6 disks, you can
> get away with RAID 5 for the actual data. But since you are talking
> about loading 8million rows at once, it certainly sounds like you are
> write heavy.
>
> If you only have a few disks, it's still probably better to put
> pg_xlog on its own RAID1 (2-drive) mirror. pg_xlog is pretty much
> append only, so if you dedicate a disk set to it, you eliminate a lot
> of seek times.
>
>> I run PG 8.0.1 on Fedora Core 3
>>
>> When I now run the batch job, one CPU runs in the 80-90% the other in
>> 5-10% max.
>
> Anyway, it doesn't completely sound like you are CPU limited, but you
> might be able to get a little bit more if you spawn another process.
> Have you tried dropping the index, doing the copy, and then recreating
> the 4 indexes in separate processes?
>
> The simple test for this is to open 3-4 psql connections, have one of
> them drop the indexes and do the copy, in the other connections you
> can already have typed "CREATE INDEX ..." so when the copy is done and
> committed to the database, you just go to the other terminals and hit
> enter.
>
> Unfortunately you'll have to use wall clock time to see if this is
> faster.
>
> Though I think you could do the same thing with a bash script. The
> authentication should be in "trust" mode so that you don't take the
> time to type your password.
>
> #!/bin/bash
> psql -h <host> -c "DROP INDEX ...; COPY FROM ..."
>
> psql -h <host> -c "CREATE INDEX ..." &
> psql -h <host> -c "CREATE INDEX ..." &
> psql -h <host> -c "CREATE INDEX ..." &
> psql -h <host> -c "CREATE INDEX ..."
>
>
> Now, I don't really know how to wait for all child processes in a bash
> script (I could give you the python for it, but you're a perl guy).
> But by not spawning the last INDEX, I'm hoping it takes longer than
> the rest. Try to put the most difficult index there.
>
> Then you could just run
>
> time loadscript.sh
>
> I'm sure you could do the equivalent in perl. Just open multiple
> connections to the DB, and have them ready.
>
> I'm guessing since you are on a dual processor machine, you won't get
> much better performance above 2 connections.
>
> You can also try doing 2 COPYs at the same time, but it seems like you
> would have issues. Do you have any serial columns that you expect to
> be in a certain order, or is all the information in the copy?
>
> If the latter, try it, let us know what you get. I can't tell you the
> perl for this, since I'm not a perl guy.
>
> John
> =:->
>
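As for running two COPYs at once: the input would have to be split in half first, which the coreutils `split` command can do. A rough sketch with placeholder file and table names (the psql lines are left as comments since they need a live database; a generated sample file shows the mechanics):

```shell
#!/bin/sh
# Generate a small sample file standing in for the real multi-million-row dump.
seq 1 1000 > data.csv

# Split it into two equal halves: split's default suffixes give part_aa and part_ab.
total=$(wc -l < data.csv)
half=$(( (total + 1) / 2 ))
split -l "$half" data.csv part_

# Each half would then be loaded in its own parallel session, e.g.:
#   psql -h myhost mydb -c "\copy mytable FROM 'part_aa' WITH CSV" &
#   psql -h myhost mydb -c "\copy mytable FROM 'part_ab' WITH CSV" &
#   wait
wc -l part_aa part_ab
```

As John notes, this only helps when nothing depends on the rows arriving in a particular order (e.g. no serial column expected to follow the file order).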


