Thread: separate drives for WAL or pgdata files

separate drives for WAL or pgdata files

From
"Anjan Dave"
Date:

Hi,

 

I am not sure if there's an obvious answer to this. Given the choice of an external RAID10 array (Fiber Channel, 6 or 8 15Krpm drives), which is more beneficial to store on it: the WAL or the database files? One or the other would go on the local RAID10 (4 drives, 15Krpm) along with the OS.

 

This is a very busy database with many concurrent connections and random reads and writes. Checkpoint segments are set to 300 and the checkpoint interval is 6 minutes. Database size is less than 50GB.

 

It has become a bit more confusing because I am trying to allot shared storage across several hosts, and want to be careful not to overload one of the 2 storage processors.

 

What should I check/monitor if more information is needed to determine this?

 

Appreciate some suggestions.

 

Thanks,
Anjan

 

 
 

 

Re: separate drives for WAL or pgdata files

From
David Lang
Date:
On Mon, 19 Dec 2005, Anjan Dave wrote:

> I am not sure if there's an obvious answer to this...If there's a choice
> of an external RAID10 (Fiber Channel 6 or 8 15Krpm drives) enabled
> drives, what is more beneficial to store on it, the WAL, or the Database
> files? One or the other would go on the local RAID10 (4 drives, 15Krpm)
> along with the OS.

The WAL is small compared to the data, and its access pattern is mostly
sequential, so it doesn't need many spindles; it just needs them more or
less dedicated to the WAL and not distracted by other I/O.
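
A rough sketch of the arithmetic behind "small compared to the data". It assumes PostgreSQL's default 16 MB WAL segment size and the 8.x-era documented rule that pg_xlog normally holds no more than 2 * checkpoint_segments + 1 segment files; neither number is stated in this thread, and the function name is only illustrative.

```python
# Rough upper bound on pg_xlog size for PostgreSQL 8.x-era settings.
# Assumptions (not from the thread): 16 MB segments (the default build
# setting) and the documented "2 * checkpoint_segments + 1" normal maximum.
SEGMENT_MB = 16


def max_wal_mb(checkpoint_segments: int, segment_mb: int = SEGMENT_MB) -> int:
    """Return the normal maximum WAL footprint in megabytes."""
    return (2 * checkpoint_segments + 1) * segment_mb


wal_mb = max_wal_mb(300)  # checkpoint_segments = 300, as in the question
print(f"{wal_mb} MB (~{wal_mb / 1024:.1f} GB) of WAL vs. a database under 50 GB")
```

Under these assumptions the WAL needs on the order of 10 GB, so capacity is a minor concern for the WAL volume; what matters is keeping its sequential writes undisturbed.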

The data is large (by comparison) and is accessed randomly, so the more
spindles you can throw at it the better.

In your place I would consider making the server's internal drives into
two RAID1 pairs (one for the OS, one for the WAL), and then going with
RAID10 on the external drives for your data.
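
One common way to put the WAL on its own drives with PostgreSQL of this era is to stop the server, move the pg_xlog directory to the dedicated volume, and leave a symlink in its place. A minimal Python sketch of that step follows. The function name and the example paths are hypothetical, and it assumes the server is shut down and the script runs as the OS user that owns the data directory.

```python
import os
import shutil


def relocate_pg_xlog(pgdata: str, wal_mount: str) -> str:
    """Move PGDATA/pg_xlog onto another volume and leave a symlink behind.

    The PostgreSQL server must be stopped before calling this.
    Returns the new location of the WAL directory.
    """
    src = os.path.join(pgdata, "pg_xlog")
    dst = os.path.join(wal_mount, "pg_xlog")
    if os.path.islink(src):
        raise RuntimeError(f"{src} is already a symlink")
    if os.path.exists(dst):
        raise RuntimeError(f"{dst} already exists")
    shutil.move(src, dst)  # copies across filesystems, then removes the source
    os.symlink(dst, src)   # the server then follows the link to the new volume
    return dst
```

For example, `relocate_pg_xlog("/var/lib/pgsql/data", "/mnt/wal")` would leave `/var/lib/pgsql/data/pg_xlog` as a link to `/mnt/wal/pg_xlog` (both paths are illustrative). Check ownership and permissions on the moved directory before restarting the server.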

> This is a very busy database with high concurrent connections, random
> reads and writes. Checkpoint segments are 300 and interval is 6 mins.
> Database size is less than 50GB.

This is getting dangerously close to fitting in RAM. I saw an article
over the weekend saying that Samsung is starting to produce 8G DIMMs
that can go 8 to a controller (instead of the 4 per controller that is
currently done). When motherboards come out that support this, you can
have 64G of RAM per Opteron socket. It will be pricey, but the performance....

In the meantime you can already go 4G/slot * 4 slots/socket and get 64G on
a 4-socket system. It won't be cheap, but the performance will blow away
any disk-based system.

For persistent storage you can replicate from your RAM-based system to a
disk-based system. As long as your replication messages hit disk quickly,
you can allow the disk-based version to lag behind in its updates during
your peak periods (provided it is able to catch up with the writes
overnight). Because the disk-based version won't have to do the seeks for
the reads, it will be considerably faster than if it were doing all the
work (especially if you have good, large battery-backed disk caches to go
with those drives to consolidate the writes).

> It has become a bit more confusing because I am trying to allot shared
> storage across several hosts, and want to be careful not to overload one
> of the 2 storage processors.

There's danger here: if you share spindles with other apps, you run the
risk of slowing down your database significantly. You may be better off
with fewer but dedicated drives rather than more but shared drives.

David Lang


Re: separate drives for WAL or pgdata files

From
David Lang
Date:
On Mon, 19 Dec 2005, David Lang wrote:

> this is getting dangerously close to being able to fit in ram. I saw an
> article over the weekend that Samsung is starting to produce 8G DIMMs, that
> can go 8 to a controller (instead of 4 per as is currently done), when
> motherboards come out that support this you can have 64G of ram per opteron
> socket. it will be pricey, but the performance....

A message on another mailing list got me thinking: there is the Horus
project, which is aiming to put together 16-socket Opteron systems within a
year (they claim sooner, but I'm being pessimistic ;-). Combine this with
these 8G DIMMs and you can have a SINGLE system with 1TB of RAM in it
(right at the limit of the Opteron's 40-bit external memory addressing).

_wow_
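
The 1TB figure can be checked with a few lines of Python. This is only a sketch of the arithmetic in the message above; it assumes "8G" means 8 GiB per DIMM and 8 DIMMs per socket, and the function name is illustrative.

```python
# Memory-capacity arithmetic for the hypothetical 16-socket system.
# Assumption: "G" is read as GiB (2**30 bytes).
GIB = 2**30


def total_ram_bytes(sockets: int, dimms_per_socket: int, dimm_gib: int) -> int:
    """Total installed RAM in bytes for a fully populated system."""
    return sockets * dimms_per_socket * dimm_gib * GIB


per_socket = total_ram_bytes(1, 8, 8)    # 8 DIMMs of 8G on one socket
full_system = total_ram_bytes(16, 8, 8)  # 16 such sockets

print(per_socket // GIB, "GiB per socket")
print(full_system // GIB, "GiB in total")
print("exactly fills a 40-bit address space:", full_system == 2**40)
```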

And the thing is that it won't take much change in the software stack to
deal with this.

Linux is already running on machines with 1TB of RAM (and 512 CPUs), so it
will run very well. Postgres probably needs some attention to its locks,
but it is getting that attention now (and it will get more with the Sun
Niagara chips being able to run 8 processes simultaneously).

Just think of the possibilities (if you have the money to afford the super
machine :-)

David Lang


Re: separate drives for WAL or pgdata files

From
"Jim C. Nasby"
Date:
On Mon, Dec 19, 2005 at 07:20:56PM -0800, David Lang wrote:
> for persistent storage you can replicate from your ram-based system to a
> disk-based system, and as long as your replication messages hit disk
> quickly you can allow the disk-based version to lag behind in its updates
> during your peak periods (as long as it is able to catch up with the
> writes overnight), and as the disk-based version won't have to do the
> seeks for the reads it will be considerably faster than if it was doing
> all the work (especially if you have good, large battery-backed disk
> caches to go with those drives to consolidate the writes)

Huh? Unless you're doing a hell of a lot of writing, just run a normal
instance and make sure you have enough bandwidth to the drives with
pg_xlog on them. Make sure those drives are behind a battery-backed RAID
controller too. You'll also need to tune things so that checkpoints never
have much (if any) work to do when they occur, but you should be able to
set that up with proper bgwriter tuning.
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461