Discussion: 8xIntel S3500 SSD in RAID10 on Dell H710p


8xIntel S3500 SSD in RAID10 on Dell H710p

From:
Strahinja Kustudić
Date:
I have a beast of a Dell server with the following specifications:
  • 4x Xeon E5-4657LV2 (48 cores total)
  • 196GB RAM
  • 2x SCSI 900GB in RAID1 (for the OS)
  • 8x Intel S3500 SSD 240GB in RAID10
  • H710p RAID controller, 1GB cache
CentOS 6.6; the RAID10 SSD array uses XFS (mkfs.xfs -i size=512 /dev/sdb).

Here are some relevant postgresql.conf settings:
shared_buffers = 8GB
work_mem = 64MB
maintenance_work_mem = 1GB
synchronous_commit = off
checkpoint_segments = 256
checkpoint_timeout = 10min
checkpoint_completion_target = 0.9
seq_page_cost = 1.0
effective_cache_size = 100GB

I ran some "fast" pgbench tests with 4, 6 and 8 drives in RAID10 and here are the results:

time /usr/pgsql-9.1/bin/pgbench -U postgres -i -s 12000 pgbench # 292GB DB

4 drives    6 drives    8 drives
105 min     98 min      94 min

/usr/pgsql-9.1/bin/pgbench -U postgres -c 96 -T 600 -N pgbench   # Write test

4 drives    6 drives    8 drives
6567 tps    7427 tps    8073 tps

/usr/pgsql-9.1/bin/pgbench -U postgres -c 96 -T 600 pgbench  # Read/Write test

4 drives    6 drives    8 drives
3651 tps    5474 tps    7203 tps

/usr/pgsql-9.1/bin/pgbench -U postgres -c 96 -T 600 -S pgbench  # Read test

4 drives    6 drives    8 drives
17628 tps   25482 tps   28698 tps


A few notes:
  • I ran these tests only once, so take these numbers with a grain of salt. I didn't have time to run them multiple times, because I had to test how the server works with our app, and running everything takes a considerable amount of time.
  • I wanted to use a bigger scale factor, but there is a bug in pgbench with big scale factors.
  • Postgres 9.1 was chosen, since the app which will run on this server uses 9.1.
  • These tests are with the H710p controller set to write-back (WB) and with adaptive read ahead (ADRA). I ran a few tests with write-through (WT) and no read ahead (NORA), but the results were worse.
  • All tests were run using 96 clients, as recommended on the pgbench wiki page, but I'm sure I would get better results with 48 clients (one per core): I tried that with the R/W test and got 7986 tps on 8 drives, which is almost 800 tps better than with 96 clients.

Since our app depends heavily on Postgres performance, I'm currently trying to optimize it. Do you have any suggestions for Postgres/system settings I could tweak to increase performance? I have a feeling I could get more out of this system.


Regards,
Strahinja

Re: 8xIntel S3500 SSD in RAID10 on Dell H710p

From:
Mark Kirkwood
Date:
On 10/12/14 12:28, Strahinja Kustudić wrote:

>   * These tests are with the H710p controller set to write-back (WB) and
>     with adaptive read ahead (ADRA). I ran a few tests with
>     write-through (WT) and no read ahead (NORA), but the results were worse.

That is interesting: I've done some testing on this type of card with 16
(slightly faster Hitachi) SSDs attached. Setting WT and NORA should
enable the so-called 'fastpath' mode for the card [1]. We saw
performance improve markedly (random writes went from 300MB/s to 1300MB/s).

This *might* be related to the fact that 16 SSDs can put out more IOPS
than the card can actually handle, whereas your 8 S3500s are probably the
perfect number (e.g. 8 * 11000 = 88000 IOPS, which the card can handle OK).


[1] If you make the change while there are no outstanding background
operations (array rebuild etc) in progress (see
http://www.flagshiptech.com/eBay/Dell/poweredgeh310h710h810UsersGuide.pdf).

Cheers

Mark


Re: 8xIntel S3500 SSD in RAID10 on Dell H710p

From:
Strahinja Kustudić
Date:
On Wed, Dec 10, 2014 at 4:55 AM, Mark Kirkwood <mark.kirkwood@catalyst.net.nz> wrote:
That is interesting: I've done some testing on this type of card with 16 (slightly faster Hitachi) SSD attached. Setting WT and NORA should enable the so-called 'fastpath' mode for the card [1]. We saw performance improve markedly (300MB/s random write go to 1300MB/s).

This *might* be related to the fact that 16 SSD can put out more IOPS than the card can actually handle - whereas your 8 S3500 is probably the perfect number (e.g 8*11000 = 88000 which the card can handle ok).


[1] If you make the change while there are no outstanding background operations (array rebuild etc) in progress (see http://www.flagshiptech.com/eBay/Dell/poweredgeh310h710h810UsersGuide.pdf).

I read that guide too, which is the reason I tried WT/NORA, but the document also states: "NOTE: RAID 10, RAID 50, and RAID 60 virtual disks cannot use FastPath." That is a little odd, since usually if you want performance with reliability, you go RAID10.

Do you have any suggestions what I could try to tweak to get more performance?

Re: 8xIntel S3500 SSD in RAID10 on Dell H710p

From:
Mark Kirkwood
Date:
On 10/12/14 21:30, Strahinja Kustudić wrote:
> On Wed, Dec 10, 2014 at 4:55 AM, Mark Kirkwood <
> mark.kirkwood@catalyst.net.nz> wrote:
>
>> That is interesting: I've done some testing on this type of card with 16
>> (slightly faster Hitachi) SSD attached. Setting WT and NORA should enable
>> the so-called 'fastpath' mode for the card [1]. We saw performance improve
>> markedly (300MB/s random write go to 1300MB/s).
>>
>> This *might* be related to the fact that 16 SSD can put out more IOPS than
>> the card can actually handle - whereas your 8 S3500 is probably the perfect
>> number (e.g 8*11000 = 88000 which the card can handle ok).
>>
>>
>> [1] If you make the change while there are no outstanding background
>> operations (array rebuild etc) in progress (see
>> http://www.flagshiptech.com/eBay/Dell/poweredgeh310h710h810UsersGuide.pdf
>> ).
>
>
> I read that guide too, which is the reason why I tried with WT/NORA, but
> the document also states:  "NOTE: RAID 10, RAID 50, and RAID 60 virtual
> disks cannot use FastPath." Which is a little odd, since usually if you
> want performance with reliability, you go RAID10.
>
> Do you have any suggestions what I could try to tweak to get more
> performance?
>

We are using these configured as *individual* single-drive RAID0 virtual
disks that are then assembled into a software (md) RAID10 array. Maybe try
that out (fastpath only cares about the HW RAID setup).
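
For reference, a minimal sketch of that layout (device names are placeholders; it assumes the eight SSDs enumerate as /dev/sdc through /dev/sdj once each is exported as a single-drive RAID0 virtual disk):

mdadm --create /dev/md0 --level=10 --raid-devices=8 /dev/sd[c-j]   # software RAID10 over the 8 virtual disks
mkfs.xfs -i size=512 /dev/md0                                      # same mkfs options as the HW RAID setup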

Interestingly, we were also seeing better performance on a fully HW RAID10
array with WT/NORA... so (I guess) our Hitachi SSDs probably have lower
latency than the S3500s do.

Cheers

Mark


Re: 8xIntel S3500 SSD in RAID10 on Dell H710p

From:
Merlin Moncure
Date:
On Wed, Dec 10, 2014 at 2:30 AM, Strahinja Kustudić
<strahinjak@nordeus.com> wrote:
> On Wed, Dec 10, 2014 at 4:55 AM, Mark Kirkwood
> <mark.kirkwood@catalyst.net.nz> wrote:
>>
>> That is interesting: I've done some testing on this type of card with 16
>> (slightly faster Hitachi) SSD attached. Setting WT and NORA should enable
>> the so-called 'fastpath' mode for the card [1]. We saw performance improve
>> markedly (300MB/s random write go to 1300MB/s).
>>
>> This *might* be related to the fact that 16 SSD can put out more IOPS than
>> the card can actually handle - whereas your 8 S3500 is probably the perfect
>> number (e.g 8*11000 = 88000 which the card can handle ok).
>>
>>
>> [1] If you make the change while there are no outstanding background
>> operations (array rebuild etc) in progress (see
>> http://www.flagshiptech.com/eBay/Dell/poweredgeh310h710h810UsersGuide.pdf).
>
>
> I read that guide too, which is the reason why I tried with WT/NORA, but the
> document also states:  "NOTE: RAID 10, RAID 50, and RAID 60 virtual disks
> cannot use FastPath." Which is a little odd, since usually if you want
> performance with reliability, you go RAID10.
>
> Do you have any suggestions what I could try to tweak to get more
> performance?

Definitely crank effective_io_concurrency. It will not help the stock
pgbench tests, since they don't involve bitmap heap scans, but when it
kicks in it's much faster.

http://www.postgresql.org/message-id/CAHyXU0yiVvfQAnR9cyH=HWh1WbLRsioe=mzRJTHwtr=2azsTdQ@mail.gmail.com
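
A sketch of a starting point (the exact value needs benchmarking against your workload; it defaults to 1 and can be changed with just a reload):

# in postgresql.conf; a common heuristic is roughly the number of
# data-bearing drives, and SSDs often benefit from much higher values
effective_io_concurrency = 8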

As it pertains to random read performance, I think you'll find that
you're getting pretty close to maxing out what the computer is
basically capable of; I highly doubt you'll be read-bound on storage
for any application, so the classic techniques of optimizing queries,
indexes and tables are where to focus your energy. Sequential writes
will also be no problem.

The only area where the S3500 falls short is random writes. If your
random write I/O requirements are extreme, you've bought the wrong
drive; I'd have shelled out for the S3700. But it's never too late:
you can add one and move the high-write-activity tables to an
S3700-backed tablespace.
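
The move itself is straightforward; a sketch, with the mount point, database and table names all hypothetical:

psql -U postgres -c "CREATE TABLESPACE fast_ssd LOCATION '/mnt/s3700/pgdata'"   # directory owned by postgres
psql -U postgres -d appdb -c "ALTER TABLE hot_table SET TABLESPACE fast_ssd"    # rewrites the table under an exclusive lock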

merlin


Re: 8xIntel S3500 SSD in RAID10 on Dell H710p

From:
"Graeme B. Bell"
Date:
> 
> I have a beast of a Dell server with the following specifications:
>     • 4x Xeon E5-4657LV2 (48 cores total)
>     • 196GB RAM
>     • 2x SCSI 900GB in RAID1 (for the OS)
>     • 8x Intel S3500 SSD 240GB in RAID10
>     • H710p RAID controller, 1GB cache
> Centos 6.6, RAID10 SSDs uses XFS (mkfs.xfs -i size=512 /dev/sdb).

Things to check

- disk cache settings (EnDskCache: for SSDs it should be on, or you're going to lose 90% of your performance; see the MegaCli sketch after this list)

- OS settings e.g. 

echo noop > /sys/block/sda/queue/scheduler     # no-op I/O scheduler: let the controller/SSDs order requests
echo 975 > /sys/block/sda/queue/nr_requests    # deepen the block-layer request queue
blockdev --setra 16384 /dev/sdb                # readahead of 16384 sectors (8MB)

- OS kernel version 

We use H710Ps with SSDs as well, and these settings make a measurable difference to our performance here (though we measure more than just pgbench, since it's a poor proxy for our use cases).
 

Also

- SSDs - is the filesystem aligned and the block size chosen correctly (you don't want to be forced to read 2 blocks of SSD to get every data block)? RAID stripe size? May make a small difference.
 

- are the SSDs all sitting on different SATA channels? You don't want them to be forced to share one channel's worth of bandwidth. The H710P has 8 SATA channels I think (?) and you mention 10 devices above.
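
As promised, the disk cache check. This assumes MegaCli works against the H710P (it's an LSI MegaRAID rebrand, so it usually does); the exact flags are from memory, so verify them against your MegaCli version first:

MegaCli -LDGetProp -DskCache -LAll -aAll     # show the current disk cache policy per virtual disk
MegaCli -LDSetProp -EnDskCache -LAll -aAll   # enable the drives' own write cache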
 

Graeme Bell.



Re: 8xIntel S3500 SSD in RAID10 on Dell H710p

From:
Strahinja Kustudić
Date:
- disk cache settings (EnDskCache - for SSD should be on or you're going to lose 90% of your performance)

Disk cache is enabled; I know it has a huge performance impact.
 
- OS settings e.g.

echo noop > /sys/block/sda/queue/scheduler
echo 975 > /sys/block/sda/queue/nr_requests
blockdev --setra 16384 /dev/sdb

I'll try to play with these as well. I haven't tried noop yet, but it was on my checklist. 
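
For what it's worth, the active scheduler can be checked per device before and after switching (assuming /dev/sdb is the SSD array):

cat /sys/block/sdb/queue/scheduler   # the active scheduler is shown in brackets
# e.g.: noop deadline [cfq]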

 
- OS kernel version

We use H710Ps with SSDs as well, and these settings make a measurable difference to our performance here (though we measure more than just pgbench since it's a poor proxy for our use cases).

Also

- SSDs - is the filesystem aligned and block size chosen correctly (you don't want to be forced to read 2 blocks of SSD to get every data block)? RAID stripe size? May make a small difference.

I read a lot about filesystem alignment, and as far as I understand, if I just format the whole drive without creating any partitions, everything should be aligned. I'll check the data block size of the SSDs and try to set the filesystem block size to match, but I'm not exactly sure how block size interacts with RAID10; it seems logical that it would work differently.
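
Along those lines, a few things can be checked directly (a sketch: the mount point and the su/sw values are placeholders, not measured from this array):

cat /sys/block/sdb/queue/physical_block_size   # native block size of the virtual disk
cat /sys/block/sdb/queue/optimal_io_size       # may report the RAID stripe width (0 if the controller hides it)
xfs_info /var/lib/pgsql                        # sunit/swidth show the stripe geometry XFS assumed
# if re-creating the filesystem, the stripe geometry can be given explicitly,
# e.g. a 64k stripe element across the 4 data-bearing drives of the RAID10:
mkfs.xfs -f -d su=64k,sw=4 -i size=512 /dev/sdb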
 

- are the SSDs all sitting on different SATA channels? You don't want them to be forced to share one channel's worth of bandwidth. The H710P has 8 SATA channels I think (?) and you mention 10 devices above.

Good question. Since the server is not physically at my location, I will have to check with the people who assembled it.


Thanks for your help. Once I do more testing, I'll follow up with details on what helped.