Discussion: Comments requested on IO performance : new db server
I've taken the liberty of reposting this message, as my addendum to a
long thread that I started on the subject of adding a new db server to
our existing 4-year old workhorse got lost in discussion.

Our workload is several small databases totalling less than 40GB of disk
space. The proposed system has 48GB RAM, 2 * quad core E5620 @ 2.40GHz
and 4 WD Raptors behind an LSI SAS card. Our supplier has just run a set
of tests on the machine we intend to buy. The test rig had the following
setup:

LSI MegaRAID SAS 9260-8i
Firmware: 12.12.0-0090
Kernel: 2.6.39.4
Hard disks: 4x WD6000BLHX
Test done on 256GB volume
BS = blocksize in bytes

The test tool is fio. I'd be grateful to know if the results below are
considered acceptable. An ancillary question is whether a 4096 block
size is a good idea. I suppose we will be using XFS which I understand
has a default block size of 4096 bytes.

RAID 10
--------------------------------------
Read sequential

BS       MB/s      IOPs
512      0129.26   264730.80
1024     0229.75   235273.40
4096     0363.14   092965.50
16384    0475.02   030401.50
65536    0472.79   007564.65
131072   0428.15   003425.20
--------------------------------------
Write sequential

BS       MB/s      IOPs
512      0036.08   073908.00
1024     0065.61   067192.60
4096     0170.15   043560.40
16384    0219.80   014067.57
65536    0240.05   003840.91
131072   0243.96   001951.74
--------------------------------------
Random read

BS       MB/s      IOPs
512      0001.50   003077.20
1024     0002.91   002981.40
4096     0011.59   002968.30
16384    0044.50   002848.28
65536    0156.96   002511.41
131072   0170.65   001365.25
--------------------------------------
Random write

BS       MB/s      IOPs
512      0000.53   001103.60
1024     0001.15   001179.20
4096     0004.43   001135.30
16384    0017.61   001127.56
65536    0061.39   000982.39
131072   0079.27   000634.16
--------------------------------------

--
Rory Campbell-Lange
rory@campbell-lange.net

Campbell-Lange Workshop
www.campbell-lange.net
0207 6311 555
3 Tottenham Street London W1T 2AF
Registered in England No. 04551928
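For anyone reading the fio tables: the MB/s column appears to be simply IOPs multiplied by block size, expressed in units of 2^20 bytes and truncated to two decimals. The sketch below checks that against the random write rows quoted in this message. The function and variable names are illustrative only, and the 2^20 unit is an inference from the numbers, not something the supplier's report states.

```python
# Sanity check: does MB/s in the fio table equal IOPs * block size?
# Assumption (inferred from the figures, not stated in the report):
# "MB" here means 2**20 bytes.

MIB = 1024 * 1024

def throughput_mib_s(block_size_bytes, iops):
    """Throughput implied by an IOPs figure at a given block size."""
    return block_size_bytes * iops / MIB

# (block size in bytes, reported MB/s, reported IOPs), random write, RAID 10
RANDOM_WRITE = [
    (512, 0.53, 1103.60),
    (1024, 1.15, 1179.20),
    (4096, 4.43, 1135.30),
    (16384, 17.61, 1127.56),
    (65536, 61.39, 982.39),
    (131072, 79.27, 634.16),
]

for bs, reported, iops in RANDOM_WRITE:
    derived = throughput_mib_s(bs, iops)
    print(f"bs={bs:>6}  reported={reported:6.2f}  derived={derived:6.2f}")
```

The practical reading is that the two columns carry the same information, so for random I/O the IOPs column is the one to compare between machines.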
On Fri, Mar 9, 2012 at 5:15 AM, Rory Campbell-Lange
<rory@campbell-lange.net> wrote:
> Our workload is several small databases totalling less than 40GB of disk
> space. The proposed system has 48GB RAM, 2 * quad core E5620 @ 2.40GHz
> and 4 WD Raptors behind an LSI SAS card.
>
> [...]
>
> RAID 10
> --------------------------------------
> [...]
> --------------------------------------
> Random write
>
> BS       MB/s      IOPs
> 512      0000.53   001103.60
> 1024     0001.15   001179.20
> 4096     0004.43   001135.30
> 16384    0017.61   001127.56
> 65536    0061.39   000982.39
> 131072   0079.27   000634.16
> --------------------------------------

Since your RAM is larger than the database size, read performance is
essentially a non-issue. Your major gating factors are going to be
CPU-bound queries and random writes. 1000 IOPS essentially puts an
upper bound on your write TPS, especially if your writes are frequent
and randomly distributed, which is the case that is more or less
simulated by pgbench with large scaling factors.

Now, 1000 write TPS is quite a lot (3.6 million transactions/hour), and
your workload will drive the hardware consideration.

merlin
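The arithmetic behind the 3.6 million figure can be sketched as follows. This is only a back-of-envelope model: the one-random-write-per-transaction default is an assumption made here for illustration (real transactions may cost more or fewer physical writes, and a write-back cache changes the picture), and the function names are made up for the example.

```python
# Back-of-envelope ceiling on write transactions from random write IOPS.
# Assumption: each transaction costs a fixed number of random writes
# (one by default). This is a simplification, not a measured figure.

SECONDS_PER_HOUR = 3600

def write_tps_ceiling(random_write_iops, writes_per_transaction=1.0):
    """Upper bound on write transactions per second."""
    return random_write_iops / writes_per_transaction

def transactions_per_hour(tps):
    """Convert a per-second rate into a per-hour rate."""
    return tps * SECONDS_PER_HOUR

# The rounded figure used in the reply above
print(transactions_per_hour(write_tps_ceiling(1000)))

# The measured 4096-byte random write figure from the fio table
print(transactions_per_hour(write_tps_ceiling(1135.30)))
```

If a typical transaction in the real workload touches several pages, raising `writes_per_transaction` shows how quickly the ceiling drops.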
On 09/03/12, Merlin Moncure (mmoncure@gmail.com) wrote:
> since your RAM is larger than the database size, read performance is
> essentially a non-issue. your major gating factors are going to be
> cpu bound queries and random writes -- 1000 IOPS essentially puts an
> upper bound on your write TPS, especially if your writes are frequent
> and randomly distributed, the case that is more or less simulated by
> pgbench with large scaling factors.
>
> Now, 1000 write tps is quite alot (3.6 mil transactions/hour) and
> your workload will drive the hardware consideration.

Thanks for your comments, Merlin. With regard to the "gating factors" I
believe the following is pertinent:

CPU

My current server has 2 * quad Xeon E5420 @ 2.50GHz. The server
occasionally reaches 20% sustained utilisation according to sar.
This CPU has a "passmark" of 7,730.
http://www.cpubenchmark.net/cpu_lookup.php?cpu=[Dual+CPU]+Intel+Xeon+E5420+%40+2.50GHz

My proposed CPU is an E5620 @ 2.40GHz with a CPU "passmark" of 9,620.
http://www.cpubenchmark.net/cpu_lookup.php?cpu=[Dual+CPU]+Intel+Xeon+E5620+%40+2.40GHz

Since the workload will be very similar I'm hoping for about 20% better
CPU performance from the new server, which should drop max CPU load by
5% or so.

Random Writes

I'll have to test this. My current server (R10 4*15K SCSI) produced the
following pgbench stats while running its normal workload:

  -c     -t    TPS
   5  20000    446
  10  10000    542
  20   5000    601
  30   3333    647

I'd be grateful to know what parameters I should use for a "large
scaling factor" pgbench test.

Many thanks
Rory
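As a rough check on the CPU estimate, the two passmark scores quoted above can be turned into a projected utilisation figure. The sketch assumes utilisation scales inversely with the passmark score for the same workload, which is a crude assumption (synthetic benchmarks rarely track database load closely); the names are illustrative only.

```python
# Rough projection of CPU utilisation on the new server.
# Assumption: for an unchanged workload, utilisation scales inversely
# with the passmark score. Treat the result as indicative only.

OLD_PASSMARK = 7730   # 2 * Xeon E5420 @ 2.50GHz, as quoted
NEW_PASSMARK = 9620   # 2 * Xeon E5620 @ 2.40GHz, as quoted

def speedup(old_score, new_score):
    """Relative performance of the new CPU against the old one."""
    return new_score / old_score

def projected_utilisation(current_pct, old_score, new_score):
    """Utilisation expected on the new CPU for the same workload."""
    return current_pct / speedup(old_score, new_score)

ratio = speedup(OLD_PASSMARK, NEW_PASSMARK)
peak = projected_utilisation(20.0, OLD_PASSMARK, NEW_PASSMARK)
print(f"speedup: {ratio:.2f}x, projected peak utilisation: {peak:.1f}%")
```

On those scores the ratio works out to a little over 1.24, so a 20% peak would fall to roughly 16%, a drop of about four percentage points, which is in line with the "5% or so" hoped for above.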
Is a block size of 4096 a good idea both for the filesystem and
postgresql? The analysis here:
http://www.fuzzy.cz/en/articles/benchmark-results-hdd-read-write-pgbench/
appears to suggest that at least for database block sizes of 4096
read/write performance is much higher than for smaller block sizes.

Rory

On 09/03/12, Rory Campbell-Lange (rory@campbell-lange.net) wrote:
> ...An ancillary question is whether a 4096 block size is a good idea.
> I suppose we will be using XFS which I understand has a default block
> size of 4096 bytes.
On 10.3.2012 11:51, Rory Campbell-Lange wrote:
> Is a block size of 4096 a good idea both for the filesystem and
> postgresql? The analysis here:
> http://www.fuzzy.cz/en/articles/benchmark-results-hdd-read-write-pgbench/
> appears to suggest that at least for database block sizes of 4096
> read/write performance is much higher than for smaller block sizes.

Hi,

interpreting those results is a bit tricky for several reasons.

First, those are 'average results' for all filesystems (and the behavior
of filesystems may vary significantly). I'd recommend checking results
for the filesystem you're going to use (http://www.fuzzy.cz/bench).

Second, the article discusses just TPC-B (OLTP-like) workload results.
It's quite probable your workload is going to mix that with other
workload types (e.g. DSS/DWH). And that's exactly where larger block
sizes are better.

To me, 8kB seems like a good compromise. Don't use other block sizes
unless you actually test the benefits for your workload.

Tomas