Discussion: What's wrong with JFS configuration?
Hello! I have a strange situation. I'm testing the performance of a PostgreSQL database on different filesystems (ext2, ext3, jfs) and I can't say that JFS is as much faster as it is said to be. My test setup looks like this:

Server: 2 x Xeon 2.4GHz, 2GB RAM, 8 x SCSI HDD configured in RAID arrays like this:

Unit  UnitType  Status  %Cmpl  Stripe  Size(GB)  Cache  AVerify  IgnECC
------------------------------------------------------------------------------
u0    RAID-10   OK      -      64K     467.522   ON     -        -
u6    RAID-1    OK      -      -       298.09    ON     -        -

Port  Status       Unit  Size       Blocks     Serial
---------------------------------------------------------------
p0    OK           u0    233.76 GB  490234752  Y634Y1DE
p1    OK           u0    233.76 GB  490234752  Y636TR9E
p2    OK           u0    233.76 GB  490234752  Y64VZF1E
p3    OK           u0    233.76 GB  490234752  Y64G8HRE
p4    NOT-PRESENT  -     -          -          -
p5    OK           -     233.76 GB  490234752  Y63YMSNE
p6    OK           u6    298.09 GB  625142448  3QF08HFF
p7    OK           u6    298.09 GB  625142448  3QF08HHW

where u6 stores the Fedora Core 6 operating system, and u0 stores 3 partitions with ext2, ext3 and jfs filesystems. The PostgreSQL 8.2 engine is installed on the system partition (u6 in the RAID) and runs with its data directory on a different filesystem partition for each particular test.

To test, I use pgbench with the default database schema, run for 25, 50 and 75 users at one time. I run every test 5 times to take an average. Unfortunately, my results show that ext2 is fastest, while ext3 and jfs are very similar. I can understand that ext2 without journaling is faster than ext3, but jfs is said to be 40-60% faster, and I can't see the difference.
Part of my results (transaction type | scaling factor | num of clients | tpl (transactions per client) | num of transactions | tps including connection time | tps excluding connection time):

EXT2:
TPC-B (sort of),50,75,13,975|975,338.286682,358.855582
TPC-B (sort of),50,75,133,9975|9975,126.777438,127.023687
TPC-B (sort of),50,75,1333,99975|99975,125.612325,125.636193

EXT3:
TPC-B (sort of),50,75,13,975|975,226.139237,244.619009
TPC-B (sort of),50,75,133,9975|9975,88.678922,88.935371
TPC-B (sort of),50,75,1333,99975|99975,79.126892,79.147423

JFS:
TPC-B (sort of),50,75,13,975|975,235.626369,255.863271
TPC-B (sort of),50,75,133,9975|9975,88.408323,88.664584
TPC-B (sort of),50,75,1333,99975|99975,81.003394,81.024297

Can anyone tell me what's wrong with my test? Or maybe it is normal?

Pawel Gruszczynski
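[As a quick arithmetic sanity check on those result lines (the exact pgbench invocation isn't in the original post, so this only verifies the reported counts): the per-client transaction count times the 75 clients matches the totals shown.]

```shell
# 75 clients x transactions-per-client should equal the totals in the
# results above (975, 9975, 99975).
clients=75
for t in 13 133 1333; do
  echo "$t transactions/client -> $((clients * t)) total"
done
```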
Paweł Gruszczyński wrote:
> To test, I use pgbench with the default database schema, run for 25, 50
> and 75 users at one time. I run every test 5 times to take an average.
> Unfortunately, my results show that ext2 is fastest, while ext3 and jfs
> are very similar. I can understand that ext2 without journaling is
> faster than ext3, but jfs is said to be 40-60% faster, and I can't see
> the difference.
>
> Part of my results (transaction type | scaling factor | num of clients
> | tpl | num of transactions | tps including connection time | tps
> excluding connection time):
>
> EXT2:
>
> TPC-B (sort of),50,75,13,975|975,338.286682,358.855582
> ...
>
> Can anyone tell me what's wrong with my test? Or maybe it is normal?

With a scaling factor of 50, your database size is ~1 GB, which fits comfortably in your RAM. You're not exercising your drives or filesystem much. Assuming you haven't disabled fsync, the performance of that test is bound by the speed at which your drives can flush WAL commit records to disk.

I wouldn't expect the filesystem to make a big difference anyway, but you'll see..

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
On 25-Apr-07, at 4:54 AM, Heikki Linnakangas wrote:
> Paweł Gruszczyński wrote:
>> To test, I use pgbench with the default database schema, run for 25,
>> 50 and 75 users at one time. I run every test 5 times to take an
>> average. Unfortunately, my results show that ext2 is fastest, while
>> ext3 and jfs are very similar. ...
>> Can anyone tell me what's wrong with my test? Or maybe it is normal?
>
> With a scaling factor of 50, your database size is ~1 GB, which
> fits comfortably in your RAM. You're not exercising your drives or
> filesystem much. Assuming you haven't disabled fsync, the
> performance of that test is bound by the speed at which your drives
> can flush WAL commit records to disk.
>
> I wouldn't expect the filesystem to make a big difference anyway,
> but you'll see..

If you really believe that jfs is 40-60% faster (which I highly doubt), you should see this by simply reading/writing a very large file (2x your memory size) with dd.

Just curious, but what data do you have that suggests this 40-60% number?

Dave
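[A minimal sketch of the dd check Dave describes. TESTDIR and SIZE_MB are placeholders, not from the thread: point TESTDIR at each filesystem under test and set SIZE_MB to roughly twice the machine's RAM (4096 for the 2 GB server here), so the page cache can't hide the disks.]

```shell
# Sequential write/read throughput check with dd (a sketch, not a full
# benchmark). conv=fdatasync makes dd flush to disk before it reports
# a throughput figure.
TESTDIR=${TESTDIR:-/tmp}   # e.g. /fs/ext2, /fs/ext3, /fs/jfs
SIZE_MB=${SIZE_MB:-4}      # set to ~2x RAM for a real run (e.g. 4096)

dd if=/dev/zero of="$TESTDIR/ddtest" bs=1M count="$SIZE_MB" conv=fdatasync

# For the read pass, drop the page cache first (as root) so dd reads
# from the disks rather than from memory:
#   sync; echo 3 > /proc/sys/vm/drop_caches
dd if="$TESTDIR/ddtest" of=/dev/null bs=1M

rm -f "$TESTDIR/ddtest"
```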
Alexander Staubo wrote:
> On 4/25/07, Paweł Gruszczyński <pawel.gruszczynski@inea.com.pl> wrote:
>> I have a strange situation. I'm testing the performance of a PostgreSQL
>> database on different filesystems (ext2, ext3, jfs) and I can't say
>> that JFS is as much faster as it is said to be.
>
> I don't know about 40-60% faster, but JFS is known to be a fast, good
> file system -- faster than other file systems for some things, slower
> for others. It's particularly known for putting a somewhat lower load
> on the CPU than most other journaling file systems.
>
> Alexander.

I was just reading some information on the web (for example: http://www.nabble.com/a-comparison-of-ext3,-jfs,-and-xfs-on-hardware-raid-t144738.html). My test was supposed to tell me whether it's true, but now I see that my test method is basically fine and the gain from using JFS is just not that high.

Pawel
pawel.gruszczynski@inea.com.pl (Paweł Gruszczyński) writes:
> To test, I use pgbench with the default database schema, run for 25, 50
> and 75 users at one time. I run every test 5 times to take an average.
> Unfortunately, my results show that ext2 is fastest, while ext3 and jfs
> are very similar. I can understand that ext2 without journaling is
> faster than ext3, but jfs is said to be 40-60% faster, and I can't see
> the difference.
>
> Part of my results (transaction type | scaling factor | num of clients
> | tpl | num of transactions | tps including connection time | tps
> excluding connection time):
>
> EXT2:
>
> TPC-B (sort of),50,75,13,975|975,338.286682,358.855582
> TPC-B (sort of),50,75,133,9975|9975,126.777438,127.023687
> TPC-B (sort of),50,75,1333,99975|99975,125.612325,125.636193
>
> EXT3:
>
> TPC-B (sort of),50,75,13,975|975,226.139237,244.619009
> TPC-B (sort of),50,75,133,9975|9975,88.678922,88.935371
> TPC-B (sort of),50,75,1333,99975|99975,79.126892,79.147423
>
> JFS:
>
> TPC-B (sort of),50,75,13,975|975,235.626369,255.863271
> TPC-B (sort of),50,75,133,9975|9975,88.408323,88.664584
> TPC-B (sort of),50,75,1333,99975|99975,81.003394,81.024297
>
> Can anyone tell me what's wrong with my test? Or maybe it is normal?

For one thing, this test is *probably* staying mostly in memory. That will be skewing results away from measuring anything about the filesystem.

When I did some testing of comparative Linux filesystem performance, back in 2003, I found that JFS was maybe 20% faster on a "write-only" workload than XFS, which was a few percent faster than ext3. The differences weren't terribly large.

If you're seeing such huge differences with pgbench (which includes read load, which should be virtually unaffected by one's choice of filesystem), then I can only conclude that something about your testing methodology is magnifying the differences.
-- let name="cbbrowne" and tld="cbbrowne.com" in String.concat "@" [name;tld];; http://cbbrowne.com/info/oses.html "On the Internet, no one knows you're using Windows NT" -- Ramiro Estrugo, restrugo@fateware.com
On Wed, 25 Apr 2007, Paweł Gruszczyński wrote:

> I was just reading some information on the web (for example:
> http://www.nabble.com/a-comparison-of-ext3,-jfs,-and-xfs-on-hardware-raid-t144738.html).

You were doing your tests with a database scale of 50. As Heikki already pointed out, that's pretty small (around 800MB), and you're mostly stressing parts of the system that may not change much based on filesystem choice. This is even more true when some of your tests run only a small number of transactions in a short period of time, which means just about everything could still be sitting in memory at the end of the test with the database disks barely used.

In the example you reference above, a scaling factor of 1000 was used. This makes for a fairly large database of about 16GB. When running in that configuration, as stated, he's mostly testing seek performance--you can't hold any significant portion of 16GB in memory, so you're always moving around the disks to find the data needed. It's a completely different type of test than what you did.

If you want to try and replicate the filesystem differences shown on that page, start with the bonnie++ tests and see if you get similar results there. It's hard to predict whether you'll see the same differences given how different your RAID setup is from Jeff Baker's tests. And even then, it's not a quick trip from a bonnie++ improvement to confirming that it holds up in database use under a real-world load. In addition to addressing the scaling factor issue, you'll need to do some basic PostgreSQL parameter tuning from the defaults, think about the impact of checkpoints on your test, and worry about whether your WAL I/O is being done efficiently before you get to the point where the database I/O is being measured usefully at all via pgbench.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
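[The sizes Heikki and Greg quote follow from pgbench's rough sizing rule: each unit of scaling factor adds 100,000 pgbench_accounts rows, roughly 16 MB on disk. The 16 MB/unit figure is an approximation that varies a little by PostgreSQL version, so treat this as back-of-envelope only.]

```shell
# Back-of-envelope pgbench database sizing: ~16 MB per scale unit.
# scale 50 -> ~800 MB (fits in 2 GB RAM); scale 1000 -> ~16 GB (doesn't).
for scale in 50 1000; do
  echo "scale $scale -> ~$((scale * 16)) MB"
done
```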
On Apr 25, 2007, at 8:51 AM, Paweł Gruszczyński wrote: > where u6 stores Fedora Core 6 operating system, and u0 stores 3 > partitions with ext2, ext3 and jfs filesystem. Keep in mind that drives have a faster data transfer rate at the outer-edge than they do at the inner edge, so if you've got all 3 filesystems sitting on that array at the same time it's not a fair test. I heard numbers on the impact of this a *long* time ago and I think it was in the 10% range, but I could be remembering wrong. You'll need to drop each filesystem and create the next one to get a fair comparison. -- Jim Nasby jim@nasby.net EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
The outer track / inner track performance ratio is more like 40 percent. A recent example is 78MB/s outer and 44MB/s inner for the new Seagate 750GB drive (see http://www.storagereview.com for benchmark results).
- Luke
Msg is shrt cuz m on ma treo
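[Those two throughput figures do work out to roughly Luke's 40 percent:]

```shell
# Inner vs. outer track throughput from the figures above (78 vs 44 MB/s).
awk 'BEGIN { outer = 78; inner = 44;
             printf "inner edge is %.0f%% slower than outer\n",
                    (outer - inner) / outer * 100 }'
```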
Jim Nasby wrote:
> On Apr 25, 2007, at 8:51 AM, Paweł Gruszczyński wrote:
>> where u6 stores Fedora Core 6 operating system, and u0 stores 3
>> partitions with ext2, ext3 and jfs filesystem.
>
> Keep in mind that drives have a faster data transfer rate at the
> outer-edge than they do at the inner edge [...]

I've been wondering from time to time whether a partition's position can be a (probably modest, of course) performance factor. If I create a partition at the beginning or end of the disk, is it going to end up at a predictable physical position on the platters? I remember having heard that every manufacturer has its own allocation logic. Has anyone got some information, just for curiosity?

--
Cosimo
Adding -performance back in so others can learn.

On Apr 26, 2007, at 9:40 AM, Paweł Gruszczyński wrote:
> Jim Nasby wrote:
>> On Apr 25, 2007, at 8:51 AM, Paweł Gruszczyński wrote:
>>> where u6 stores the Fedora Core 6 operating system, and u0 stores 3
>>> partitions with ext2, ext3 and jfs filesystems.
>>
>> Keep in mind that drives have a faster data transfer rate at the
>> outer edge than they do at the inner edge, so if you've got all 3
>> filesystems sitting on that array at the same time it's not a fair
>> test. I heard numbers on the impact of this a *long* time ago and
>> I think it was in the 10% range, but I could be remembering wrong.
>>
>> You'll need to drop each filesystem and create the next one to get
>> a fair comparison.
>
> I thought about it, but my situation is not so clear, because my hard
> drive for PostgreSQL data is rather "logical" because of the RAID
> array in mode 1+0. My RAID array is divided like this:
>
>   Device Boot    Start       End     Blocks  Id System
>   /dev/sda1          1    159850  163686384  83 Linux
>   /dev/sda2     159851    319431  163410944  83 Linux
>   /dev/sda3     319432    478742  163134464  83 Linux
>
> and the partitions are:
>
>   /dev/sda1  ext2  161117780  5781744  147151720  4%  /fs/ext2
>   /dev/sda2  ext3  160846452  2147848  150528060  2%  /fs/ext3
>   /dev/sda3  jfs   163096512  3913252  159183260  3%  /fs/jfs
>
> so if RAID 1+0 does not change anything, the JFS filesystem is on the
> third partition, which is at the end of the hard drive.

Yes, which means that JFS is going to be at a disadvantage to ext3, which will be at a disadvantage to ext2. You should really re-perform the tests with each filesystem in the same location.

> What about HDDs with two magnetic platters? Then the speed depending
> on the partition's physical location is more difficult to calculate ;)
> Probably the first is slow, the second is fast in its first half and
> slow in its second half, and the third is the fastest one. In both
> cases my JFS partition should be at the end of the magnetic disk. Am
> I wrong?
I'm not a HDD expert, but as far as I know the number of platters doesn't change anything. When you have multiple platters, the drive essentially splits bytes across all the platters; it doesn't start writing one platter, then switch to another platter. -- Jim Nasby jim@nasby.net EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)