Discussion: Really bad diskio
Hello all,

I'm running postgres 7.4.5 on a dual 2.4GHz Athlon with 1 GB of RAM and a 3Ware SATA RAID. Currently the database is only 16 GB, with two tables of 500000+ rows, one table of 200000+ rows and a few small tables. The larger tables get updated about every two hours. The problem I'm having with this server (which is in production) is the disk IO: on the larger tables I'm getting disk IO wait averages of ~70-90%. I've been tweaking the Linux kernel as recommended in the PostgreSQL documentation and switched to the deadline scheduler, but nothing seems to be fixing this. The queries are as optimized as I can get them, and fsync is off in an attempt to help performance, still with no improvement. Are there any settings I should be looking at that could improve on this?

Thanks for any help in advance.

Ron
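For anyone wanting to double-check the two settings mentioned above, a minimal sketch of how they are typically verified; the device name and data directory path are assumptions, not details from this thread:

    # Show the active I/O scheduler for the data volume (the entry in
    # brackets is the one in use); /dev/sda is an assumed device name.
    cat /sys/block/sda/queue/scheduler

    # Confirm what the running server thinks fsync is set to; the config
    # path is an assumed default location for a 7.4 install.
    grep -E '^[[:space:]]*fsync' /var/lib/pgsql/data/postgresql.conf
    psql -c "SHOW fsync;"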
Ron Wills wrote:
> I'm running postgres 7.4.5 on a dual 2.4GHz Athlon with 1 GB of RAM and
> a 3Ware SATA RAID.

2 drives? 4 drives? 8 drives?
RAID 1? 0? 10? 5?

--
Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240
PostgreSQL Replication, Consulting, Custom Programming, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/
On Fri, 2005-07-15 at 14:39 -0600, Ron Wills wrote:
> The problem I'm having with this server (which is in production) is the
> disk IO: on the larger tables I'm getting disk IO wait averages of
> ~70-90%.

Can you please characterize this a bit better? Send the output of vmstat or iostat over several minutes, or similar diagnostic information.

Also please describe your hardware more.

Regards,
Jeff Baker
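A minimal sketch of the kind of capture being asked for here; the interval and sample count are arbitrary choices, not taken from the thread:

    # CPU, memory and swap activity every 10 seconds for 10 minutes.
    vmstat 10 60 > vmstat.log

    # Per-device throughput over the same window; -x adds extended
    # statistics (utilisation, queue sizes) where sysstat supports it.
    iostat -x 10 60 > iostat.log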
At Fri, 15 Jul 2005 13:45:07 -0700, Joshua D. Drake wrote:
> 2 drives? 4 drives? 8 drives?
> RAID 1? 0? 10? 5?

3 drives, RAID 5. I don't believe it's the RAID. I've tested this by moving the database to the mirrored software RAID where the root is found, and onto the SATA RAID. Neither relieved the IO problems.

I was also thinking this could be from the transactional subsystem getting overloaded? There are several automated processes that use the DB. Most are just selects, but the data updates and the one that updates the smaller tables are the heavier queries. On their own they seem to work OK (still high IO, but fairly quick). But if even the simplest select is called during the heavier operation, then everything goes through the roof. Maybe there's something I'm missing here as well?
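For reference, moving a 7.4 cluster between volumes like this is normally done at the filesystem level with the server stopped; a rough sketch, where every path and the init script name are assumptions:

    # Stop the cluster so the copy is consistent.
    /etc/init.d/postgresql stop

    # Copy the data directory to the other array, preserving ownership
    # and permissions, then point the old location at the new copy.
    rsync -a /var/lib/pgsql/data/ /mnt/sw_mirror/pgdata/
    mv /var/lib/pgsql/data /var/lib/pgsql/data.old
    ln -s /mnt/sw_mirror/pgdata /var/lib/pgsql/data

    /etc/init.d/postgresql start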
On Jul 15, 2005, at 2:39 PM, Ron Wills wrote:
> I'm running postgres 7.4.5 on a dual 2.4GHz Athlon with 1 GB of RAM and
> a 3Ware SATA RAID.

Operating system? Which filesystem are you using? I was having a similar problem just a few days ago and learned that ext3 was the culprit.

-Dan
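A quick way to see which filesystem and mount options the data volume is actually using; the device and mount point here are assumptions. On ext3, remounting with noatime alone is sometimes a noticeable win for a read-heavy load:

    # Filesystem type and options for the volume holding the database.
    mount | grep /dev/sda

    # Example: stop per-read atime updates on the data volume; the
    # mount point is an assumed name.
    mount -o remount,noatime /mnt/pgdata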
On Fri, Jul 15, 2005 at 03:04:35PM -0600, Ron Wills wrote:
> 3 drives, RAID 5. I don't believe it's the RAID. I've tested this by
> moving the database to the mirrored software RAID where the root is
> found, and onto the SATA RAID. Neither relieved the IO problems.

What filesystem is this?

--
Alvaro Herrera (<alvherre[a]alvh.no-ip.org>)
If you don't know where you're going, it's very likely you'll end up somewhere else.
On Fri, 2005-07-15 at 15:04 -0600, Ron Wills wrote:
> 3 drives, RAID 5. I don't believe it's the RAID. I've tested this by
> moving the database to the mirrored software RAID where the root is
> found, and onto the SATA RAID. Neither relieved the IO problems.

Hard or soft RAID? Which controller? Many of the 3Ware controllers (85xx and 95xx) have extremely bad RAID 5 performance.

Did you take any pgbench or other benchmark figures before you started using the DB?

-jwb
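For anyone who wants the kind of baseline being asked about, a minimal pgbench run against a scratch database might look like the following; the scale factor, client count and database name are arbitrary:

    # Initialise a scratch database at scale factor 10 (roughly a
    # million rows in the accounts table).
    createdb pgbench_test
    pgbench -i -s 10 pgbench_test

    # 10 concurrent clients, 1000 transactions each; repeat a few runs
    # and keep the tps figures to compare against after any change.
    pgbench -c 10 -t 1000 pgbench_test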
At Fri, 15 Jul 2005 14:00:07 -0700,
Jeffrey W. Baker wrote:
> Can you please characterize this a bit better? Send the output of
> vmstat or iostat over several minutes, or similar diagnostic
> information.
>
> Also please describe your hardware more.
Here's a bit of a dump of the system that should be useful.
Processors x2:
vendor_id : AuthenticAMD
cpu family : 6
model : 8
model name : AMD Athlon(tm) MP 2400+
stepping : 1
cpu MHz : 2000.474
cache size : 256 KB
MemTotal: 903804 kB
Mandrake 10.0 Linux kernel 2.6.3-19mdk
The RAID controller, using its hardware RAID configuration:
3ware 9000 Storage Controller device driver for Linux v2.26.02.001.
scsi0 : 3ware 9000 Storage Controller
3w-9xxx: scsi0: Found a 3ware 9000 Storage Controller at 0xe8020000, IRQ: 17.
3w-9xxx: scsi0: Firmware FE9X 2.02.00.011, BIOS BE9X 2.02.01.037, Ports: 4.
Vendor: 3ware Model: Logical Disk 00 Rev: 1.00
Type: Direct-Access ANSI SCSI revision: 00
SCSI device sda: 624955392 512-byte hdwr sectors (319977 MB)
SCSI device sda: drive cache: write back, no read (daft)
This is also on a ReiserFS 3.6 filesystem.
Here's the iostat for 10 minutes at 10-second intervals. I've removed the stats
for the idle drives to reduce the size of this email.
Linux 2.6.3-19mdksmp (photo_server) 07/15/2005
avg-cpu: %user %nice %sys %iowait %idle
2.85 1.53 2.15 39.52 53.95
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 82.49 4501.73 188.38 1818836580 76110154
avg-cpu: %user %nice %sys %iowait %idle
0.30 0.00 1.00 96.30 2.40
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 87.80 6159.20 340.00 61592 3400
avg-cpu: %user %nice %sys %iowait %idle
2.50 0.00 1.45 94.35 1.70
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 89.60 5402.40 320.80 54024 3208
avg-cpu: %user %nice %sys %iowait %idle
1.00 0.10 1.35 97.55 0.00
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 105.20 5626.40 332.80 56264 3328
avg-cpu: %user %nice %sys %iowait %idle
0.40 0.00 1.00 87.40 11.20
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 92.61 4484.32 515.48 44888 5160
avg-cpu: %user %nice %sys %iowait %idle
0.45 0.00 1.00 92.66 5.89
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 89.10 4596.00 225.60 45960 2256
avg-cpu: %user %nice %sys %iowait %idle
0.30 0.00 0.80 96.30 2.60
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 86.49 3877.48 414.01 38736 4136
avg-cpu: %user %nice %sys %iowait %idle
0.50 0.00 1.00 98.15 0.35
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 97.10 4710.49 405.19 47152 4056
avg-cpu: %user %nice %sys %iowait %idle
0.35 0.00 1.00 98.65 0.00
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 93.30 5324.80 186.40 53248 1864
avg-cpu: %user %nice %sys %iowait %idle
0.40 0.00 1.10 96.70 1.80
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 117.88 5481.72 402.80 54872 4032
avg-cpu: %user %nice %sys %iowait %idle
0.50 0.00 1.05 98.30 0.15
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 124.00 6081.60 403.20 60816 4032
avg-cpu: %user %nice %sys %iowait %idle
8.75 0.00 2.55 84.46 4.25
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 125.20 5609.60 228.80 56096 2288
avg-cpu: %user %nice %sys %iowait %idle
2.25 0.00 1.30 96.00 0.45
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 176.98 6166.17 686.29 61600 6856
avg-cpu: %user %nice %sys %iowait %idle
5.95 0.00 2.25 88.09 3.70
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 154.55 7879.32 295.70 78872 2960
avg-cpu: %user %nice %sys %iowait %idle
10.29 0.00 3.40 81.97 4.35
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 213.19 11422.18 557.84 114336 5584
avg-cpu: %user %nice %sys %iowait %idle
1.90 0.10 3.25 94.75 0.00
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 227.80 12330.40 212.80 123304 2128
avg-cpu: %user %nice %sys %iowait %idle
0.55 0.00 0.85 96.80 1.80
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 96.30 3464.80 568.80 34648 5688
avg-cpu: %user %nice %sys %iowait %idle
0.70 0.00 1.10 97.25 0.95
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 92.60 4989.60 237.60 49896 2376
avg-cpu: %user %nice %sys %iowait %idle
2.75 0.00 2.10 93.55 1.60
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 198.40 10031.63 458.86 100216 4584
avg-cpu: %user %nice %sys %iowait %idle
0.65 0.00 2.40 95.90 1.05
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 250.25 14174.63 231.77 141888 2320
avg-cpu: %user %nice %sys %iowait %idle
0.60 0.00 2.15 97.20 0.05
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 285.50 12127.20 423.20 121272 4232
avg-cpu: %user %nice %sys %iowait %idle
0.60 0.00 2.90 95.65 0.85
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 393.70 14383.20 534.40 143832 5344
avg-cpu: %user %nice %sys %iowait %idle
0.55 0.00 2.15 96.15 1.15
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 252.15 11801.80 246.15 118136 2464
avg-cpu: %user %nice %sys %iowait %idle
0.75 0.00 3.45 95.15 0.65
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 396.00 19980.80 261.60 199808 2616
avg-cpu: %user %nice %sys %iowait %idle
0.70 0.00 2.70 95.70 0.90
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 286.20 14182.40 467.20 141824 4672
avg-cpu: %user %nice %sys %iowait %idle
0.70 0.00 2.70 95.65 0.95
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 344.20 15838.40 473.60 158384 4736
avg-cpu: %user %nice %sys %iowait %idle
0.75 0.00 1.70 97.50 0.05
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 178.72 7495.70 412.39 75032 4128
avg-cpu: %user %nice %sys %iowait %idle
1.05 0.05 1.30 97.05 0.55
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 107.89 4334.87 249.35 43392 2496
avg-cpu: %user %nice %sys %iowait %idle
0.55 0.00 1.30 98.10 0.05
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 107.01 6345.55 321.12 63392 3208
avg-cpu: %user %nice %sys %iowait %idle
0.65 0.00 1.05 97.55 0.75
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 107.79 3908.89 464.34 39128 4648
avg-cpu: %user %nice %sys %iowait %idle
0.50 0.00 1.15 97.75 0.60
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 109.21 4162.56 434.83 41584 4344
avg-cpu: %user %nice %sys %iowait %idle
0.75 0.00 1.15 98.00 0.10
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 104.19 4796.81 211.58 48064 2120
avg-cpu: %user %nice %sys %iowait %idle
0.70 0.00 1.05 97.85 0.40
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 105.50 4690.40 429.60 46904 4296
avg-cpu: %user %nice %sys %iowait %idle
0.75 0.00 1.10 98.15 0.00
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 107.51 4525.33 357.96 45208 3576
avg-cpu: %user %nice %sys %iowait %idle
2.80 0.00 1.65 92.81 2.75
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 123.18 3810.59 512.29 38144 5128
avg-cpu: %user %nice %sys %iowait %idle
0.60 0.00 1.05 97.10 1.25
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 104.60 3780.00 236.00 37800 2360
avg-cpu: %user %nice %sys %iowait %idle
0.70 0.00 1.10 95.96 2.25
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 117.08 3817.78 466.73 38216 4672
avg-cpu: %user %nice %sys %iowait %idle
0.65 0.00 0.90 96.65 1.80
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 117.20 3629.60 477.60 36296 4776
avg-cpu: %user %nice %sys %iowait %idle
0.80 0.00 1.10 97.50 0.60
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 112.79 4258.94 326.07 42632 3264
avg-cpu: %user %nice %sys %iowait %idle
1.05 0.15 1.20 97.50 0.10
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 125.83 2592.99 522.12 25904 5216
avg-cpu: %user %nice %sys %iowait %idle
0.60 0.00 0.55 98.20 0.65
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 104.90 823.98 305.29 8248 3056
avg-cpu: %user %nice %sys %iowait %idle
0.50 0.00 0.65 98.75 0.10
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 109.80 734.40 468.80 7344 4688
avg-cpu: %user %nice %sys %iowait %idle
1.15 0.00 1.05 97.75 0.05
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 107.70 751.20 463.20 7512 4632
avg-cpu: %user %nice %sys %iowait %idle
6.50 0.00 1.85 90.25 1.40
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 98.00 739.14 277.08 7384 2768
avg-cpu: %user %nice %sys %iowait %idle
0.20 0.00 0.40 82.75 16.65
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 83.13 550.90 360.08 5520 3608
avg-cpu: %user %nice %sys %iowait %idle
2.65 0.30 2.15 82.91 11.99
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 100.00 1136.46 503.50 11376 5040
avg-cpu: %user %nice %sys %iowait %idle
1.00 6.25 2.15 89.70 0.90
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 170.17 4106.51 388.39 41024 3880
avg-cpu: %user %nice %sys %iowait %idle
0.75 0.15 1.75 73.70 23.65
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 234.60 5107.20 232.80 51072 2328
avg-cpu: %user %nice %sys %iowait %idle
0.15 0.00 0.65 49.48 49.73
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 175.52 1431.37 122.28 14328 1224
avg-cpu: %user %nice %sys %iowait %idle
0.15 0.00 0.55 50.22 49.08
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 173.50 1464.00 119.20 14640 1192
avg-cpu: %user %nice %sys %iowait %idle
2.00 0.00 0.60 76.18 21.22
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 130.60 1044.80 203.20 10448 2032
avg-cpu: %user %nice %sys %iowait %idle
0.90 0.10 0.75 97.55 0.70
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 92.09 1024.22 197.80 10232 1976
avg-cpu: %user %nice %sys %iowait %idle
0.25 0.00 0.40 73.78 25.57
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 92.81 582.83 506.99 5840 5080
avg-cpu: %user %nice %sys %iowait %idle
0.20 0.00 0.55 98.85 0.40
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 90.80 657.60 383.20 6576 3832
avg-cpu: %user %nice %sys %iowait %idle
16.46 0.00 4.25 77.09 2.20
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 99.60 1174.83 549.85 11760 5504
avg-cpu: %user %nice %sys %iowait %idle
8.05 0.00 2.60 56.92 32.43
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 172.30 2063.20 128.00 20632 1280
avg-cpu: %user %nice %sys %iowait %idle
20.84 0.00 4.75 52.82 21.59
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 174.30 1416.80 484.00 14168 4840
avg-cpu: %user %nice %sys %iowait %idle
1.30 0.00 1.60 56.93 40.17
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 181.02 2858.74 418.78 28616 4192
avg-cpu: %user %nice %sys %iowait %idle
19.17 0.00 4.44 49.78 26.61
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 162.20 1286.40 373.60 12864 3736
avg-cpu: %user %nice %sys %iowait %idle
0.15 0.00 0.60 50.85 48.40
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 178.08 1436.64 97.70 14352 976
At Fri, 15 Jul 2005 14:17:34 -0700, Jeffrey W. Baker wrote:
> Hard or soft RAID? Which controller? Many of the 3Ware controllers
> (85xx and 95xx) have extremely bad RAID 5 performance.
>
> Did you take any pgbench or other benchmark figures before you started
> using the DB?

No, unfortunately. I'm more or less just the developer for the automation systems, and I admin the system to keep everything going. I have very little say in the hardware used and I don't have any physical access to the machine; it's found a province over :P. But for what this system is, this IO seems unreasonable. I run development on a 1.4GHz Athlon Gentoo system with no RAID, and I can't reproduce this kind of IO :(.
On Fri, 2005-07-15 at 15:29 -0600, Ron Wills wrote:
> This is also on a ReiserFS 3.6 filesystem.
>
> avg-cpu: %user %nice %sys %iowait %idle
> 0.30 0.00 1.00 96.30 2.40
>
> Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
> sda 87.80 6159.20 340.00 61592 3400

These I/O numbers are not so horrible, really. 100% iowait is not necessarily a symptom of misconfiguration. It just means you are disk limited. With a database 20 times larger than main memory, this is no surprise.

If I had to speculate about the best way to improve your performance, I would say:

1a) Get a better RAID controller. The 3ware hardware RAID 5 is very bad.
1b) Get more disks.
2) Get a (much) newer kernel.
3) Try XFS or JFS. Reiser3 has never looked good in my pgbench runs.

By the way, are you experiencing bad application performance, or are you just unhappy with the iostat figures?

Regards,
jwb
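A rough sketch of how suggestion 3 could be trialled on a spare partition before committing the production volume to it; the device, mount point, port and paths are all assumptions:

    # Build an XFS filesystem on a spare partition and mount it
    # without atime updates.
    mkfs.xfs -f /dev/sdb1
    mkdir -p /mnt/xfs_test
    mount -o noatime /dev/sdb1 /mnt/xfs_test

    # Initialise a throwaway cluster there on a secondary port and run
    # the same pgbench workload, comparing tps against the ReiserFS volume.
    initdb -D /mnt/xfs_test/pgdata
    pg_ctl -D /mnt/xfs_test/pgdata -o "-p 5433" -l /mnt/xfs_test/pg.log start
    createdb -p 5433 pgbench_xfs
    pgbench -p 5433 -i -s 10 pgbench_xfs
    pgbench -p 5433 -c 10 -t 1000 pgbench_xfs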
At Fri, 15 Jul 2005 14:53:26 -0700, Jeffrey W. Baker wrote:
> If I had to speculate about the best way to improve your performance, I
> would say:
>
> 1a) Get a better RAID controller. The 3ware hardware RAID 5 is very bad.
> 1b) Get more disks.
> 2) Get a (much) newer kernel.
> 3) Try XFS or JFS. Reiser3 has never looked good in my pgbench runs.

Not good news :(. I can't change the hardware; hopefully a kernel update and XFS or JFS will make an improvement. I was hoping for software RAID (it has always worked well for me), but the client didn't feel comfortable with it :P.

> By the way, are you experiencing bad application performance, or are you
> just unhappy with the iostat figures?

It's affecting the whole system. It sends the load averages through the roof (from 4 to 12), and processes that would normally take only a few minutes start going over an hour, until it clears up. Well, I guess I'll have to drum up some more programming magic... and I'm starting to run out of tricks. I love my job some days :$
At Fri, 15 Jul 2005 14:39:36 -0600, Ron Wills wrote:
> Are there any settings I should be looking at that could improve on this?

I just wanted to thank everyone for their help. I believe we found a solution that will help with this problem, working with the hardware configuration we have and caching the larger tables into smaller data sets. A valuable lesson learned from this ;)
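The "caching the larger tables into smaller data sets" approach can be as simple as maintaining pre-aggregated summary tables that the frequent selects read instead of the big tables; a hypothetical sketch, with the database, table and column names invented purely for illustration:

    # One-time setup: a small aggregated copy of the big table, plus an
    # index for the lookups the selects actually do.
    psql -d mydb -c "CREATE TABLE photo_counts_cache AS
        SELECT customer_id, count(*) AS photos
        FROM photos GROUP BY customer_id;"
    psql -d mydb -c "CREATE INDEX photo_counts_cache_idx
        ON photo_counts_cache (customer_id);"

    # After each bulk update of the big table, refresh the cache in one
    # transaction so readers never see it half-built.
    psql -d mydb -c "BEGIN;
        DELETE FROM photo_counts_cache;
        INSERT INTO photo_counts_cache
            SELECT customer_id, count(*) FROM photos GROUP BY customer_id;
        COMMIT;"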