Обсуждение: Write performance

От:
Janning
Дата:

Hi,

at the moment we encounter some performance problems with our database server.

We have a 12 GB RAM machine with intel i7-975 and using
3 disks "Seagate Barracuda 7200.11, ST31500341AS (1.5 GB)"
One disk for the system and WAL etc. and one SW RAID-0 with two disks for
postgresql data. Our database is about 24GB.

Our munin graph reports at 9:00 a clock writes of 3000 blocks per second and
reads of about 1000 blocks per second on our disk which holds the data
directories of postgresql (WAL are on a different disk)

3000 blocks ~ about 3 MB/s write
1000 blocks ~ about 1 MB/s read

At the same time we have nearly 50% CPU I/O wait and only 12% user CPU load
(so 4 of 8 cpu cores are in use for io wait)

We know, its a poor man disk setup (but we can not find a hoster with rather
advanced disk configuration at an affordable price). Anyway, we ran some tests
on it:


# time sh -c "dd if=/dev/zero of=bigfile bs=8k count=3000000 && sync"
3000000+0 records in
3000000+0 records out
24576000000 bytes (25 GB) copied, 276.03 s, 89.0 MB/s

real    4m48.658s
user    0m0.580s
sys    0m51.579s

# time dd if=bigfile of=/dev/null bs=8k
3000000+0 records in
3000000+0 records out
24576000000 bytes (25 GB) copied, 222.841 s, 110 MB/s

real    3m42.879s
user    0m0.468s
sys    0m18.721s



Of course, writing large chunks is quite a different usage pattern. But I am
wondering that writing 3MB/s and reading 1 MB/s seams to be a limit if i can
run a test with 89 MB/s writing and 110MB/s reading.

Can you give some hints, if this numbers seems to be reasonable?

kind regards
Janning





От:
Kenneth Marshall
Дата:

On Thu, Jun 24, 2010 at 02:43:33PM +0200, Janning wrote:
> Hi,
>
> at the moment we encounter some performance problems with our database server.
>
> We have a 12 GB RAM machine with intel i7-975 and using
> 3 disks "Seagate Barracuda 7200.11, ST31500341AS (1.5 GB)"
> One disk for the system and WAL etc. and one SW RAID-0 with two disks for
> postgresql data. Our database is about 24GB.
>
> Our munin graph reports at 9:00 a clock writes of 3000 blocks per second and
> reads of about 1000 blocks per second on our disk which holds the data
> directories of postgresql (WAL are on a different disk)
>
> 3000 blocks ~ about 3 MB/s write
> 1000 blocks ~ about 1 MB/s read
>
> At the same time we have nearly 50% CPU I/O wait and only 12% user CPU load
> (so 4 of 8 cpu cores are in use for io wait)
>
> We know, its a poor man disk setup (but we can not find a hoster with rather
> advanced disk configuration at an affordable price). Anyway, we ran some tests
> on it:
>
>
> # time sh -c "dd if=/dev/zero of=bigfile bs=8k count=3000000 && sync"
> 3000000+0 records in
> 3000000+0 records out
> 24576000000 bytes (25 GB) copied, 276.03 s, 89.0 MB/s
>
> real    4m48.658s
> user    0m0.580s
> sys    0m51.579s
>
> # time dd if=bigfile of=/dev/null bs=8k
> 3000000+0 records in
> 3000000+0 records out
> 24576000000 bytes (25 GB) copied, 222.841 s, 110 MB/s
>
> real    3m42.879s
> user    0m0.468s
> sys    0m18.721s
>
>
>
> Of course, writing large chunks is quite a different usage pattern. But I am
> wondering that writing 3MB/s and reading 1 MB/s seams to be a limit if i can
> run a test with 89 MB/s writing and 110MB/s reading.
>
> Can you give some hints, if this numbers seems to be reasonable?
>
> kind regards
> Janning
>

Yes, these are typical random I/O versus sequential I/O rates for
hard drives. Your I/O is extremely under-powered relative to your
CPU/memory. For DB servers, many times you need much more I/O
instead.

Cheers,
Ken

От:
Matthew Wakeling
Дата:

On Thu, 24 Jun 2010, Janning wrote:
> We have a 12 GB RAM machine with intel i7-975 and using
> 3 disks "Seagate Barracuda 7200.11, ST31500341AS (1.5 GB)"

Those discs are 1.5TB, not 1.5GB.

> One disk for the system and WAL etc. and one SW RAID-0 with two disks for
> postgresql data. Our database is about 24GB.

Beware of RAID-0 - make sure you can recover the data when (not if) a disc
fails.

> Our munin graph reports at 9:00 a clock writes of 3000 blocks per second and
> reads of about 1000 blocks per second on our disk which holds the data
> directories of postgresql (WAL are on a different disk)
>
> 3000 blocks ~ about 3 MB/s write
> 1000 blocks ~ about 1 MB/s read
>
> At the same time we have nearly 50% CPU I/O wait and only 12% user CPU load
> (so 4 of 8 cpu cores are in use for io wait)

Not quite sure what situation you are measuring these figures under.
However, as a typical figure, let's say you are doing random access with
8kB blocks (as in Postgres), and the access time on your drive is 8.5ms
(as with these drives).

For each drive, you will be able to read/write approximately 8kB /
0.0085s, giving 941kB per second. If you have multiple processes all doing
random access, then you may be able to utilise both discs and get double
that.

> Of course, writing large chunks is quite a different usage pattern. But I am
> wondering that writing 3MB/s and reading 1 MB/s seams to be a limit if i can
> run a test with 89 MB/s writing and 110MB/s reading.

That's quite right, and typical performance figures for a drive like that.

Matthew

--
 Don't criticise a man until you have walked a mile in his shoes; and if
 you do at least he will be a mile behind you and bare footed.

От:
Janning
Дата:

On Thursday 24 June 2010 14:53:57 Matthew Wakeling wrote:
> On Thu, 24 Jun 2010, Janning wrote:
> > We have a 12 GB RAM machine with intel i7-975 and using
> > 3 disks "Seagate Barracuda 7200.11, ST31500341AS (1.5 GB)"
>
> Those discs are 1.5TB, not 1.5GB.

sorry, my fault.

> > One disk for the system and WAL etc. and one SW RAID-0 with two disks for
> > postgresql data. Our database is about 24GB.
>
> Beware of RAID-0 - make sure you can recover the data when (not if) a disc
> fails.

oh sorry again, its a raid-1 of course. shame on me.

> > Our munin graph reports at 9:00 a clock writes of 3000 blocks per second
> > and reads of about 1000 blocks per second on our disk which holds the
> > data directories of postgresql (WAL are on a different disk)
> >
> > 3000 blocks ~ about 3 MB/s write
> > 1000 blocks ~ about 1 MB/s read
> >
> > At the same time we have nearly 50% CPU I/O wait and only 12% user CPU
> > load (so 4 of 8 cpu cores are in use for io wait)
>
> Not quite sure what situation you are measuring these figures under.
> However, as a typical figure, let's say you are doing random access with
> 8kB blocks (as in Postgres), and the access time on your drive is 8.5ms
> (as with these drives).
>
> For each drive, you will be able to read/write approximately 8kB /
> 0.0085s, giving 941kB per second. If you have multiple processes all doing
> random access, then you may be able to utilise both discs and get double
> that.

So with your calculation I have a maximum of 2MB/s random access. So i really
need to upgrade my disk configuration!

> > Of course, writing large chunks is quite a different usage pattern. But I
> > am wondering that writing 3MB/s and reading 1 MB/s seams to be a limit if
> > i can run a test with 89 MB/s writing and 110MB/s reading.
>
> That's quite right, and typical performance figures for a drive like that.

thanks for your help.

kind regards
Janning

> Matthew
>
> --
>  Don't criticise a man until you have walked a mile in his shoes; and if
>  you do at least he will be a mile behind you and bare footed.


От:
Janning
Дата:

thanks for your quick response, kenneth

On Thursday 24 June 2010 14:47:34 you wrote:
> On Thu, Jun 24, 2010 at 02:43:33PM +0200, Janning wrote:
> > Hi,
> >
> > at the moment we encounter some performance problems with our database
> > server.
> >
> > We have a 12 GB RAM machine with intel i7-975 and using
> > 3 disks "Seagate Barracuda 7200.11, ST31500341AS (1.5 GB)"
> > One disk for the system and WAL etc. and one SW RAID-0 with two disks for
> > postgresql data. Our database is about 24GB.
[...]
> Your I/O is extremely under-powered relative to your
> CPU/memory. For DB servers, many times you need much more I/O
> instead.

So at the moment we are using this machine as our primary database server:
http://www.hetzner.de/en/hosting/produkte_rootserver/eq9/

Sadly, our hoster is not offering advanced disk setup. Now we have two options

1. buying a server on our own and renting a co-location.
I fear we do not know enough about hardware to vote for this option. I think
for co-locating your own server one should have more knowledge about hardware.

2. renting a server from a hoster with an advanced disk setup.
Can anybody recommend a good hosting solution in germany with a good disk
setup for postgresql?


kind regards
Janning


От:
Greg Smith
Дата:

As others have already pointed out, your disk performance here is
completely typical of a single pair of drives doing random read/write
activity.  So the question you should be asking is how to reduce the
amount of reading and writing needed to run your application.  The
suggestions at
http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server address
that.  Increases to shared_buffers and checkpoint_segments in particular
can dramatically reduce the amount of I/O needed to run an application.
On the last server I turned, random reads went from a constant stream of
1MB/s (with default value of shared_buffers at 32MB) to an average of
0.1MB/s just by adjusting those two parameters upwards via those guidelines.

If you haven't already made large increases to those values, I'd suggest
starting there before presuming you must get a different disk setup.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
   www.2ndQuadrant.us


От:
Janning Vygen
Дата:

On Thursday 24 June 2010 15:16:05 Janning wrote:
> On Thursday 24 June 2010 14:53:57 Matthew Wakeling wrote:
> > On Thu, 24 Jun 2010, Janning wrote:
> > > We have a 12 GB RAM machine with intel i7-975 and using
> > > 3 disks "Seagate Barracuda 7200.11, ST31500341AS (1.5 TB)"
> > >
> > For each drive, you will be able to read/write approximately 8kB /
> > 0.0085s, giving 941kB per second. If you have multiple processes all
> > doing random access, then you may be able to utilise both discs and get
> > double that.
>
> So with your calculation I have a maximum of 2MB/s random access. So i
> really need to upgrade my disk configuration!

i was looking at tomshardware.com and the fastest disk is

   Maxtor Atlas 15K II * 8K147S0,SAS,147 GB, 16 MB Cache,15000 rpm

with 5.5 ms random access time.

So even if i switch to those disks i can only reach a perfomace gain of 1.5,
right?

To achieve a better disk performance by factor of ten, i need a raid-10 setup
with 12 disks (so i have 6 raid-1 bundles). Or are there other factors with
high end disks?

kind regards
Janning


От:
Jesper Krogh
Дата:

On 2010-06-24 15:45, Janning Vygen wrote:
> On Thursday 24 June 2010 15:16:05 Janning wrote:
>
>> On Thursday 24 June 2010 14:53:57 Matthew Wakeling wrote:
>>
>>> On Thu, 24 Jun 2010, Janning wrote:
>>>
>>>> We have a 12 GB RAM machine with intel i7-975 and using
>>>> 3 disks "Seagate Barracuda 7200.11, ST31500341AS (1.5 TB)"
>>>>
>>>>
>>> For each drive, you will be able to read/write approximately 8kB /
>>> 0.0085s, giving 941kB per second. If you have multiple processes all
>>> doing random access, then you may be able to utilise both discs and get
>>> double that.
>>>
>> So with your calculation I have a maximum of 2MB/s random access. So i
>> really need to upgrade my disk configuration!
>>
> i was looking at tomshardware.com and the fastest disk is
>
>     Maxtor Atlas 15K II * 8K147S0,SAS,147 GB, 16 MB Cache,15000 rpm
>
> with 5.5 ms random access time.
>
> So even if i switch to those disks i can only reach a perfomace gain of 1.5,
> right?
>
> To achieve a better disk performance by factor of ten, i need a raid-10 setup
> with 12 disks (so i have 6 raid-1 bundles). Or are there other factors with
> high end disks?
>

Well. On the write-side, you can add in a Raid controller with Battery
backed
write cache to not make the writes directly hit disk. This improves
the amount of writing you can do.

On the read-side you can add more memory to your server so a significant
part of your most active dataset is cached in memory.

It depends on the actual sizes and workload what gives the most benefit
for you.

--
Jesper

От:
Scott Carey
Дата:

On Jun 24, 2010, at 6:16 AM, Janning wrote:

> On Thursday 24 June 2010 14:53:57 Matthew Wakeling wrote:
>> On Thu, 24 Jun 2010, Janning wrote:
>>> We have a 12 GB RAM machine with intel i7-975 and using
>>> 3 disks "Seagate Barracuda 7200.11, ST31500341AS (1.5 GB)"
>>
>> Those discs are 1.5TB, not 1.5GB.
>
> sorry, my fault.
>
>>> One disk for the system and WAL etc. and one SW RAID-0 with two disks for
>>> postgresql data. Our database is about 24GB.
>>
>> Beware of RAID-0 - make sure you can recover the data when (not if) a disc
>> fails.
>
> oh sorry again, its a raid-1 of course. shame on me.

If your WAL is not on RAID but your data is, you will lose data if the WAL log drive dies.  You will then have a
difficulttime recovering data from the data drives even though they are RAID protected.  Most likely indexes and some
datawill be corrupted since the last checkpoint.   I have lost a WAL before, and the result was a lot of corrupted
systemindexes that had to be rebuilt in single user mode, and one system table (stats related) that had to be purged
andregenerated from scratch.  This was not fun.  Most of the data was fine, but the cleanup is messy if you lose WAL,
andthere is no guarantee that your data is safe if you don't have the WAL available.