Re: Huge iowait during checkpoint finish

From: Greg Smith
Subject: Re: Huge iowait during checkpoint finish
Msg-id: 4B4B9F2F.4030504@2ndquadrant.com
In reply to: Re: Huge iowait during checkpoint finish  (Scott Marlowe <scott.marlowe@gmail.com>)
Responses: Re: Huge iowait during checkpoint finish  (Scott Marlowe <scott.marlowe@gmail.com>)
           Re: Huge iowait during checkpoint finish  (Craig Ringer <craig@postnewspapers.com.au>)
List: pgsql-general

Scott Marlowe wrote:
> On Mon, Jan 11, 2010 at 3:53 AM, Anton Belyaev <anton.belyaev@gmail.com> wrote:
>> Old RAID-1 has "hardware" LSI controller.
>> I still have access to old server.
> The old RAID card likely had a battery backed cache, which would make
> the fsyncs much faster, as long as you hadn't run out of cache.

To be super clear here:  it's possible to see a 100:1 performance drop going from a system with a battery-backed write cache to one that doesn't have one.  This is one of the three main weak spots of software RAID that still keep hardware RAID vendors in business:  it can't do anything to speed up the type of writes done at transaction commit and at checkpoint time.  (The others are that it's hard to set up transparent failover after failure in software RAID so that it always works at boot time, and that motherboard chipsets can easily lose their minds and take down the whole system when one drive goes bad.)

> If you can shoehorn one more drive, you could run RAID-10 and get much
> better performance.
And throwing drives at the problem may not help.  I've seen a system with a 48-disk software RAID-10 that only got 100 TPS when running a commit-heavy test, because it didn't have any way to cache writes usefully for that purpose.
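
A figure around 100 TPS is roughly what a rotation-bound estimate predicts.  The sketch below is only a back-of-envelope model: it assumes a single client, no write cache, and that each commit has to wait for about one platter rotation before its WAL write is on disk.  The drive speeds are illustrative assumptions; the thread does not say what drives were in that system, and `max_commits_per_second` is a hypothetical helper name.

```python
def max_commits_per_second(rpm):
    """Rough upper bound on single-client commits/sec with no write cache.

    Assumption: every commit waits for one full platter rotation, so the
    bound is simply rotations per second.
    """
    return rpm / 60.0


# Illustrative drive speeds only; not taken from the thread.
for rpm in (7200, 10000, 15000):
    print(f"{rpm} RPM: about {max_commits_per_second(rpm):.0f} commits/sec")
```

Under this model, adding more spindles does not raise the bound, since each commit still waits on a rotation of whichever drive holds the WAL.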

If the old system had a write caching card, and the new one doesn't, that's certainly your most likely suspect for the source of the slowdown.  As for testing that specifically, if you have the old system too you can look at the slides I've got for "Database Hardware Benchmarking" at http://www.westnet.com/~gsmith/content/postgresql/index.htm and use the sysbench example I show on P26 to measure commit fsync rate.  There's a video of that presentation where I explain a lot of the background in this area too.
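
If sysbench isn't handy, a minimal Python sketch along the same lines is shown here.  This is not the sysbench example from the slides; it is a rough stand-in that rewrites one small block and calls fsync after every write, then reports how many of those completed per second.  The function name `measure_fsync_rate`, the 8192-byte block size, and the 5-second default are assumptions chosen for illustration.

```python
import os
import tempfile
import time


def measure_fsync_rate(directory, duration_s=5.0, block_size=8192):
    """Return write+fsync cycles completed per second in `directory`.

    Rewrites the same block at offset 0 of a temporary file and calls
    fsync after each write, loosely mimicking a commit-style workload.
    """
    block = b"\0" * block_size
    fd, path = tempfile.mkstemp(dir=directory)
    count = 0
    try:
        start = time.perf_counter()
        while time.perf_counter() - start < duration_s:
            os.lseek(fd, 0, os.SEEK_SET)  # overwrite the same block each time
            os.write(fd, block)
            os.fsync(fd)                  # ask the OS to flush to stable storage
            count += 1
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)                   # clean up the temporary file
    return count / elapsed
```

Running this against a directory on the database volume of each server gives two numbers to compare.  A result near the drive's rotation rate suggests no write cache is helping; a result in the thousands suggests a caching controller, or a drive that is not really honoring the flush.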

-- 
Greg Smith    2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com  www.2ndQuadrant.com
