Re: Load Distributed Checkpoints test results

From Heikki Linnakangas
Subject Re: Load Distributed Checkpoints test results
Date
Msg-id 46796AB6.8060009@enterprisedb.com
In reply to Re: Load Distributed Checkpoints test results  (Heikki Linnakangas <heikki@enterprisedb.com>)
Responses Re: Load Distributed Checkpoints test results  (Greg Smith <gsmith@gregsmith.com>)
List pgsql-hackers
I've uploaded the latest test results to the results page at 
http://community.enterprisedb.com/ldc/

The test results on the index page are not in a completely logical 
order, sorry about that.

I ran a series of tests with 115 warehouses, and no surprises there. LDC 
smooths the checkpoints nicely.

Another series with 150 warehouses is more interesting. At that number of
warehouses, the data disks are 100% busy according to iostat. The 90th
percentile response times are somewhat higher with LDC, though the
variability in both the baseline and LDC test runs seems to be pretty
high. Looking at the response time graphs, even with LDC there are clear
checkpoint spikes, but they're much less severe than without.
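
For readers less familiar with the metric, here is a minimal sketch of how a
90th percentile response time can be computed, and how it can come out higher
for a smoothed run even though the worst spike is smaller. The nearest-rank
method, the function name and the sample numbers are all assumptions made up
for illustration; they are not taken from the test runs or the benchmark kit.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Hypothetical response times in seconds (invented, not measured data).
# "baseline": mostly fast, with one severe checkpoint spike.
baseline = [0.2, 0.3, 0.2, 0.3, 0.2, 0.4, 0.3, 0.2, 0.3, 6.0]
# "smoothed": slightly slower overall, but the spike is much smaller.
smoothed = [0.4, 0.5, 0.4, 0.5, 0.4, 0.6, 0.5, 0.5, 0.9, 1.0]

for name, run in (("baseline", baseline), ("smoothed", smoothed)):
    print(name, "p90 =", percentile(run, 90), "max =", max(run))
```

With these invented numbers the smoothed run has the higher 90th percentile
but a far lower maximum, which is the kind of pattern described above.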

Another series was with 90 warehouses, but without think times, driving
the system to full load. LDC seems to smooth the checkpoints very nicely
in these tests.

Heikki Linnakangas wrote:
> Gregory Stark wrote:
>> "Heikki Linnakangas" <heikki@enterprisedb.com> writes:
>>> Now that the checkpoints are spread out more, the response times are
>>> very smooth.
>>
>> So obviously the reason the results are so dramatic is that the
>> checkpoints used to push the i/o bandwidth demand up over 100%. By
>> spreading it out you can see in the io charts that even during the
>> checkpoint the i/o busy rate stays just under 100% except for a few
>> data points.
>>
>> If I understand it right Greg Smith's concern is that in a busier
>> system where even *with* the load distributed checkpoint the i/o
>> bandwidth demand during the checkpoint was *still* being pushed over
>> 100% then spreading out the load would only exacerbate the problem by
>> extending the outage.
>>
>> To that end it seems like what would be useful is a pair of tests with
>> and without the patch with about 10% larger warehouse size (~ 115)
>> which would push the i/o bandwidth demand up to about that level.
> 
> I still don't see how spreading the writes could make things worse, but 
> running more tests is easy. I'll schedule tests with more warehouses 
> over the weekend.
> 
>> It might even make sense to run a test with an outright overloaded to
>> see if the patch doesn't exacerbate the condition. Something with a
>> warehouse size of maybe 150. I would expect it to fail the TPCC
>> constraints either way but what would be interesting to know is
>> whether it fails by a larger margin with the LDC behaviour or a
>> smaller margin.
> 
> I'll do that as well, though experiences with tests like that in the 
> past have been that it's hard to get repeatable results that way.



--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

