Re: Controlling Load Distributed Checkpoints

From	Gregory Stark
Subject	Re: Controlling Load Distributed Checkpoints
Date	June 7, 2007, 19:28:37
Msg-id	87y7ivef8k.fsf@oxford.xeocode.com
In reply to	Re: Controlling Load Distributed Checkpoints (Greg Smith <gsmith@gregsmith.com>)
Responses	Re: Controlling Load Distributed Checkpoints (Greg Smith <gsmith@gregsmith.com>) Re: .conf File Organization WAS: Controlling Load Distributed Checkpoints (Josh Berkus <josh@agliodbs.com>)
List	pgsql-hackers

"Greg Smith" <gsmith@gregsmith.com> writes:

> I'm completely biased because of the workloads I've been dealing with recently,
> but I consider (2) so much easier to tune for that it's barely worth worrying
> about.  If your system is so underloaded that you can let the checkpoints take
> their own sweet time, I'd ask if you have enough going on that you're suffering
> very much from checkpoint performance issues anyway.  I'm used to being in a
> situation where if you don't push out checkpoint data as fast as physically
> possible, you end up fighting with the client backends for write bandwidth once
> the LRU point moves past where the checkpoint has written out to already.  I'm
> not sure how much always running the LRU background writer will improve that
> situation.

I think you're working from a faulty premise.

There's no relationship between the volume of writes and how important the
speed of the checkpoint is. In either scenario you should assume a system that
is close to the max i/o bandwidth. The only question is which task the admin
would prefer to take the hit for maxing out the bandwidth: the transactions or
the checkpoint.

You seem to have imagined that letting the checkpoint take longer will slow
down transactions. In fact that's precisely the effect we're trying to avoid.
Right now we're seeing tests where Postgres stops handling *any* transactions
for up to a minute. In virtually any real-world scenario that would simply be
unacceptable.

That one-minute outage is a direct consequence of trying to finish the
checkpoint as quickly as possible. If we spread it out then it might increase
the average i/o load if you sum it up over time, but then you just need a
faster i/o controller.
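
To put rough numbers on that trade-off, here is a minimal sketch in Python. It
is a toy model, not anything from the Postgres source: the function name
simulate_interval and every figure (100 MB/s of bandwidth, 80 MB/s of
transaction writes, a 6000 MB checkpoint every half hour) are assumptions
picked for illustration. It also ignores the extra writes that spreading can
cause when a page is dirtied again, so it shows only the stall, not the higher
summed load.

```python
def simulate_interval(bandwidth, txn_demand, checkpoint_mb, interval_s, spread_s):
    """Return the MB/s actually served to transactions for each second.

    Toy model: the checkpoint writes checkpoint_mb evenly over spread_s
    seconds (capped by the device bandwidth) and transactions get whatever
    bandwidth is left over, up to txn_demand.
    """
    remaining = float(checkpoint_mb)
    rate = checkpoint_mb / spread_s
    served = []
    for _ in range(interval_s):
        # Checkpoint writes for this second, never negative or above the cap.
        ckpt = max(0.0, min(rate, remaining, bandwidth))
        remaining -= ckpt
        served.append(min(txn_demand, bandwidth - ckpt))
    return served


# All numbers below are assumptions chosen for illustration.
BANDWIDTH = 100       # MB/s the i/o subsystem can sustain
TXN_DEMAND = 80       # MB/s of writes the transactions want
CHECKPOINT_MB = 6000  # dirty data to flush per checkpoint
INTERVAL_S = 1800     # one checkpoint every half hour

# Burst: write as fast as physically possible (60 s at full bandwidth).
burst = simulate_interval(BANDWIDTH, TXN_DEMAND, CHECKPOINT_MB, INTERVAL_S,
                          spread_s=CHECKPOINT_MB / BANDWIDTH)
# Spread: pace the same writes over half of the interval.
spread = simulate_interval(BANDWIDTH, TXN_DEMAND, CHECKPOINT_MB, INTERVAL_S,
                           spread_s=INTERVAL_S / 2)

burst_stall = sum(1 for s in burst if s == 0)
spread_stall = sum(1 for s in spread if s == 0)
print(burst_stall, spread_stall)  # 60 0
print(min(spread))                # 80
```

In this model the burst checkpoint starves the transactions for a full minute,
while the spread checkpoint takes under 7 MB/s and the transactions are served
at their full rate throughout.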

The only scenario where you would prefer the absolute lowest i/o rate summed
over time would be if you were close to maxing out your i/o bandwidth,
couldn't buy a faster controller, and response time was not a factor, so that
only the sheer volume of transactions processed mattered. That's a much less
common scenario than caring about the response time.

The flip side is that if you have to worry about response time, buying a
faster controller doesn't even help. It would shorten the duration of the
checkpoint but not eliminate the outage. A 30-second outage every half hour is
just as unacceptable as a 1-minute outage every half hour.
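
A bit of back-of-the-envelope arithmetic shows why a faster controller only
shortens the stall. This is a sketch under assumed numbers (a hypothetical
6000 MB checkpoint written flat out), and burst_outage_seconds is a made-up
helper name, not a Postgres function.

```python
def burst_outage_seconds(checkpoint_mb, bandwidth_mb_s):
    """Seconds during which a flat-out checkpoint uses all the bandwidth."""
    return checkpoint_mb / bandwidth_mb_s


CHECKPOINT_MB = 6000  # assumed checkpoint size

# Each doubling of controller speed halves the outage but never removes it.
outages = {bw: burst_outage_seconds(CHECKPOINT_MB, bw) for bw in (100, 200, 400)}
for bw, secs in outages.items():
    print(f"{bw} MB/s controller: {secs:.0f} s with no transactions served")
```

However fast the controller, the outage stays above zero for as long as the
checkpoint is allowed to take all of the bandwidth.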

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com
