Re: Redesigning checkpoint_segments

Поиск

Список

Период

Сортировка

От	Greg Smith
Тема	Re: Redesigning checkpoint_segments
Дата	7 июня 2013 г. 02:43:37
Msg-id	51B148D3.9010208@2ndQuadrant.com обсуждение исходный текст
Ответ на	Re: Redesigning checkpoint_segments (Heikki Linnakangas <hlinnakangas@vmware.com>)
Ответы	Re: Redesigning checkpoint_segments
Список	pgsql-hackers

Дерево обсуждения

On 6/6/13 4:41 AM, Heikki Linnakangas wrote:

> I was thinking of letting the estimate
> decrease like a moving average, but react to any increases immediately.
> Same thing we do in bgwriter to track buffer allocations:

Combine what your submitted patch does and this idea, and you'll have
something I prototyped a few years ago. I took the logic and tested it
out in user space by parsing the output from log_checkpoints to see how
many segments were being used. That approach coughed out a value about
as good for checkpoint_segments as I picked by hand.

The main problem was it liked to over-tune the segments based on a small
bursts of activity, leaving a value higher than you might want to use
the rest of the time. The background writer didn't worry about this
very much because the cost of making a mistake for one 200ms cycle was
pretty low. Setting checkpoint_segments high is a more expensive issue. When I set these by hand, I'll aim more to
covera 99th percentile of

the maximum segments number rather than every worst case seen.

I don't think that improvement is worth spending very much effort on
though. The moving average approach is more than good enough in most
cases. I've wanted checkpoint_segments to go away in exactly this
fashion for a while.

The general complaint the last time I suggested a change in this area,
to make checkpoint_segments larger for the average user, was that some
people had seen workloads where that was counterproductive. Pretty sure
Kevin Grittner said he'd seen that happen. That's how I remember this
general idea dying the last time, and I still don't have enough data to
refute that doesn't happen.

As far as the UI, if it's a soft limit I'd suggest wal_size_target for
the name. What I would like to see is a single number here in memory
units that replaces both checkpoint_segments and wal_keep_segments. If
you're willing to use a large chunk of disk space to handle either one
of activity spikes or the class of replication issues wal_keep_segments
targets, I don't see why you'd want to ban using that space for the
other one too.

To put some perspective on how far we've been able to push this in the
field with minimal gripes, the repmgr tool requires wal_keep_segments be >=5000, which works out to 78GB. I still see
somepeople use 73GB SAS

drives in production servers for their WAL files, but that's the only
time I've seen that number become scary when deploying repmgr.
Meanwhile, the highest value for checkpoint_segments I've set based on
real activity levels was 1024, on a server where checkpoint_timeout is
15 minutes (and can be no shorter without checkpoint spikes). At no
point during that fairly difficult but of tuning work did
checkpoint_segments do anything but get in the way.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com

В списке pgsql-hackers по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Redesigning checkpoint_segments