Re: Redesigning checkpoint_segments

Поиск
Список
Период
Сортировка
От Greg Smith
Тема Re: Redesigning checkpoint_segments
Дата
Msg-id 51B148D3.9010208@2ndQuadrant.com
обсуждение исходный текст
Ответ на Re: Redesigning checkpoint_segments  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Ответы Re: Redesigning checkpoint_segments  (Robert Haas <robertmhaas@gmail.com>)
Список pgsql-hackers
On 6/6/13 4:41 AM, Heikki Linnakangas wrote:

> I was thinking of letting the estimate
> decrease like a moving average, but react to any increases immediately.
> Same thing we do in bgwriter to track buffer allocations:

Combine what your submitted patch does and this idea, and you'll have 
something I prototyped a few years ago.  I took the logic and tested it 
out in user space by parsing the output from log_checkpoints to see how 
many segments were being used.  That approach coughed out a value about 
as good for checkpoint_segments as I picked by hand.

The main problem was it liked to over-tune the segments based on a small 
bursts of activity, leaving a value higher than you might want to use 
the rest of the time.  The background writer didn't worry about this 
very much because the cost of making a mistake for one 200ms cycle was 
pretty low.  Setting checkpoint_segments high is a more expensive issue.  When I set these by hand, I'll aim more to
covera 99th percentile of 
 
the maximum segments number rather than every worst case seen.

I don't think that improvement is worth spending very much effort on 
though.  The moving average approach is more than good enough in most 
cases.  I've wanted checkpoint_segments to go away in exactly this 
fashion for a while.

The general complaint the last time I suggested a change in this area, 
to make checkpoint_segments larger for the average user, was that some 
people had seen workloads where that was counterproductive.  Pretty sure 
Kevin Grittner said he'd seen that happen.  That's how I remember this 
general idea dying the last time, and I still don't have enough data to 
refute that doesn't happen.

As far as the UI, if it's a soft limit I'd suggest wal_size_target for 
the name.  What I would like to see is a single number here in memory 
units that replaces both checkpoint_segments and wal_keep_segments.  If 
you're willing to use a large chunk of disk space to handle either one 
of activity spikes or the class of replication issues wal_keep_segments 
targets, I don't see why you'd want to ban using that space for the 
other one too.

To put some perspective on how far we've been able to push this in the 
field with minimal gripes, the repmgr tool requires wal_keep_segments be >=5000, which works out to 78GB.  I still see
somepeople use 73GB SAS 
 
drives in production servers for their WAL files, but that's the only 
time I've seen that number become scary when deploying repmgr. 
Meanwhile, the highest value for checkpoint_segments I've set based on 
real activity levels was 1024, on a server where checkpoint_timeout is 
15 minutes (and can be no shorter without checkpoint spikes).  At no 
point during that fairly difficult but of tuning work did 
checkpoint_segments do anything but get in the way.

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Greg Smith
Дата:
Сообщение: Re: Redesigning checkpoint_segments
Следующее
От: Amit Kapila
Дата:
Сообщение: Re: Proposal for Allow postgresql.conf values to be changed via SQL [review]