Re: BufFreelistLock

From: Jim Nasby
Subject: Re: BufFreelistLock
Date:
Msg-id: DC555169-6758-4996-B51C-E9B3845385BC@nasby.net
In response to: Re: BufFreelistLock  (Jeff Janes <jeff.janes@gmail.com>)
Responses: Re: BufFreelistLock  (Jeff Janes <jeff.janes@gmail.com>)
List: pgsql-hackers
On Dec 14, 2010, at 11:08 AM, Jeff Janes wrote:

> On Sun, Dec 12, 2010 at 6:48 PM, Jim Nasby <jim@nasby.net> wrote:
>>
>> BTW, when we moved from 96G to 192G servers I tried increasing shared buffers from 8G to 28G and performance went
>> down enough to be noticeable (we don't have any good benchmarks, so I can't really quantify the degradation). Going
>> back to 8G brought performance back up, so it seems like it was the change in shared buffers that caused the issue
>> (the larger servers also have 24 cores vs 16).
>
> What kind of work load do you have (intensity of reading versus
> writing)?  How intensely concurrent is the access?

It writes at the rate of ~3-5MB/s, doing ~700TPS on average. It's hard to judge the exact read mix, because it's
running on a 192G server (actually, 512G now, but 192G when I tested). The working set is definitely between 96G and
192G; we saw a major performance improvement last year when we went to 192G, but we haven't seen any improvement moving
to 512G.

We typically have 10-20 active queries at any point.

>> My immediate thought was that we needed more lock partitions, but I haven't had the chance to see if that helps.
>> ISTM the issue could just as well be due to clock sweep suddenly taking over 3x longer than before.
>
> It would surprise me if most clock sweeps need to make anything near a
> full pass over the buffers for each allocation (but technically it
> wouldn't need to do that to take 3x longer.  It could be that the
> fraction of a pass it needs to make is merely proportional to
> shared_buffers.  That too would surprise me, though).  You could
> compare the number of passes with the number of allocations to see how
> much sweeping is done per allocation.  However, I don't think the
> number of passes is reported anywhere, unless you compile with
> #define BGW_DEBUG and run with debug2.
>
> I wouldn't expect an increase in shared_buffers to make contention on
> BufFreelistLock worse.  If the increased buffers are used to hold
> heavily-accessed data, then you will find the pages you want in
> shared_buffers more often, and so need to run the clock-sweep less
> often.  That should make up for longer sweeps.  But if the increased
> buffers are used to hold data that is just read once and thrown away,
> then the clock sweep shouldn't need to sweep very far before finding a
> candidate.
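
For what it's worth on the passes-vs-allocations point: while the number of sweep passes isn't reported anywhere, the
number of allocations is exposed as buffers_alloc in pg_stat_bgwriter. A rough sketch (nothing more than that) would be
to sample it twice over a known interval and diff:

    -- run once, wait N seconds, run again; (difference in buffers_alloc) / N = allocations per second
    SELECT now() AS sampled_at, buffers_alloc FROM pg_stat_bgwriter;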

Well, we're talking about a working set that's between 96 and 192G, but only 8G (or 28G) of shared buffers. So there's
going to be a pretty large amount of buffer replacement happening. We also have 210 tables where the ratio of heap
buffer hits to heap reads is over 1000, so the stuff that is in shared buffers probably keeps usage_count quite high.
Put these two together, and we're probably spending a fairly significant amount of time running the clock sweep.
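
(For reference, that ratio can be pulled from pg_statio_user_tables with something along these lines; just a sketch,
and it skips tables that have never read a heap block from disk:

    SELECT schemaname, relname,
           heap_blks_hit::float8 / heap_blks_read AS hit_to_read_ratio
      FROM pg_statio_user_tables
     WHERE heap_blks_read > 0
       AND heap_blks_hit::float8 / heap_blks_read > 1000
     ORDER BY hit_to_read_ratio DESC;
)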

Even excluding our admittedly unusual workload, there is still significant overhead in running the clock sweep vs just
grabbing something off of the free list (assuming we had separate locks for the two operations). Does anyone know what
the overhead of getting a block from the filesystem cache is? I wonder how many buffers you can move through in the
same amount of time. Put another way, at some point you have to check enough buffers to find a free one that you've
doubled the amount of time it takes to get data from the filesystem cache into a shared buffer.

> But of course being able to test would be better than speculation.

Yeah, I'm working on getting pg_buffercache installed so we can see what's actually in the cache.
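
Once that's installed, something along these lines against the standard pg_buffercache view should show how the usage
counts are distributed (again, just a sketch):

    SELECT usagecount, count(*) AS buffers
      FROM pg_buffercache
     GROUP BY usagecount
     ORDER BY usagecount;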

Hmm... I wonder how hard it would be to hack something up that has a separate process that does nothing but run the
clock sweep. We'd obviously not run a hack in production, but we're working on being able to reproduce a production
workload. If we had a separate clock-sweep process we could get an idea of exactly how much work was involved in
keeping free buffers available.

BTW, given our workload I can't see any way of running at debug2 without having a large impact on performance.
--
Jim C. Nasby, Database Architect                   jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net



