Re: Scaling shared buffer eviction

From Robert Haas
Subject Re: Scaling shared buffer eviction
Msg-id CA+TgmoZ4=Sot9Cw81NfQ0MAbEBbsCKgnsFUPMWcJb5V11wftJQ@mail.gmail.com
In reply to Re: Scaling shared buffer eviction  (Gregory Smith <gregsmithpgsql@gmail.com>)
Responses Re: Scaling shared buffer eviction  (Gregory Smith <gregsmithpgsql@gmail.com>)
List pgsql-hackers
On Tue, Sep 23, 2014 at 6:02 PM, Gregory Smith <gregsmithpgsql@gmail.com> wrote:
> On 9/23/14, 10:31 AM, Robert Haas wrote:
>> I suggest we count these things:
>>
>> 1. The number of buffers the reclaimer has put back on the free list.
>> 2. The number of times a backend has run the clocksweep.
>> 3. The number of buffers past which the reclaimer has advanced the clock
>> sweep (i.e. the number of buffers it had to examine in order to reclaim the
>> number counted by #1).
>> 4. The number of buffers past which a backend has advanced the clocksweep
>> (i.e. the number of buffers it had to examine in order to allocate the
>> number of buffers counted by #2).
>> 5. The number of buffers allocated from the freelist which the backend did
>> not use because they'd been touched (what you're calling
>> buffers_touched_freelist).
>
> All sound reasonable.  To avoid wasting time here, I think it's only worth
> doing all of these as DEBUG level messages for now.  Then only go through
> the overhead of exposing the ones that actually seem relevant.  That's what
> I did for the 8.3 work, and most of that data at this level was barely
> relevant to anyone but me then or since. We don't want the system views to
> include so much irrelevant trivia that finding the important parts becomes
> overwhelming.

I think we expose far too little information in our system views.
Just to take one example, we expose no useful information about lwlock
acquire or release, but a lot of real-world performance problems are
caused by lwlock contention.  There are of course difficulties in
exposing huge numbers of counters, but we're not talking about many
here, so I'd lean toward exposing them in the final patch if they seem
at all useful.
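
To make the scale concrete, here is a minimal sketch of the five counters from the quoted list as a plain C struct, with two derived ratios. Everything in it is an assumption for illustration: the struct name, the field names, and the helper functions are hypothetical and do not come from any posted patch, and the fprintf call merely stands in for a DEBUG-level message.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical counter set; the names are illustrative only and do not
 * come from any posted patch.
 */
typedef struct BufferEvictionCounters
{
    uint64_t reclaimer_buffers_freed;    /* #1: buffers put back on the free list */
    uint64_t backend_clocksweep_runs;    /* #2: times a backend ran the clock sweep */
    uint64_t reclaimer_buffers_examined; /* #3: buffers the reclaimer swept past */
    uint64_t backend_buffers_examined;   /* #4: buffers backends swept past */
    uint64_t freelist_buffers_rejected;  /* #5: free-list buffers skipped as touched */
} BufferEvictionCounters;

/* Ratio helper that returns 0.0 instead of dividing by zero. */
double
safe_ratio(uint64_t numerator, uint64_t denominator)
{
    return denominator == 0 ? 0.0 : (double) numerator / (double) denominator;
}

/* Buffers the reclaimer examined per buffer it freed (#3 / #1). */
double
reclaimer_scan_cost(const BufferEvictionCounters *c)
{
    return safe_ratio(c->reclaimer_buffers_examined, c->reclaimer_buffers_freed);
}

/* Buffers backends examined per clock sweep run (#4 / #2). */
double
backend_scan_cost(const BufferEvictionCounters *c)
{
    return safe_ratio(c->backend_buffers_examined, c->backend_clocksweep_runs);
}

/* Stand-in for a DEBUG-level message: one summary line on stderr. */
void
log_eviction_counters(const BufferEvictionCounters *c)
{
    fprintf(stderr,
            "freed=%" PRIu64 " sweeps=%" PRIu64 " reclaimer_examined=%" PRIu64
            " backend_examined=%" PRIu64 " rejected=%" PRIu64 "\n",
            c->reclaimer_buffers_freed, c->backend_clocksweep_runs,
            c->reclaimer_buffers_examined, c->backend_buffers_examined,
            c->freelist_buffers_rejected);
}
```

A ratio near 1.0 from reclaimer_scan_cost would mean almost every buffer examined was reclaimable; a large value would mean the sweep is passing many buffers for each one it frees.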

> I'd like to see that level of instrumentation--just the debug level
> messages--used to quantify the benchmarks that people are running already,
> to prove they are testing what they think they are.  That would have caught
> the test error you already stumbled on for example.  Simple paranoia says
> there may be more issues like that hidden in here somewhere, and this set
> you've identified should find any more of them around.

Right.

> If all that matches up so the numbers for the new counters seem sane, I
> think that's enough to commit the first basic improvement here.  Then make a
> second pass over exposing the useful internals that let everyone quantify
> either the improvement or things that may need to be tunable.

Well, I posted a patch a bit ago that I think is the first basic
improvement - and none of these counters are relevant to that.  It
doesn't add a new background process or anything; it just does pretty
much the same thing we do now with less-painful locking.  There are no
water marks to worry about, or tunable thresholds, or anything; and
because it's so much simpler, it's far easier to reason about than the
full patch, which is why I feel quite confident pressing on toward a
commit.

Once that is in, I think we should revisit the idea of a bgreclaimer
process, and see how much more that improves things - if at all - on
top of what that basic patch already does.  For that we'll need these
counters, and maybe others.  But let's make that phase two.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


