Re: Scaling shared buffer eviction

From: Amit Kapila
Subject: Re: Scaling shared buffer eviction
Date:
Msg-id: CAA4eK1KVMCKPVKkQDcJAw07w1yum_NHggq4hWVT5dR7iwRzu5A@mail.gmail.com
In response to: Re: Scaling shared buffer eviction (Gregory Smith <gregsmithpgsql@gmail.com>)
List: pgsql-hackers
On Mon, Sep 22, 2014 at 10:43 AM, Gregory Smith <gregsmithpgsql@gmail.com> wrote:
> On 9/16/14, 8:18 AM, Amit Kapila wrote:
>> I think the main reason for the slight difference is that when the size
>> of shared buffers is almost the same as the data size, the number of
>> buffers it needs from the clock sweep is very small.  As an example, in
>> the first case (when the size of shared buffers is 12286MB), it actually
>> needs at most 256 additional buffers (2MB) via the clock sweep, whereas
>> bgreclaimer will put 2000 additional buffers (the high water mark, since
>> 0.5% of shared buffers is greater than 2000) on the freelist, so
>> bgreclaimer does some extra work when it is not required.
>
> This is exactly what I was warning about, as the sort of lesson learned
> from the last round of such tuning.  There are going to be spots where
> trying to tune the code to be aggressive on the hard cases will work
> great.  But you need to make that dynamic to some degree, such that the
> code doesn't waste a lot of time sweeping buffers when the demand for
> them is actually weak.  That will make all sorts of cases that look like
> this slower.

To verify whether the above can lead to any kind of regression, I have
checked the cases where we need only a few extra buffers (the workload is
0.05 or 0.1 percent larger than shared buffers) and bgreclaimer might put
some additional buffers on the freelist, and it turns out that in those
cases as well there is a win, especially at high concurrency; the results
are posted upthread.
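
To make the sizing rule concrete, the high water mark discussed above
amounts to roughly the following (a minimal sketch; the function and
constant names are illustrative, not the patch's actual identifiers):

    /* High water mark: 0.5% of shared buffers, capped at 2000 buffers.
     * Names here are hypothetical, for illustration only. */
    #define FREELIST_HIGH_WATER_CAP 2000

    static int
    freelist_high_water_mark(int num_shared_buffers)
    {
        int target = num_shared_buffers / 200;  /* 0.5% of shared buffers */

        return (target > FREELIST_HIGH_WATER_CAP) ? FREELIST_HIGH_WATER_CAP
                                                  : target;
    }

With 12286MB of shared buffers (about 1.5 million 8kB pages), 0.5% is well
above 2000, so the cap applies and bgreclaimer targets 2000 free buffers.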
 
> We should be able to tell these apart if there's enough instrumentation
> and solid logic inside of the program itself, though.  The 8.3-era BGW
> coped with a lot of these issues using a particular style of moving
> average with a fast reaction time, plus instrumenting the buffer
> allocation rate as accurately as it could.  So before getting into
> high/low watermark questions, are you comfortable that there's a clear,
> accurate number that measures the activity level that's important here?


Very good question.  This is exactly what was missing in my initial
versions (about two years back, when I first tried to solve this problem),
but based on Robert's and Andres's feedback I realized that we need an
accurate number to measure the activity level (in this case the
consumption of buffers from the freelist), so I have introduced logic to
calculate it (stored in the new variable numFreeListBuffers in the
BufferStrategyControl structure).
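
In outline, the counter sits alongside the existing freelist and
clock-sweep state, something like the following (a sketch only; apart
from numFreeListBuffers the field names are meant to mirror the upstream
BufferStrategyControl, and slock_t stands in for PostgreSQL's spinlock
type):

    #include <stdint.h>

    typedef int slock_t;            /* placeholder for PostgreSQL's spinlock */

    typedef struct BufferStrategyControl
    {
        slock_t  buffer_strategy_lock;  /* protects the fields below */
        int      nextVictimBuffer;      /* clock-sweep hand */
        int      firstFreeBuffer;       /* head of the freelist, -1 if empty */
        int      lastFreeBuffer;        /* tail of the freelist */
        uint32_t numFreeListBuffers;    /* buffers currently on the freelist:
                                         * the activity measure discussed here */
    } BufferStrategyControl;

Backends decrement numFreeListBuffers under the lock when they take a
buffer from the freelist, and bgreclaimer increments it as it adds
buffers, so either side can read the current freelist depth cheaply.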

 
> And have you considered ways it might be averaging over time, or have a
> history that's analyzed?


The current logic of bgreclaimer is such that even if it does some extra
activity in one cycle (and the extra is tightly bounded), it will not
start another cycle until backends have consumed the buffers it made
available in the previous one.  I think the algorithm designed for
bgreclaimer therefore averages out automatically based on activity.  Do
you see any cases where it will not do so?
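
As a rough sketch of that cycle (all names and helpers here are
hypothetical, and the latch and locking details of the real worker are
omitted):

    /* Hypothetical helpers, standing in for the real freelist and
     * clock-sweep machinery. */
    extern int  freelist_length(void);
    extern void move_clock_sweep_victim_to_freelist(void);
    extern void sleep_until_woken_by_backend(void);

    static void
    bgreclaimer_cycle(int low_water_mark, int high_water_mark)
    {
        for (;;)
        {
            /* One cycle: refill the freelist up to the high watermark. */
            while (freelist_length() < high_water_mark)
                move_clock_sweep_victim_to_freelist();

            /* Do not start another cycle until backends have drained the
             * buffers made available above, down to the low watermark. */
            while (freelist_length() > low_water_mark)
                sleep_until_woken_by_backend();
        }
    }

Because the refill work is proportional to what backends actually
consumed since the last cycle, a burst of demand triggers a burst of
reclaiming, while an idle system leaves bgreclaimer sleeping.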
 
> The exact fast-approach / slow-decay weighted moving average of the 8.3
> BGW, the thing that tried to smooth the erratic data set possible here,
> was a pretty critical part of getting it to auto-tune to the workload
> size.  It ended up being much more important than the work of setting
> the arbitrary watermark levels.


Agreed, but bgreclaimer works quite differently from the bgwriter, and
that's why it needs a different kind of logic to handle auto-tuning.
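
For reference, the smoothing Greg describes works along these lines (a
sketch modeled on the bgwriter's allocation-rate estimate; the variable
names are illustrative):

    #define SMOOTHING_SAMPLES 16

    static float smoothed_alloc = 0;    /* smoothed allocations per cycle */

    static void
    update_allocation_estimate(int recent_alloc)
    {
        /* Fast rise: jump straight up when demand spikes ... */
        if ((float) recent_alloc >= smoothed_alloc)
            smoothed_alloc = (float) recent_alloc;
        /* ... slow decay: drift down gradually when demand falls. */
        else
            smoothed_alloc += ((float) recent_alloc - smoothed_alloc) /
                              SMOOTHING_SAMPLES;
    }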

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
