Re: mosbench revisited

From	Robert Haas
Subject	Re: mosbench revisited
Date	August 4, 2011 02:00:05
Msg-id	CA+TgmobWi_tFQAFX13VryaW3ZoSxRxVQOebOPLb0SGNEeLZhuw@mail.gmail.com
In reply to	Re: mosbench revisited (Jim Nasby <jim@nasby.net>)
List	pgsql-hackers

On Wed, Aug 3, 2011 at 6:21 PM, Jim Nasby <jim@nasby.net> wrote:
> On Aug 3, 2011, at 1:21 PM, Robert Haas wrote:
>> 1. "We configure PostgreSQL to use a 2 Gbyte application-level cache
>> because PostgreSQL protects its free-list with a single lock and thus
>> scales poorly with smaller caches."  This is a complaint about
>> BufFreeList lock which, in fact, I've seen as a huge point of
>> contention on some workloads.  In fact, on read-only workloads, with
>> my lazy vxid lock patch applied, this is, I believe, the only
>> remaining unpartitioned LWLock that is ever taken in exclusive mode;
>> or at least the only one that's taken anywhere near often enough to
>> matter.  I think we're going to do something about this, although I
>> don't have a specific idea in mind at the moment.
>
> This has been discussed before:
> http://archives.postgresql.org/pgsql-hackers/2011-03/msg01406.php (which
> itself references 2 other threads).
>
> The basic idea is: have a background process that proactively moves
> buffers onto the free list so that backends should normally never have to
> run the clock sweep (which is rather expensive). The challenge there is
> figuring out how to get stuff onto the free list with minimal locking
> impact. I think one possible option would be to put the freelist under
> its own lock (IIRC we currently use it to protect the clock sweep as
> well). Of course, that still means the free list lock could be a point of
> contention, but presumably it's far faster to add or remove something
> from the list than it is to run the clock sweep.

Based on recent benchmarking, I'm going to say "no".  It doesn't seem
to matter how short you make the critical section: a single
program-wide mutex is a loser.  Furthermore, the "free list" is a
joke, because it's nearly always going to be completely empty.  We
could probably just rip that out and use the clock sweep and never
miss it, but I doubt it would improve performance much.

I think what we probably need to do is have multiple clock sweeps in
progress at the same time.  So, for example, if you have 8GB of
shared_buffers, you might have 8 mutexes, one for each GB.  When a
process wants a buffer, it locks one of the mutexes and sweeps through
that 1GB partition.  If it finds a buffer before returning to the
point at which it started the scan, it's done.  Otherwise, it releases
its mutex, grabs the next one, and continues on until it finds a free
buffer.
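
To make the partitioned scheme above concrete, here is a minimal sketch in
Python. It is an illustration under stated assumptions, not PostgreSQL code:
the names Buffer, PartitionedClockSweep and get_victim are hypothetical, a
threading.Lock stands in for each per-partition mutex, usage counts are
assumed to be capped at MAX_USAGE_COUNT, and the chosen victim is marked
pinned before the lock is released so that two callers cannot be handed the
same buffer.

```python
import threading
from dataclasses import dataclass

MAX_USAGE_COUNT = 5  # assumed cap on usage_count; it bounds the retry loop below


@dataclass
class Buffer:
    usage_count: int = 0
    pinned: bool = False


class PartitionedClockSweep:
    """Hypothetical buffer pool split into equal partitions, one mutex each."""

    def __init__(self, n_buffers: int, n_partitions: int) -> None:
        if n_partitions <= 0 or n_buffers % n_partitions != 0:
            raise ValueError("n_buffers must be a positive multiple of n_partitions")
        self.buffers = [Buffer() for _ in range(n_buffers)]
        self.part_size = n_buffers // n_partitions
        self.locks = [threading.Lock() for _ in range(n_partitions)]
        # One clock hand per partition, stored as an offset within the partition.
        self.hands = [0] * n_partitions

    def _sweep_partition(self, p: int):
        """One full revolution over partition p. Caller must hold self.locks[p]."""
        base = p * self.part_size
        for _ in range(self.part_size):
            idx = base + self.hands[p]
            self.hands[p] = (self.hands[p] + 1) % self.part_size
            buf = self.buffers[idx]
            if buf.pinned:
                continue  # in use, cannot be evicted
            if buf.usage_count == 0:
                buf.pinned = True  # claim the victim while still holding the lock
                return idx
            buf.usage_count -= 1  # give the buffer another chance
        return None  # back where the scan started, nothing found

    def get_victim(self, start_partition: int = 0) -> int:
        """Sweep one partition at a time, moving to the next one on failure."""
        n = len(self.locks)
        p = start_partition % n
        # With usage counts capped, MAX_USAGE_COUNT + 1 visits per partition
        # are enough to find any buffer that is not pinned.
        for _ in range((MAX_USAGE_COUNT + 1) * n):
            with self.locks[p]:
                idx = self._sweep_partition(p)
            if idx is not None:
                return idx
            p = (p + 1) % n  # previous mutex released; try the next partition
        raise RuntimeError("no evictable buffer found")
```

In this sketch each caller picks its own starting partition (for example,
derived from a backend id), so concurrent callers usually contend on
different mutexes. The cost is the one discussed next: each hand only sees
its own partition, so the victim chosen is not necessarily the one a single
global sweep would have picked.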

The catch with any modification in this area is that pretty much any
increase in parallelism is potentially going to reduce the quality of
buffer replacement to some degree.  So the trick will be to squeeze out
as much concurrency as possible while minimizing that degradation.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
