Re: BufferAlloc: don't take two simultaneous locks

Поиск
Список
Период
Сортировка
От Robert Haas
Тема Re: BufferAlloc: don't take two simultaneous locks
Дата
Msg-id CA+Tgmoa=Wxkdn03dtw50etUwdqP1-dJVzb1wvwwvX8UDk4BNYQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: BufferAlloc: don't take two simultaneous locks  (Yura Sokolov <y.sokolov@postgrespro.ru>)
Ответы Re: BufferAlloc: don't take two simultaneous locks  (Yura Sokolov <y.sokolov@postgrespro.ru>)
Список pgsql-hackers
On Thu, Apr 21, 2022 at 6:58 PM Yura Sokolov <y.sokolov@postgrespro.ru> wrote:
> At the master state:
> - SharedBufHash is not declared as HASH_FIXED_SIZE
> - get_hash_entry falls back to element_alloc too fast (just if it doesn't
>   found free entry in current freelist partition).
> - get_hash_entry has races.
> - if there are small number of spare items (and NUM_BUFFER_PARTITIONS is
>   small number) and HASH_FIXED_SIZE is set, it becomes contended and
>   therefore slow.
>
> HASH_REUSE solves (for shared buffers) most of this issues. Free list
> became rare fallback, so HASH_FIXED_SIZE for SharedBufHash doesn't lead
> to performance hit. And with fair number of spare items, get_hash_entry
> will find free entry despite its races.

Hmm, I see. The idea of trying to arrange to reuse entries rather than
pushing them onto a freelist and immediately trying to take them off
again is an interesting one, and I kind of like it. But I can't
imagine that anyone would commit this patch the way you have it. It's
way too much action at a distance. If any ereport(ERROR,...) could
happen between the HASH_REUSE operation and the subsequent HASH_ENTER,
it would be disastrous, and those things are separated by multiple
levels of call stack across different modules, so mistakes would be
easy to make. If this could be made into something dynahash takes care
of internally without requiring extensive cooperation with the calling
code, I think it would very possibly be accepted.

One approach would be to have a hash_replace() call that takes two
const void * arguments, one to delete and one to insert. Then maybe
you propagate that idea upward and have, similarly, a BufTableReplace
operation that uses that, and then the bufmgr code calls
BufTableReplace instead of BufTableDelete. Maybe there are other
better ideas out there...

-- 
Robert Haas
EDB: http://www.enterprisedb.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Robert Haas
Дата:
Сообщение: Re: make MaxBackends available in _PG_init
Следующее
От: Tom Lane
Дата:
Сообщение: Re: libpq async duplicate error results