pgsql: Allow Pin/UnpinBuffer to operate in a lockfree manner.

From: Andres Freund
Subject: pgsql: Allow Pin/UnpinBuffer to operate in a lockfree manner.
Date:
Msg-id: E1apSHT-0007xJ-Bh@gemulon.postgresql.org
Replies: Re: pgsql: Allow Pin/UnpinBuffer to operate in a lockfree manner.  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-committers
Allow Pin/UnpinBuffer to operate in a lockfree manner.

Pinning/unpinning a buffer is a very frequent operation, especially in
read-mostly, cache-resident workloads. Benchmarking shows that in various
scenarios the spinlock protecting a buffer header's state becomes a
significant bottleneck. The problem can be reproduced with pgbench -S on
larger machines, but can be considerably worse for queries which touch
the same buffers over and over at a high frequency (e.g. nested loops
over a small inner table).

To allow atomic operations to be used, cram BufferDesc's flags,
usage_count, buf_hdr_lock, and refcount into a single 32bit atomic variable;
that allows them to be manipulated together using 32bit compare-and-swap
operations. This requires reducing MAX_BACKENDS to 2^18-1 (a limit which
could be lifted by using a 64bit field, but that's not a realistic
configuration at the moment).
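
As a rough illustration of the packing described above, here is a minimal
sketch using C11 atomics rather than PostgreSQL's own atomics API. The bit
layout (18 bits of refcount, 4 bits of usage count, 10 bits of flags), the
usage-count cap, and every sk_* name are assumptions made for this example,
not taken from the patch. The sketch also ignores the header-lock flag bit,
so it only shows the compare-and-swap idea in isolation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Assumed layout: low 18 bits refcount, next 4 bits usage count, top 10 flags. */
#define SK_REFCOUNT_BITS    18
#define SK_REFCOUNT_ONE     1u
#define SK_REFCOUNT_MASK    ((1u << SK_REFCOUNT_BITS) - 1)
#define SK_USAGECOUNT_ONE   (1u << SK_REFCOUNT_BITS)
#define SK_USAGECOUNT_MASK  (0xFu << SK_REFCOUNT_BITS)
#define SK_FLAG_MASK        0xFFC00000u
#define SK_MAX_USAGE_COUNT  5u      /* hypothetical cap for this sketch */

/* The three fields must not overlap and must fill the 32-bit word. */
static_assert((SK_REFCOUNT_MASK & SK_USAGECOUNT_MASK) == 0, "fields overlap");
static_assert((SK_REFCOUNT_MASK | SK_USAGECOUNT_MASK | SK_FLAG_MASK)
              == 0xFFFFFFFFu, "fields must cover 32 bits");

typedef struct
{
    _Atomic uint32_t state;     /* flags | usage count | refcount */
} sk_buffer_desc;

static inline uint32_t
sk_refcount(uint32_t state)
{
    return state & SK_REFCOUNT_MASK;
}

static inline uint32_t
sk_usagecount(uint32_t state)
{
    return (state & SK_USAGECOUNT_MASK) >> SK_REFCOUNT_BITS;
}

/* Pin: bump refcount and (capped) usage count with a single CAS. */
static void
sk_pin(sk_buffer_desc *buf)
{
    uint32_t old_state = atomic_load(&buf->state);

    for (;;)
    {
        uint32_t desired = old_state + SK_REFCOUNT_ONE;

        if (sk_usagecount(old_state) < SK_MAX_USAGE_COUNT)
            desired += SK_USAGECOUNT_ONE;

        /* On failure, old_state is refreshed with the current value. */
        if (atomic_compare_exchange_weak(&buf->state, &old_state, desired))
            break;
    }
}

/* Unpin: only the refcount changes, so a plain atomic subtract suffices here. */
static void
sk_unpin(sk_buffer_desc *buf)
{
    atomic_fetch_sub(&buf->state, SK_REFCOUNT_ONE);
}
```

With an 18-bit refcount field, at most 2^18-1 simultaneous pins can be
represented, which is why the number of backends has to stay below that
bound in a layout like this one.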

As not all operations can easily be implemented in a lockfree manner,
implement the previous buf_hdr_lock via a flag bit in the atomic
variable. That way we can continue to lock the header in places where
it's needed, but can get away without acquiring it in the more frequent
hot-paths.  There are some additional operations which can be done without
the lock, but aren't in this patch; the most important places are
covered, though.
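
The lock-as-a-flag-bit idea can be sketched as follows, again with C11
atomics. The position of the lock bit and all hl_* names are hypothetical;
the busy-wait loops are left bare for brevity, whereas real code would back
off while waiting. The sketch assumes that every lock-free modifier waits
for the lock bit to clear before attempting its compare-and-swap, which is
what makes the plain store on unlock safe.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define HL_LOCKED (1u << 22)    /* hypothetical "header locked" flag bit */

/* Acquire the header lock; returns the state word with the lock bit set. */
static uint32_t
hl_lock_header(_Atomic uint32_t *state)
{
    for (;;)
    {
        uint32_t old_state = atomic_fetch_or(state, HL_LOCKED);

        if (!(old_state & HL_LOCKED))
            return old_state | HL_LOCKED;   /* we set the bit: lock is ours */

        /* Someone else holds it; wait until it looks free, then retry. */
        while (atomic_load(state) & HL_LOCKED)
            ;
    }
}

/* Release the lock, publishing whatever changes the holder made. */
static void
hl_unlock_header(_Atomic uint32_t *state, uint32_t new_state)
{
    atomic_store(state, new_state & ~HL_LOCKED);
}

/* Lock-free hot path: never takes the lock, but must not race with a holder. */
static void
hl_pin(_Atomic uint32_t *state)
{
    uint32_t old_state = atomic_load(state);

    for (;;)
    {
        if (old_state & HL_LOCKED)
        {
            old_state = atomic_load(state);     /* wait for the holder */
            continue;
        }
        /* Refcount lives in the low bits, so +1 pins the buffer. */
        if (atomic_compare_exchange_weak(state, &old_state, old_state + 1))
            return;
    }
}
```

The slow paths keep their familiar lock/modify/unlock shape, while the hot
path pays only for a load and one compare-and-swap when the header is not
locked.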

As bufmgr.c now essentially re-implements spinlocks, abstract the delay
logic from s_lock.c into something more generic. It already has two
users, and more are coming up; there's a follow-up patch for lwlock.c at
least.
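
To make the notion of reusable delay logic concrete, here is a small sketch
of a backoff helper that any spin loop could share. It is not the interface
added to s_lock.h; the bk_* names, the constants, and the simple doubling
policy are all assumptions chosen for illustration. The helper only decides
how long to wait and leaves the actual sleeping to the caller.

```c
#include <assert.h>

#define BK_SPINS_PER_DELAY  100         /* busy-wait this often before sleeping */
#define BK_MIN_DELAY_USEC   1000L
#define BK_MAX_DELAY_USEC   1000000L

typedef struct
{
    int     spins;              /* spins since the last sleep */
    int     delays;             /* number of sleeps requested so far */
    long    cur_delay_usec;     /* next sleep length, 0 = not started */
} bk_delay_state;

static void
bk_init(bk_delay_state *st)
{
    st->spins = 0;
    st->delays = 0;
    st->cur_delay_usec = 0;
}

/*
 * Call once per failed acquisition attempt. Returns 0 if the caller should
 * simply retry, or the number of microseconds it should sleep first.
 */
static long
bk_next_delay_usec(bk_delay_state *st)
{
    long    delay;

    if (++st->spins < BK_SPINS_PER_DELAY)
        return 0;               /* keep busy-waiting for now */

    if (st->cur_delay_usec == 0)
        st->cur_delay_usec = BK_MIN_DELAY_USEC;

    delay = st->cur_delay_usec;
    st->delays++;
    st->spins = 0;

    /* Assumed policy: double each time, wrap back to the minimum at the cap. */
    st->cur_delay_usec *= 2;
    if (st->cur_delay_usec > BK_MAX_DELAY_USEC)
        st->cur_delay_usec = BK_MIN_DELAY_USEC;

    return delay;
}
```

A caller would initialise one bk_delay_state per acquisition attempt, call
bk_next_delay_usec() each time the lock turns out to be held, and sleep for
the returned duration whenever it is nonzero.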

This patch is based on a proof-of-concept written by me, which Alexander
Korotkov made into a fully working patch; the committed version is again
revised by me.  Benchmarking and testing have, amongst others, been
provided by Dilip Kumar, Alexander Korotkov, and Robert Haas.

On a large x86 system, read-only pgbench with a high client count has
been observed to improve by a factor of 8.

Author: Alexander Korotkov and Andres Freund
Discussion: 2400449.GjM57CE0Yg@dinodell

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/48354581a49c30f5757c203415aa8412d85b0f70

Modified Files
--------------
contrib/pg_buffercache/pg_buffercache_pages.c |  15 +-
src/backend/storage/buffer/buf_init.c         |   7 +-
src/backend/storage/buffer/bufmgr.c           | 508 +++++++++++++++++---------
src/backend/storage/buffer/freelist.c         |  44 ++-
src/backend/storage/buffer/localbuf.c         |  64 ++--
src/backend/storage/lmgr/s_lock.c             | 206 ++++++-----
src/include/postmaster/postmaster.h           |  15 +-
src/include/storage/buf_internals.h           | 101 +++--
src/include/storage/s_lock.h                  |  18 +
src/tools/pgindent/typedefs.list              |   1 +
10 files changed, 622 insertions(+), 357 deletions(-)

