Re: Wait free LW_SHARED acquisition - v0.9

From: Amit Kapila
Subject: Re: Wait free LW_SHARED acquisition - v0.9
Date:
Msg-id: CAA4eK1+tKvXdv5d50S4cmrn=SxuWpJyKmUuFYqszg0Yu=UNV_g@mail.gmail.com
In response to: Re: Wait free LW_SHARED acquisition - v0.9  (Andres Freund <andres@2ndquadrant.com>)
Responses: Re: Wait free LW_SHARED acquisition - v0.9
List: pgsql-hackers
On Wed, Oct 8, 2014 at 7:05 PM, Andres Freund <andres@2ndquadrant.com> wrote:
>
> Hi,
>
> Attached you can find the next version of my LW_SHARED patchset. Now
> that atomics are committed, it seems like a good idea to also add their
> raison d'être.
>
> Since the last public version I have:
> * Addressed lots of Amit's comments. Thanks!
> * Performed a fair amount of testing.
> * Rebased the code. The volatile removal made that not entirely
>   trivial...
> * Significantly cleaned up and simplified the code.
> * Updated comments and such
> * Fixed a minor bug (unpaired HOLD/RESUME_INTERRUPTS in a corner case)
>
> The feature currently consists of two patches:
> 1) Convert PGPROC->lwWaitLink into a dlist. The old code was frail and
>    verbose. This also does:
>     * changes the logic in LWLockRelease() to release all shared lockers
>       when waking up any. This can yield some significant performance
>       improvements - and the fairness isn't really much worse than
>       before,
>       as we always allowed new shared lockers to jump the queue.
>
>     * adds a memory pg_write_barrier() in the wakeup paths between
>       dequeuing and unsetting ->lwWaiting. That was always required on
>       weakly ordered machines, but f4077cda2 made it more urgent. I can
>       reproduce crashes without it.
> 2) Implement the wait free LW_SHARED algorithm.
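
To make the ordering requirement in (1) concrete, here is a condensed
sketch of the wakeup path (my paraphrase, not the verbatim patch code;
field names per PGPROC and the dlist API):

    /* Dequeue the waiter, then allow it to proceed. */
    dlist_delete(&waiter->lwWaitLink);

    /*
     * Ensure the dequeue (and all earlier stores) become visible
     * before the waiter can observe lwWaiting == false.  Without this
     * barrier, a waiter on a weakly ordered machine could return from
     * LWLockAcquire() while its queue link still appears to be in use.
     */
    pg_write_barrier();
    waiter->lwWaiting = false;
    PGSemaphoreUnlock(&waiter->sem);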


I have done a few performance tests for the above patches; the results
are as follows:

Performance Data
------------------------------
IBM POWER-7 16 cores, 64 hardware threads
RAM = 64GB
max_connections = 210
Database Locale = C
checkpoint_segments = 256
checkpoint_timeout = 35min
shared_buffers = 8GB
Client Count = number of concurrent sessions and threads (ex. -c 8 -j 8)
Duration of each individual run = 5mins
Test type - read-only pgbench with -M prepared (a sample invocation
follows below)
Other related information about the test:
a. Each number is the median of 3 runs; the detailed data for the
individual runs is attached to this mail.
b. Both patches were applied when taking the performance data.
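
For reference, an individual data point above corresponds roughly to
an invocation like the following (client/thread counts varied per run;
the database name is assumed):

    pgbench -S -M prepared -c 8 -j 8 -T 300 postgres

Here -S selects pgbench's built-in read-only (SELECT-only) script, and
-T 300 matches the 5-minute duration of each run.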

Scale Factor - 100

Patch_ver/Client_count       1       8      16      32      64     128
HEAD                     13344  106921  196629  295123  377846  333928
PATCH                    13662  106179  203960  298955  452638  465671


Scale Factor - 3000

Patch_ver/Client_count       8      16      32      64     128     160
HEAD                     86920  152417  231668  280827  257093  255122
PATCH                    87552  160313  230677  276186  248609  244372


Observations
----------------------
a. The patch performs really well (an increase of up to ~40%) when all
the data fits in shared buffers (scale factor 100).
b. When the data doesn't fit in shared buffers but fits in RAM
(scale factor 3000), there is a performance increase up to 16 clients;
after that, performance starts dipping (up to ~4.4% in the above
configuration).

The above data shows that the patch improves performance in cases
where there is shared LWLock contention; however, there is a slight
performance dip in the case of exclusive LWLocks (at scale factor
3000, exclusive LWLocks are needed for the buffer mapping tables).
I am not sure whether this is the worst-case dip, or whether the dip
can be higher under similar configurations, because the trend shows
the dip increasing with higher client counts.

Brief analysis of the code w.r.t. the performance dip
---------------------------------------------------------------------
Extra instructions w.r.t. HEAD in the exclusive-lock acquire path:
a. The lock is attempted twice.
b. Atomic operations on nwaiters in LWLockQueueSelf() and
    LWLockAcquireCommon().
c. The spinlock now needs to be taken twice, once for queuing self and
    again for setting releaseOK.
d. A few extra function calls and some extra checks.

Similarly, there seem to be a few additional instructions in the
LWLockRelease() path. A condensed sketch of the acquire slow path
follows below.
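
As I read the patch, the exclusive-acquire slow path has roughly the
following shape (heavily condensed; error handling, interrupt holdoff,
and extraWaits bookkeeping are omitted; LWLockQueueSelf() and
LWLockAcquireCommon() are named above, the other helper names are my
shorthand for the corresponding steps):

    for (;;)
    {
        /* (a) first attempt, before queuing */
        if (LWLockAttemptLock(lock, LW_EXCLUSIVE))
            break;              /* got the lock */

        /* (b), (c) queue self: spinlock plus atomic nwaiters update */
        LWLockQueueSelf(lock, LW_EXCLUSIVE);

        /* (a) second attempt, in case the lock was freed meanwhile */
        if (LWLockAttemptLock(lock, LW_EXCLUSIVE))
        {
            LWLockDequeueSelf(lock);    /* undo the queuing */
            break;
        }

        /* sleep until LWLockRelease() wakes us */
        for (;;)
        {
            PGSemaphoreLock(&MyProc->sem, false);
            if (!MyProc->lwWaiting)
                break;
        }
    }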

These extra instructions probably shouldn't matter much when the
backend has to wait for another exclusive locker anyway, but I am not
sure what else could be the reason for the dip in the cases that need
exclusive LWLocks.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
