Re: Speed up Clog Access by increasing CLOG buffers

From: Tomas Vondra
Subject: Re: Speed up Clog Access by increasing CLOG buffers
Msg-id: 26b69fb2-fa4d-530c-7783-1cb9d952c4e5@2ndquadrant.com
In reply to: Re: Speed up Clog Access by increasing CLOG buffers  (Amit Kapila <amit.kapila16@gmail.com>)
Responses: Re: Speed up Clog Access by increasing CLOG buffers  (Robert Haas <robertmhaas@gmail.com>)
    Re: Speed up Clog Access by increasing CLOG buffers  (Amit Kapila <amit.kapila16@gmail.com>)
    Re: Speed up Clog Access by increasing CLOG buffers  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List: pgsql-hackers
On 09/21/2016 08:04 AM, Amit Kapila wrote:
> On Wed, Sep 21, 2016 at 3:48 AM, Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
...
>
>> I'll repeat the test on the 4-socket machine with a newer kernel,
>> but that's probably the last benchmark I'll do for this patch for
>> now.
>>

Attached are results from benchmarks running on kernel 4.5 (instead of
the old 3.2.80). I've only done synchronous_commit=on, and I've added a
few client counts (mostly at the lower end). I've pushed the data to
the git repository, see

     git push --set-upstream origin master

The summary looks like this (showing both the 3.2.80 and 4.5.5 results):

1) Dilip's workload

  3.2.80                             16     32     64    128    192
-------------------------------------------------------------------
  master                          26138  37790  38492  13653   8337
  granular-locking                25661  38586  40692  14535   8311
  no-content-lock                 25653  39059  41169  14370   8373
  group-update                    26472  39170  42126  18923   8366

  4.5.5                 1      8     16     32     64    128    192
-------------------------------------------------------------------
  granular-locking   4050  23048  27969  32076  34874  36555  37710
  no-content-lock    4025  23166  28430  33032  35214  37576  39191
  group-update       4002  23037  28008  32492  35161  36836  38850
  master             3968  22883  27437  32217  34823  36668  38073


2) pgbench

  3.2.80                             16     32     64    128    192
-------------------------------------------------------------------
  master                          22904  36077  41295  35574   8297
  granular-locking                23323  36254  42446  43909   8959
  no-content-lock                 23304  36670  42606  48440   8813
  group-update                    23127  36696  41859  46693   8345

  4.5.5                 1      8     16     32     64    128    192
-------------------------------------------------------------------
  granular-locking   3116  19235  27388  29150  31905  34105  36359
  no-content-lock    3206  19071  27492  29178  32009  34140  36321
  group-update       3195  19104  26888  29236  32140  33953  35901
  master             3136  18650  26249  28731  31515  33328  35243
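
For comparison, here's a small Python sketch (mine, not part of the
benchmark scripts) that computes the relative change of each patch
vs. master at each client count; the numbers are copied from the
3.2.80 table for Dilip's workload, and the same computation applies to
any of the four tables above:

     # Relative change (in %) of each patch vs. master, per client count.
     # Data: Dilip's workload on kernel 3.2.80 (copied from the table).
     clients = [16, 32, 64, 128, 192]
     tps = {
         "master":           [26138, 37790, 38492, 13653, 8337],
         "granular-locking": [25661, 38586, 40692, 14535, 8311],
         "no-content-lock":  [25653, 39059, 41169, 14370, 8373],
         "group-update":     [26472, 39170, 42126, 18923, 8366],
     }

     for name, values in tps.items():
         if name == "master":
             continue
         deltas = [100.0 * (v - m) / m
                   for v, m in zip(values, tps["master"])]
         print("%-18s" % name,
               "  ".join("%+6.1f%%" % d for d in deltas))

     # group-update at 128 clients: (18923 - 13653) / 13653 = +38.6%,
     # the only place where a patch is a clear win on the old kernel.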


The 4.5 kernel clearly changed the results significantly:

(a) Compared to the 3.2.80 kernel, some numbers improved and some got
worse. For example, on 3.2.80 pgbench did ~23k tps with 16 clients,
while on 4.5.5 it does ~27k tps. With 64 clients the performance dropped
from ~41k tps to ~34k tps (on master).

(b) The drop above 64 clients is gone - on 3.2.80 the throughput fell
very quickly to only ~8k tps with 192 clients, while on 4.5.5 the tps
actually continues to increase, and we get ~35k tps with 192 clients.

(c) Although it's not visible in the summary above, 4.5.5 almost
completely eliminated the run-to-run fluctuations. For example, where
3.2.80 produced these results (10 runs with the same parameters):

     12118 11610 27939 11771 18065
     12152 14375 10983 13614 11077

we get this on 4.5.5:

     37354 37650 37371 37190 37233
     38498 37166 36862 37928 38509

Notice how much more even the 4.5.5 results are, compared to 3.2.80.
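
To put a number on that, here's a trivial sketch (again mine, not part
of the benchmark harness) computing the mean, standard deviation and
coefficient of variation of the two 10-run samples shown above:

     # Variability of the 10-run samples (values copied from above).
     from statistics import mean, stdev

     runs = {
         "3.2.80": [12118, 11610, 27939, 11771, 18065,
                    12152, 14375, 10983, 13614, 11077],
         "4.5.5":  [37354, 37650, 37371, 37190, 37233,
                    38498, 37166, 36862, 37928, 38509],
     }

     for kernel, tps in runs.items():
         m, s = mean(tps), stdev(tps)
         print("%-7s mean=%6.0f  stdev=%5.0f  cv=%5.1f%%"
               % (kernel, m, s, 100.0 * s / m))

     # 3.2.80 comes out at cv ~36%, 4.5.5 at ~1.5% - the run-to-run
     # fluctuations are essentially gone.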

(d) There's no sign of any benefit from any of the patches. The patches
only helped at >= 128 clients, but that's exactly where the tps
collapsed on 3.2.80 - apparently 4.5.5 fixes that collapse, and the
benefit is gone with it.

It's a bit annoying that after upgrading from 3.2.80 to 4.5.5, the
performance with 32 and 64 clients dropped quite noticeably - by more
than 10% (e.g. pgbench on master with 64 clients went from 41295 tps to
31515 tps, a ~24% drop). That might be a kernel regression, but perhaps
it's the price for the improved scalability at higher client counts.

This of course raises the question of which kernel version is running
on the machine used by Dilip (i.e. cthulhu). It's a Power machine,
though, so I'm not sure how much the kernel matters on it.

I'll ask someone else with access to this particular machine to repeat
the tests, as I have a nagging suspicion that I've missed something
important when compiling / running the benchmarks. I'll also retry the
benchmarks on 3.2.80 to see if I get the same numbers.

>
> Okay, but I think it is better to see the results between 64~128
> client count and maybe greater than 128 client counts, because it is
> clear that patch won't improve performance below that.
>

There are results for 64, 128 and 192 clients. Why should we care about
numbers in between? How likely (and useful) would it be to get
improvement with 96 clients, but no improvement for 64 or 128 clients?

>>
>> I agree with Robert that the cases the patch is supposed to
>> improve are a bit contrived because of the very high client
>> counts.
>>
>
> No issues, I have already explained why I think it is important to
> reduce the remaining CLOGControlLock contention in yesterday's and
> this mail. If none of you is convinced, then I think we have no
> choice but to drop this patch.
>

I agree it's useful to reduce lock contention in general, but
considering the last set of benchmarks shows no benefit on a recent
kernel, I think we really need a better understanding of what's going
on and which workloads / systems the patch is supposed to improve.

I don't dare suggest rejecting the patch, but I don't see how we could
commit any of the patches at this point either. So perhaps "returned
with feedback" and resubmitting in the next CF (along with an analysis
of the improved workloads) would be appropriate.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
