Re: Speed up Clog Access by increasing CLOG buffers

From: Andres Freund
Subject: Re: Speed up Clog Access by increasing CLOG buffers
Date:
Msg-id: 20150903114137.GE27649@awork2.anarazel.de
In response to: Speed up Clog Access by increasing CLOG buffers  (Amit Kapila <amit.kapila16@gmail.com>)
Responses: Re: Speed up Clog Access by increasing CLOG buffers  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Re: Speed up Clog Access by increasing CLOG buffers  (Amit Kapila <amit.kapila16@gmail.com>)
List: pgsql-hackers
On 2015-09-01 10:19:19 +0530, Amit Kapila wrote:
> pgbench setup
> ------------------------
> scale factor - 300
> Data is on magnetic disk and WAL on ssd.
> pgbench -M prepared tpc-b
> 
> HEAD - commit 0e141c0f
> Patch-1 - increase_clog_bufs_v1
> 
> Client Count/Patch_ver    1     8      16     32     64     128    256
> HEAD                      911   5695   9886   18028  27851  28654  25714
> Patch-1                   954   5568   9898   18450  29313  31108  28213
> 
> 
> This data shows an increase of ~5% at 64 clients and 8~10% at higher
> client counts, without degradation at lower client counts. In the above
> data there is some fluctuation at 8 clients, but I attribute that to
> run-to-run variation; if anybody has doubts, I can re-verify the data
> at lower client counts.

> Now if we try to further increase the number of CLOG buffers to 128,
> no improvement is seen.
> 
> I have also verified that this improvement can be seen only after the
> contention around ProcArrayLock is reduced.  Below is the data with
> Commit before the ProcArrayLock reduction patch.  Setup and test
> is same as mentioned for previous test.

The buffer replacement algorithm for clog is rather stupid - I do wonder
where the cutoff is at which it starts to hurt.
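To illustrate what "rather stupid" refers to: victim selection in the SLRU code is a linear scan over all buffers for the least recently used page, so each replacement costs O(N) in the buffer count. A minimal sketch of that idea (the real code in slru.c tracks per-buffer LRU counts and I/O state, and the names below are hypothetical simplifications):

```c
#include <assert.h>
#include <limits.h>

#define NUM_BUFFERS 8

/* Hypothetical per-buffer recency counters; a smaller value means the
 * page was used longer ago.  PostgreSQL's SLRU keeps similar
 * page_lru_count fields per buffer slot. */
static int page_lru_count[NUM_BUFFERS];

/* Linear victim search: scan every buffer and pick the least recently
 * used one.  With N buffers, every replacement pays O(N), which is one
 * reason simply raising the buffer count eventually stops helping. */
int select_victim_buffer(void)
{
    int best = 0;
    int best_count = INT_MAX;

    for (int i = 0; i < NUM_BUFFERS; i++)
    {
        if (page_lru_count[i] < best_count)
        {
            best_count = page_lru_count[i];
            best = i;
        }
    }
    return best;
}
```

This sketch is just the access pattern under discussion, not the actual replacement policy; the real code also has to skip buffers with I/O in progress.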

Could you perhaps try to create a testcase where xids are accessed that
are so far apart on average that they're unlikely to be in memory? And
then test that across a number of client counts?

There are two reasons I'd like to see that: first, I'd like to avoid
regressions; second, I'd like to avoid having to bump the maximum number
of buffers by a small amount after every hardware generation...

>  /*
>   * Number of shared CLOG buffers.
>   *
> - * Testing during the PostgreSQL 9.2 development cycle revealed that on a
> + * Testing during the PostgreSQL 9.6 development cycle revealed that on a
>   * large multi-processor system, it was possible to have more CLOG page
> - * requests in flight at one time than the number of CLOG buffers which existed
> - * at that time, which was hardcoded to 8.  Further testing revealed that
> - * performance dropped off with more than 32 CLOG buffers, possibly because
> - * the linear buffer search algorithm doesn't scale well.
> + * requests in flight at one time than the number of CLOG buffers which
> + * existed at that time, which was 32 assuming there are enough shared_buffers.
> + * Further testing revealed that either performance stayed same or dropped off
> + * with more than 64 CLOG buffers, possibly because the linear buffer search
> + * algorithm doesn't scale well or some other locking bottlenecks in the
> + * system mask the improvement.
>   *
> - * Unconditionally increasing the number of CLOG buffers to 32 did not seem
> + * Unconditionally increasing the number of CLOG buffers to 64 did not seem
>   * like a good idea, because it would increase the minimum amount of shared
>   * memory required to start, which could be a problem for people running very
>   * small configurations.  The following formula seems to represent a reasonable
>   * compromise: people with very low values for shared_buffers will get fewer
> - * CLOG buffers as well, and everyone else will get 32.
> + * CLOG buffers as well, and everyone else will get 64.
>   *
>   * It is likely that some further work will be needed here in future releases;
>   * for example, on a 64-core server, the maximum number of CLOG requests that
>   * can be simultaneously in flight will be even larger.  But that will
>   * apparently require more than just changing the formula, so for now we take
> - * the easy way out.
> + * the easy way out.  It could also happen that after removing other locking
> + * bottlenecks, further increase in CLOG buffers can help, but that's not the
> + * case now.
>   */
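The formula the quoted comment describes scales the CLOG buffer count with shared_buffers and clamps it to a fixed range; the patch raises the upper clamp from 32 to 64. A sketch of that sizing logic, assuming the pre-patch divisor of 512 and lower bound of 4 from clog.c are unchanged (the function name here is illustrative, not the actual one):

```c
#include <assert.h>

#define Min(x, y) ((x) < (y) ? (x) : (y))
#define Max(x, y) ((x) > (y) ? (x) : (y))

/* Sketch of the CLOG buffer sizing compromise: people with very small
 * shared_buffers (NBuffers, in 8kB pages) get as few as 4 CLOG buffers,
 * everyone else scales up to the cap, which the patch raises to 64. */
int clog_shmem_buffers(int NBuffers)
{
    return Min(64, Max(4, NBuffers / 512));
}
```

For example, a tiny 1MB shared_buffers setting (128 pages) gets the 4-buffer floor, the 128MB default (16384 pages) gets 32, and anything from 32768 pages upward hits the 64-buffer cap.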

I think the comment should be more drastically rephrased to not
reference individual versions and numbers.

Greetings,

Andres Freund

