Re: effective_cache_size

Поиск
Список
Период
Сортировка
От Bruce Momjian
Тема Re: effective_cache_size
Дата
Msg-id 20181011174849.GC7807@momjian.us
обсуждение исходный текст
Ответ на effective_cache_size  (PG Doc comments form <noreply@postgresql.org>)
Ответы Re: effective_cache_size  (Bruce Momjian <bruce@momjian.us>)
Список pgsql-docs
On Thu, Sep 13, 2018 at 03:14:08PM +0000, PG Doc comments form wrote:
> The following documentation comment has been logged on the website:
> 
> Page: https://www.postgresql.org/docs/9.6/static/runtime-config-query.html
> Description:
> 
> Hi,
> 
> The effective_cache_size documentation (section 19.7) is: "When setting this
> parameter you should consider both PostgreSQL's shared buffers and the
> portion of the kernel's disk cache that will be used for PostgreSQL data
> files.".
> 
> This description seems essentially stable since v8.1.
> 
> As far as I understand it means that the value associated to
> effective_cache_size is the value of the shared_buffers plus the amount of
> memory often used by the kernel for the buffercache (in Linux's parlance).
> 
> This explanation is offered by many published documents, for example
> https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server#effective_cache_size
> 
> Alas, this seems different to what B. Momjian declared in 2012 (
> https://www.enterprisedb.com/node/3374 ): "The value doesn't have to be
> perfect, but just a rough estimate of how much kernel memory is acting as
> secondary cache for the shared buffers.", which (as far as I understand)
> means "effective_cache_size = (approx of the often-observed buffercache
> size)".
> 
> In the sourcecode v11b3) the module src/backend/utils/misc/guc.c contains
> "Sets the planner's assumption about the size of the disk cache. That is,
> the portion of the kernel's disk cache that will be used for PostgreSQL data
> files." (so effective_cache_size is more or less the "buffercache" size).
> However in the module src/backend/optimizer/path/costsize.c a comment states
> that 'We also use a rough estimate "effective_cache_size" of the number of
> disk pages in Postgres + OS-level disk cache." (so effective_cache_size is
> more or less the size of the shared_buffers + the size of the
> "buffercache").
> This seems contradictory to me.

Sorry for the delay in replying.  I think you are 100% correct that
there is a problem.  The PG 11 docs say:

    https://www.postgresql.org/docs/11/static/runtime-config-query.html#RUNTIME-CONFIG-QUERY-CONSTANTS
    
    effective_cache_size (integer)
    
        Sets the planner's assumption about the effective size of the disk
                                                --------------------------
    cache that is available to a single query. This is factored into
    -----
    estimates of the cost of using an index; a higher value makes it more
    likely index scans will be used, a lower value makes it more likely
    sequential scans will be used. When setting this parameter you should
    consider both PostgreSQL's shared buffers and the portion of the
             -------------------------------------------------------
    kernel's disk cache that will be used for PostgreSQL data files. Also,
    -------------------
    take into account the expected number of concurrent queries on different
    tables, since they will have to share the available space. This
    parameter has no effect on the size of shared memory allocated by
    PostgreSQL, nor does it reserve kernel disk cache; it is used only for
    estimation purposes. The system also does not assume data remains in the
    disk cache between queries. The default is 4 gigabytes (4GB).

and the server's description of the parameter is:

    SELECT extra_desc FROM pg_settings WHERE name = 'effective_cache_size';
                                                                            
    extra_desc
    -----------------------------------------------------------------------
    
     That is, the portion of the kernel's disk cache that will be used for
                                 -------------------
    PostgreSQL data files. This is measured in disk pages, which are
    normally 8 kB each.

So, there are a few problems here.  First, referring to storage as
"disk" probably needs to be updated in many places --- for me, "disk"
means magnetic disk.  SSDs are solid-state _drives_, not _disks_.  There
is no disk in an SSD.  We can call it "drive cache" or "storage cache".

Second, and I think this was your point, the server's description of the
parameter talks specifically about "kernel's disk cache", while the
documentation talks about simply "disk cache", and mentions shared
buffers are part of that.

Looking at the code, mentions of the C variable effective_cache_size
don't add shared buffers to the computation, so the documentation looks
to be the most accurate.  However, part of the logic is that shared
buffers is assumed to be much smaller than the kernel storage cache,
_and_ pages in the shared buffers are likely to be also in the kernel
storage cache (double-buffering), meaning that the number of data blocks
in the shared buffer cache that are not already in the kernel storage
cache might be near zero, meaning maybe we should ignore the shared
buffers size in computing the storage cache.

It is probably this duplication of buffers in both places that has led
many to ignore the shared buffer size in computing effective_cache_size.

So, I think there are two action items here.  First, rename "disk" to
"drive" or "storage" in the docs and server variable descriptions. 
Second, unify the description of effective_cache_size to talk about
storage cache size, not just kernel storage cache, and maybe even talk
about the double buffering effect.

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +


В списке pgsql-docs по дате отправления:

Предыдущее
От: Dean Rasheed
Дата:
Сообщение: Re: Proposed HTML Documentation Styles
Следующее
От: Peter Krauss
Дата:
Сообщение: Re: please inform data_directory