Re: Perplexing, regular decline in performance

From: Andres Freund
Subject: Re: Perplexing, regular decline in performance
Date:
Msg-id: 20190718222321.r4b3lmi62cukq2gi@alap3.anarazel.de
In reply to: Re: Perplexing, regular decline in performance  (Hugh Ranalli <hugh@whtc.ca>)
List: pgsql-performance
Hi,

On 2019-07-18 16:01:46 -0400, Hugh Ranalli wrote:
> I've been going by a couple of articles I found about interpreting
> pg_buffercache (
> https://www.keithf4.com/a-large-database-does-not-mean-large-shared_buffers),
> and so far shared buffers look okay. Our database is 486 GB, with shared
> buffers set to 32 GB. The article suggests a query that can provide a
> guideline for what shared buffers should be:
> 
> SELECT
>     pg_size_pretty(count(*) * 8192) as ideal_shared_buffers
> FROM
>     pg_class c
> INNER JOIN
> pg_buffercache b ON b.relfilenode = c.relfilenode
> INNER JOIN
>     pg_database d ON (b.reldatabase = d.oid AND d.datname =
> current_database())
> WHERE
>     usagecount >= 3;

IMO that's not a meaningful way to determine the ideal size of shared
buffers. Except for the case where shared buffers is bigger than the
entire working set (not just the hot working set), it's going to give
you completely bogus results.

Pretty much by definition it cannot give you a shared buffers size
bigger than what it's currently set to, given that it starts with the
number of shared buffers.

And there are plenty of scenarios where you'll commonly see many frequently
(but not most frequently) used buffers with a usagecount < 3, or even
= 0. If you e.g. have a shared_buffers size that's just a few megabytes
too small, you'll need to throw some buffers out of shared buffers -
that means the buffer replacement search will sweep through shared
buffers and decrement each usagecount by one, until it finds a buffer
whose count was already 0 (before the decrement). Which means it's
extremely likely that there are moments where a substantial number of
frequently used buffers have a lowered usagecount (perhaps even 0).
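
If you want to see what that looks like on a running system, a quick
(admittedly crude) way is to sample the overall usagecount distribution
in pg_buffercache, something like:

SELECT usagecount, count(*) AS buffers
FROM pg_buffercache
GROUP BY usagecount
ORDER BY usagecount;

Running that a few times during a busy period will typically show the
low usagecounts swelling right after a burst of buffer replacement.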

Therefore, the above query will commonly give you a lower number than
shared buffers, if your working set size is *bigger* than shared memory.


I think you can assume that shared buffers is too big if a substantial
portion of buffers have relfilenode IS NULL (i.e. are unused); at
least if you don't continually also DROP/TRUNCATE relations.
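
A rough way to check that (assuming the pg_buffercache extension is
installed) is to simply count the buffers that aren't associated with
any relation:

SELECT
    count(*) FILTER (WHERE relfilenode IS NULL) AS unused_buffers,
    count(*) AS total_buffers,
    round(100.0 * count(*) FILTER (WHERE relfilenode IS NULL) / count(*), 2) AS pct_unused
FROM pg_buffercache;

If pct_unused stays substantial even under normal load, shared buffers
is bigger than the working set.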

If there's a lot of fluctuation in which parts of the buffer cache have
a high usagecount, that's a good indication that new buffers are needed
very frequently (because each round of replacement lowers a good
portion of buffers to usagecount 0).

I've had decent success in the past getting insights with a query like:

SELECT
    -- which of the equally sized regions of the buffer cache this buffer falls into
    ceil(bufferid/(nr_buffers/subdivisions::float))::int AS part,
    -- share of buffers in this region that are in use (associated with a relation)
    to_char(SUM((relfilenode IS NOT NULL)::int) / count(*)::float * 100, '999D99')  AS pct_used,
    to_char(AVG(usagecount), '9D9') AS avg_usagecount,
    -- share of used buffers in this region that currently have usagecount 0
    to_char(SUM((usagecount=0)::int) / SUM((relfilenode IS NOT NULL)::int)::float8 * 100, '999D99') AS pct_0
FROM
    pg_buffercache,
    (SELECT 10) AS x(subdivisions),
    (SELECT setting::int nr_buffers FROM pg_settings WHERE name = 'shared_buffers') s
GROUP BY 1 ORDER BY 1;

which basically subdivides pg_buffercache's output into 10 parts (or use
as many as fit comfortably on one screen / terminal; the example output
below was produced with 20 subdivisions).
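
To see how much that picture fluctuates over time (per the earlier point
about shifting usagecounts), one simple option is to just re-run the
query periodically, e.g. from psql:

-- repeat the previous query every 5 seconds (psql meta-command)
\watch 5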

Here's e.g. the output of a benchmark (pgbench) running against a
database that's considerably larger than shared memory (15GB database,
1.5GB shared_buffers):

┌──────┬──────────┬────────────────┬─────────┐
│ part │ pct_used │ avg_usagecount │  pct_0  │
├──────┼──────────┼────────────────┼─────────┤
│    1 │  100.00  │  1.0           │   42.75 │
│    2 │  100.00  │   .6           │   47.85 │
│    3 │  100.00  │   .6           │   47.25 │
│    4 │  100.00  │   .6           │   47.52 │
│    5 │  100.00  │   .6           │   47.18 │
│    6 │  100.00  │   .5           │   48.47 │
│    7 │  100.00  │   .5           │   49.00 │
│    8 │  100.00  │   .5           │   48.52 │
│    9 │  100.00  │   .5           │   49.27 │
│   10 │  100.00  │   .5           │   49.58 │
│   11 │   99.98  │   .6           │   46.88 │
│   12 │  100.00  │   .6           │   45.23 │
│   13 │  100.00  │   .6           │   45.03 │
│   14 │  100.00  │   .6           │   44.90 │
│   15 │  100.00  │   .6           │   46.08 │
│   16 │  100.00  │   .6           │   44.84 │
│   17 │  100.00  │   .6           │   45.88 │
│   18 │  100.00  │   .6           │   46.46 │
│   19 │  100.00  │   .6           │   46.64 │
│   20 │  100.00  │   .6           │   47.05 │
└──────┴──────────┴────────────────┴─────────┘

As you can see, usagecounts are pretty low overall. That's because the
buffer replacement rate is so high that the usagecount is very
frequently reduced to 0 (to get new buffers).

You can infer from that that unless you add a lot of shared buffers,
you're not likely to make a huge difference (but if you set it to
16GB, it'd obviously look much better).
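
If you do end up raising shared_buffers, keep in mind that it only takes
effect after a restart; roughly (the value here is just an example, not
a recommendation):

ALTER SYSTEM SET shared_buffers = '16GB';
-- then restart the server, e.g.: pg_ctl restart -D "$PGDATA"

(or edit postgresql.conf directly - same effect).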


In contrast to that, here's pgbench running on a smaller database that
nearly fits into shared buffers (2GB DB, 1.5GB shared_buffers):

┌──────┬──────────┬────────────────┬─────────┐
│ part │ pct_used │ avg_usagecount │  pct_0  │
├──────┼──────────┼────────────────┼─────────┤
│    1 │  100.00  │  3.9           │    1.45 │
│    2 │  100.00  │  3.8           │    1.34 │
│    3 │  100.00  │  3.8           │    1.69 │
│    4 │  100.00  │  3.7           │    1.96 │
│    5 │  100.00  │  3.7           │    2.01 │
│    6 │  100.00  │  3.6           │    2.23 │
│    7 │  100.00  │  3.5           │    2.60 │
│    8 │  100.00  │  3.5           │    2.27 │
│    9 │  100.00  │  3.4           │    2.82 │
│   10 │  100.00  │  3.3           │    2.92 │
│   11 │  100.00  │  3.2           │    3.43 │
│   12 │  100.00  │  3.1           │    3.41 │
│   13 │  100.00  │  3.7           │    1.91 │
│   14 │  100.00  │  4.0           │    1.09 │
│   15 │  100.00  │  3.9           │    1.39 │
│   16 │  100.00  │  4.0           │    1.22 │
│   17 │  100.00  │  4.1           │    1.16 │
│   18 │  100.00  │  4.0           │    1.19 │
│   19 │  100.00  │  4.0           │    1.29 │
│   20 │  100.00  │  4.0           │    1.42 │
└──────┴──────────┴────────────────┴─────────┘

As you can see, there are many fewer buffers with a usagecount of 0 -
that's because the buffer replacement rate is much lower (as most of the
data fits into shared buffers), and thus the usagecount has time to
"increase" regularly.

Here you can guess that even just increasing shared buffers slightly
would increase the cache hit ratio substantially. E.g. the same
workload, but with shared_buffers increased to 1.6GB:
┌──────┬──────────┬────────────────┬─────────┐
│ part │ pct_used │ avg_usagecount │  pct_0  │
├──────┼──────────┼────────────────┼─────────┤
│    1 │  100.00  │  5.0           │     .00 │
│    2 │  100.00  │  5.0           │     .00 │
│    3 │  100.00  │  5.0           │     .00 │
│    4 │  100.00  │  5.0           │     .00 │
│    5 │  100.00  │  5.0           │     .00 │
│    6 │  100.00  │  5.0           │     .00 │
│    7 │  100.00  │  5.0           │     .00 │
│    8 │  100.00  │  5.0           │     .00 │
│    9 │  100.00  │  5.0           │     .00 │
│   10 │  100.00  │  5.0           │     .00 │
│   11 │  100.00  │  5.0           │     .00 │
│   12 │  100.00  │  5.0           │     .00 │
│   13 │  100.00  │  5.0           │     .00 │
│   14 │  100.00  │  5.0           │     .00 │
│   15 │  100.00  │  5.0           │     .00 │
│   16 │  100.00  │  5.0           │     .00 │
│   17 │  100.00  │  5.0           │     .00 │
│   18 │  100.00  │  5.0           │     .00 │
│   19 │   93.27  │  5.0           │     .00 │
│   20 │     .00  │ (null)         │ (null)  │
└──────┴──────────┴────────────────┴─────────┘


Now, in reality things are rarely quite this neat - pgbench has a
uniform access pattern, which isn't that common in the real world.


I also suggest monitoring how the buffer hit ratio develops over
time, e.g. with a query like:

SELECT datname, (blks_hit - blks_read)::float/NULLIF(blks_hit, 0)::float FROM pg_stat_database;

although that's not perfect, because it gives you the ratio since the
last time the stats were reset, making it hard to see more recent
changes.  So you either need to reset the stats, or compute the
difference from what the values were when you started observing.

E.g.

DROP TABLE IF EXISTS pg_stat_database_snap;
CREATE TEMPORARY TABLE pg_stat_database_snap AS
    SELECT * FROM pg_stat_database;

SELECT datname,
    (blks_hit - blks_read)::float/NULLIF(blks_hit, 0)::float
FROM (
    SELECT datname,
        pd.blks_read - ps.blks_read AS blks_read,
        pd.blks_hit - ps.blks_hit AS blks_hit
    FROM pg_stat_database pd JOIN pg_stat_database_snap ps USING (datname) ) pd_diff;
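
Alternatively, if you don't mind throwing away the accumulated counters
(and anything else that monitors them), you can reset the per-database
statistics instead of snapshotting them:

SELECT pg_stat_reset();

and then query pg_stat_database directly after the workload has run for
a while.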


Greetings,

Andres Freund


