Thread: Perplexing, regular decline in performance


Perplexing, regular decline in performance

From: Hugh Ranalli
Date:
I'm hoping people can help me figure out where to look to solve an odd PostgreSQL performance problem. 

A bit of background: We have a client with a database of approximately 450 GB that has a couple of tables storing large amounts of text, including full HTML pages from the Internet. Last fall, they began experiencing a dramatic and rapidly accelerating decline in performance. We track certain query times, so we know how much time is being spent in calls to the database for these functions. When this began, the times went from an average of approximately 200 ms to 400 ms, climbing rapidly each day until, within 4 days, they reached 900 ms, a figure we had never seen before, with no appreciable change in usage. At that point we restarted the database server and times returned to the 400 ms range, but never back to their long-running original levels. From then on, we had to restart the database (originally the server, but eventually just the database process) every 3-4 days, otherwise the application became unusable.

As they were still on PostgreSQL 8.2, we persuaded them to finally undertake our long-standing recommendation to upgrade, as there was no possibility of support on that platform. That upgrade to 11.2 was completed successfully in mid-May, and although times have not returned to their original levels (they now average approximately 250 ms), the application overall seems much more responsive and faster (the application servers were not changed, other than minor adjustments to full-text search, explicit casts, etc., to conform to PostgreSQL 11's requirements).

What we continued to notice was a milder but still definite trend of increased query times, during the course of each week, from the mid to high 200 ms, to the high 300 ms to low 400 ms. Some years ago, someone had noticed that as the number of "raw_page" columns in a particular table grew, performance would decline. They wrote a script that once a week locks the table, deletes the processed large columns (they are not needed after processing), copies the remaining data to a backup table, truncates the original table, then copies it back. When this script runs we see an immediate change in performance, from 380 ms in the hour before the drop, to 250 ms in the hour of the drop. As rows with these populated columns are added during the course of a week, the performance drops, steadily, until the next week's cleaning operation. Each week the performance increase is clear and significant.
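
To make the shape of that weekly operation concrete, here is a rough sketch; the table name is the one that comes up later in this thread, and the column names are hypothetical stand-ins rather than the client's actual schema:

-- Rough sketch of the weekly cleaning operation (hypothetical columns).
BEGIN;
LOCK TABLE harvested_job IN ACCESS EXCLUSIVE MODE;

-- Keep everything except the large, already-processed raw_page column.
CREATE TEMP TABLE harvested_job_keep AS
    SELECT id, url_id, harvested_at, status
    FROM harvested_job;

TRUNCATE harvested_job;

INSERT INTO harvested_job (id, url_id, harvested_at, status)
    SELECT id, url_id, harvested_at, status
    FROM harvested_job_keep;
COMMIT;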

What is perplexing (and I have triple-checked this) is that this table is *not* referenced in any way in the queries that we time (it is referenced by ongoing administrative and processing queries). The operation that cleans it frees up approximately 15-20 GB of space each week. Our system monitoring shows this change in free disk space, but this is 20 GB out of approximately 300 GB of free space (free space is just under 40% of volume size), so disk space does not seem to be an issue. The table in question is about 21 GB in size, with about 20 GB in toast data, at its largest.

Even odder, the queries we time *do* reference a much larger table, which contains very similar data, and multiple columns of it. It is 355 GB in size, with 318 GB in toast data. It grows continually, with no cleaning.
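
For reference, size figures like these can be pulled with something along the following lines (a sketch; the table names are the ones given later in the thread):

SELECT c.relname,
       pg_size_pretty(pg_table_size(c.oid))          AS table_incl_toast,
       pg_size_pretty(pg_relation_size(c.oid))       AS heap_only,
       pg_size_pretty(pg_total_relation_size(c.oid)) AS total_incl_indexes
FROM pg_class c
WHERE c.relname IN ('harvested_job', 'page');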

If anyone has any suggestions as to what sort of statistics to look at, or why this would be happening, they would be greatly appreciated.

Thanks in advance,
Hugh

--
Hugh Ranalli
Principal Consultant
White Horse Technology Consulting
e: hugh@whtc.ca
c: +01-416-994-7957

Re: Perplexing, regular decline in performance

From: Justin Pryzby
Date:
On Tue, Jun 25, 2019 at 11:49:03AM -0400, Hugh Ranalli wrote:
> I'm hoping people can help me figure out where to look to solve an odd
> PostgreSQL performance problem.

What kernel?  Version?  OS?

If Linux, I wonder if transparent hugepages or KSM are enabled ?  It seems
possible that truncating the table is clearing enough RAM to mitigate the
issue, similar to restarting the DB.
tail /sys/kernel/mm/ksm/run /sys/kernel/mm/transparent_hugepage/khugepaged/defrag /sys/kernel/mm/transparent_hugepage/enabled /sys/kernel/mm/transparent_hugepage/defrag
 
https://www.postgresql.org/message-id/20170718180152.GE17566%40telsasoft.com

11.2 would have parallel query, and enabled by default.  Are there other
settings you've changed (or not changed)?
https://wiki.postgresql.org/wiki/Server_Configuration
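
A query along these lines (similar to what that wiki page suggests) lists everything changed from the defaults:

SELECT name, current_setting(name), source
FROM pg_settings
WHERE source NOT IN ('default', 'override');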

It's possible that the "administrative" queries are using up lots of your
shared_buffers, which are (also/more) needed by the customer-facing queries.  I
would install pg_buffercache to investigate.  Or, just pause the admin queries
and see if the issue goes away during that interval ?
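
Assuming the contrib modules are installed on the server, enabling the extension is a one-liner:

CREATE EXTENSION IF NOT EXISTS pg_buffercache;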

SELECT 1.0*COUNT(1)/sum(count(1)) OVER(), COUNT(1), COUNT(nullif(isdirty,'f')), datname,
       COALESCE(c.relname, b.relfilenode::text), d.relname TOAST,
       1.0*COUNT(nullif(isdirty,'f'))/count(1) dirtyfrac, avg(usagecount)
FROM pg_buffercache b
JOIN pg_database db ON b.reldatabase=db.oid
LEFT JOIN pg_class c ON b.relfilenode=pg_relation_filenode(c.oid)
LEFT JOIN pg_class d ON c.oid=d.reltoastrelid
GROUP BY 4,5,6 ORDER BY 1 DESC
LIMIT 9;
 

Could you send query plan for the slow (customer-facing) queries?
https://wiki.postgresql.org/wiki/Slow_Query_Questions#EXPLAIN_.28ANALYZE.2C_BUFFERS.29.2C_not_just_EXPLAIN
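
For example, with the options that page asks for (the query here is just a runnable stand-in; substitute one of the timed customer-facing queries):

EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM pg_class
WHERE relkind = 'r';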



Re: Perplexing, regular decline in performance

From: Peter Geoghegan
Date:
On Tue, Jun 25, 2019 at 8:49 AM Hugh Ranalli <hugh@whtc.ca> wrote:
> What we continued to notice was a milder but still definite trend of
> increased query times, during the course of each week, from the mid to high
> 200 ms, to the high 300 ms to low 400 ms. Some years ago, someone had
> noticed that as the number of "raw_page" columns in a particular table
> grew, performance would decline. They wrote a script that once a week locks
> the table, deletes the processed large columns (they are not needed after
> processing), copies the remaining data to a backup table, truncates the
> original table, then copies it back. When this script runs we see an
> immediate change in performance, from 380 ms in the hour before the drop,
> to 250 ms in the hour of the drop. As rows with these populated columns are
> added during the course of a week, the performance drops, steadily, until
> the next week's cleaning operation. Each week the performance increase is
> clear and significant.

Can you show us the definition of the table, including its indexes?
Can you describe the data and distribution of values within the
columns, particularly where they're indexed?

--
Peter Geoghegan



Re: Perplexing, regular decline in performance

From: Hugh Ranalli
Date:
On Tue, 25 Jun 2019 at 12:23, Justin Pryzby <pryzby@telsasoft.com> wrote:
> What kernel?  Version?  OS?
Ubuntu 18.04; current kernel is 4.15.0-51-generic

> If Linux, I wonder if transparent hugepages or KSM are enabled ?  It seems
> possible that truncating the table is clearing enough RAM to mitigate the
> issue, similar to restarting the DB.
> tail /sys/kernel/mm/ksm/run /sys/kernel/mm/transparent_hugepage/khugepaged/defrag /sys/kernel/mm/transparent_hugepage/enabled /sys/kernel/mm/transparent_hugepage/defrag
> https://www.postgresql.org/message-id/20170718180152.GE17566%40telsasoft.com
==> /sys/kernel/mm/ksm/run <==
0
==> /sys/kernel/mm/transparent_hugepage/khugepaged/defrag <==
1
==> /sys/kernel/mm/transparent_hugepage/enabled <==
always [madvise] never
==> /sys/kernel/mm/transparent_hugepage/defrag <==
always defer defer+madvise [madvise] never
  
From my research in preparing for the upgrade, I understood transparent huge pages were a good thing, and should be enabled. Is this not correct?


> 11.2 would have parallel query, and enabled by default.  Are there other
> settings you've changed (or not changed)?
> https://wiki.postgresql.org/wiki/Server_Configuration

I've just posted the parameters we are changing from the default in a previous reply, so I won't repeat them here unless you want me to.
 
> It's possible that the "administrative" queries are using up lots of your
> shared_buffers, which are (also/more) needed by the customer-facing queries.  I
> would install pg_buffercache to investigate.  Or, just pause the admin queries
> and see if the issue goes away during that interval ?

Pausing the admin queries isn't an option in our environment, especially as the issue reveals itself over the course of days, not minutes or hours. 
        ?column?        |  count  | count |  datname  |        coalesce         |     toast      |         dirtyfrac          |        avg        
------------------------+---------+-------+-----------+-------------------------+----------------+----------------------------+--------------------
 0.24904101286779650995 | 1044545 |     0 | mydb      | position                |                | 0.000000000000000000000000 | 4.8035517857057379
 0.16701241622795295199 |  700495 |     0 | mydb      | stat_position_click     |                | 0.000000000000000000000000 | 1.9870234619804567
 0.09935032779251879171 |  416702 |  6964 | mydb      | pg_toast_19788          | harvested_job  |     0.01671218280689797505 | 1.9346079452462431
 0.06979762146872315533 |  292750 |     0 | mydb      | url                     |                | 0.000000000000000000000000 | 4.9627873612297182
 0.03795774662998486745 |  159205 |     0 | mydb      | stat_sponsored_position |                | 0.000000000000000000000000 | 1.8412361420809648
 0.02923155381784048663 |  122605 |     0 | mydb      | pg_toast_20174          | page           | 0.000000000000000000000000 | 3.0259532645487541
 0.02755283459406156353 |  115564 |     0 | mydb      | location                |                | 0.000000000000000000000000 | 4.9953532241874632
 0.02015273698468076320 |   84526 |  1122 | mydb      | harvested_job           |                |     0.01327402219435439983 | 4.9922154130090150
 0.01913348905375406298 |   80251 |     0 | mydb      | pg_toast_20257          | position_index | 0.000000000000000000000000 | 4.9880001495308470

harvested_job is the rapidly growing "problematic" table I am talking about. page is the 355 GB table that gets referenced on the public searches. I'll google, but is there a place I should look to understand what I am seeing here? Also, Should pg_buffercache perhaps be run at the beginning and end of the week, to see if there is a significant difference?
 
> Could you send query plan for the slow (customer-facing) queries?
> https://wiki.postgresql.org/wiki/Slow_Query_Questions#EXPLAIN_.28ANALYZE.2C_BUFFERS.29.2C_not_just_EXPLAIN

I can, but can I ask why this would matter? I'm not looking to optimise the query (although I'm sure it could be; this is a legacy system with lots of barnacles). The problem is that the same query performs increasingly slowly over the course of a week, seemingly in sync with the rows with a large toast column added to one particular table (which, as I mentioned, isn't referenced by the query in question). Wouldn't the plan be the same at both the start of the week (when the problematic table is essentially empty) and at the end (when it is much larger)? 

Thanks!
Hugh 


Re: Perplexing, regular decline in performance

From: Hugh Ranalli
Date:
On Wed, 26 Jun 2019 at 14:52, Peter Geoghegan <pg@bowt.ie> wrote:
> Can you show us the definition of the table, including its indexes?
> Can you describe the data and distribution of values within the
> columns, particularly where they're indexed?

I'm sorry, but I'm not sure what you mean by the "distribution of values within the columns." Can you clarify or provide a link to an example?

Thanks,
Hugh

Re: Perplexing, regular decline in performance

From: Peter Geoghegan
Date:
On Wed, Jun 26, 2019 at 12:02 PM Hugh Ranalli <hugh@whtc.ca> wrote:
> I'm sorry, but I'm not sure what you mean by the "distribution of values
> within the columns." Can you clarify or provide a link to an example?
 

I would mostly just like to see the schema of the table in question,
including indexes, and a high-level description of the nature of the
data in the table. Ideally, you would also include pg_stats.*
information for all columns in the table. That will actually let us
see a summary of the data. Though you should be careful about leaking
sensitive information that happens to be contained in the statistics,
such as the most common values.

-- 
Peter Geoghegan



Re: Perplexing, regular decline in performance

From: Alvaro Herrera
Date:
On 2019-Jun-26, Hugh Ranalli wrote:

> From my research in preparing for the upgrade, I understood transparent
> huge pages were a good thing, and should be enabled. Is this not correct?

It is not.

> Wouldn't the plan be the same at both
> the start of the week (when the problematic table is essentially empty) and
> at the end (when it is much larger)?

Not necessarily.  Though, if a plan change was the culprit you would
probably see a sudden change in performance characteristics rather than
gradual.  Worth making sure, anyway.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Perplexing, regular decline in performance

From: Tom Lane
Date:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> On 2019-Jun-26, Hugh Ranalli wrote:
>> From my research in preparing for the upgrade, I understood transparent
>> huge pages were a good thing, and should be enabled. Is this not correct?

> It is not.

Yeah ... they would be a good thing perhaps if the quality of the kernel
implementation were better.  But there are way too many nasty corner
cases, at least with the kernel versions people around here have
experimented with.  You're best off to disable THP and instead manually
arrange for Postgres' shared memory to use huge pages.  I forget where
to look for docs about doing that, but I think we have some.

            regards, tom lane



Re: Perplexing, regular decline in performance

From: Justin Pryzby
Date:
On Wed, Jun 26, 2019 at 03:00:43PM -0400, Hugh Ranalli wrote:
> Pausing the admin queries isn't an option in our environment, especially as
> the issue reveals itself over the course of days, not minutes or hours.

Perhaps you can pause it for a short while at EOW and see if there's a dramatic
improvement ?

>         ?column?        |  count  | count |  datname  |        coalesce         |     toast      |         dirtyfrac          |        avg
> ------------------------+---------+-------+-----------+-------------------------+----------------+----------------------------+--------------------
>  0.24904101286779650995 | 1044545 |     0 | mydb      | position                |                | 0.000000000000000000000000 | 4.8035517857057379
>  0.16701241622795295199 |  700495 |     0 | mydb      | stat_position_click     |                | 0.000000000000000000000000 | 1.9870234619804567
>  0.09935032779251879171 |  416702 |  6964 | mydb      | pg_toast_19788          | harvested_job  |     0.01671218280689797505 | 1.9346079452462431
>  0.06979762146872315533 |  292750 |     0 | mydb      | url                     |                | 0.000000000000000000000000 | 4.9627873612297182
>  0.03795774662998486745 |  159205 |     0 | mydb      | stat_sponsored_position |                | 0.000000000000000000000000 | 1.8412361420809648
>  0.02923155381784048663 |  122605 |     0 | mydb      | pg_toast_20174          | page           | 0.000000000000000000000000 | 3.0259532645487541
>  0.02755283459406156353 |  115564 |     0 | mydb      | location                |                | 0.000000000000000000000000 | 4.9953532241874632
>  0.02015273698468076320 |   84526 |  1122 | mydb      | harvested_job           |                |     0.01327402219435439983 | 4.9922154130090150
>  0.01913348905375406298 |   80251 |     0 | mydb      | pg_toast_20257          | position_index | 0.000000000000000000000000 | 4.9880001495308470
> 
> harvested_job is the rapidly growing "problematic" table I am talking
> about. page is the 355 GB table that gets referenced on the public
> searches. I'll google, but is there a place I should look to understand
> what I am seeing here?

I should label the columns:
|buffer_fraction | nbuffers | ndirty | datname | relname | toast | dirtyfrac | avgusage

It looks like harvested_job is possibly being index scanned, and its toast
table is using up many buffers.  At the EOW, maybe that usage comes at the
expense of more important data.  You could check pg_stat_user_tables/indexes
for stats on that; see the sketch below.  Possibly you could make use of
index-only scans using covering indexes (pg11 supports INCLUDE).  Or maybe it's
just too big (maybe it should be partitioned, or maybe the index should be repacked?)
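
For example, something along these lines; the index definition is only a sketch, and the key/included column names are assumptions rather than the actual schema:

-- Scan and tuple counts for the table and its indexes:
SELECT relname, seq_scan, seq_tup_read, idx_scan, idx_tup_fetch, n_live_tup
FROM pg_stat_user_tables
WHERE relname = 'harvested_job';

SELECT indexrelname, idx_scan, idx_tup_read, idx_tup_fetch
FROM pg_stat_user_indexes
WHERE relname = 'harvested_job';

-- A covering index (PG11 INCLUDE) so lookups can avoid heap/TOAST fetches
-- (hypothetical key and included columns):
CREATE INDEX CONCURRENTLY harvested_job_status_covering_idx
    ON harvested_job (status) INCLUDE (url_id, harvested_at);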

> Also, Should pg_buffercache perhaps be run at the beginning and end of the
> week, to see if there is a significant difference?

Yes; buffercache can be pretty volatile, so I'd save it numerous times each at
beginning and end of week.
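
One way to do that is to keep timestamped per-relation summaries in a small table, e.g. (a sketch; the snapshot table name is made up):

CREATE TABLE IF NOT EXISTS buffercache_snap (
    snap_time timestamptz,
    relname   text,
    buffers   bigint,
    dirty     bigint,
    avg_usage numeric
);

INSERT INTO buffercache_snap
SELECT now(),
       coalesce(c.relname, b.relfilenode::text),
       count(*),
       count(*) FILTER (WHERE isdirty),
       avg(usagecount)
FROM pg_buffercache b
LEFT JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
GROUP BY 2;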

> > Could you send query plan for the slow (customer-facing) queries?
> >
> > https://wiki.postgresql.org/wiki/Slow_Query_Questions#EXPLAIN_.28ANALYZE.2C_BUFFERS.29.2C_not_just_EXPLAIN
> 
> I can, but can I ask why this would matter?

My very tentative guess is that harvested_job itself isn't the issue, but that some
other, third thing is the issue, one which also increases (at least roughly) with
time, the same as that table.  It'd help to see the buffer cache hit rate for that
query (and its different query plan nodes) at the beginning and end of the week.
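
The per-plan-node numbers would come from EXPLAIN (ANALYZE, BUFFERS); for a rough per-table view over time, something like this (a sketch using the standard statistics views) could be compared at the beginning and end of the week:

SELECT relname,
       heap_blks_read, heap_blks_hit,
       round(100.0 * heap_blks_hit / NULLIF(heap_blks_hit + heap_blks_read, 0), 2) AS heap_hit_pct,
       toast_blks_read, toast_blks_hit
FROM pg_statio_user_tables
ORDER BY heap_blks_read + coalesce(toast_blks_read, 0) DESC
LIMIT 10;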

Justin



Re: Perplexing, regular decline in performance

From: Hugh Ranalli
Date:
On Wed, 26 Jun 2019 at 15:18, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > On 2019-Jun-26, Hugh Ranalli wrote:
> >> From my research in preparing for the upgrade, I understood transparent
> >> huge pages were a good thing, and should be enabled. Is this not correct?
>
> > It is not.
>
> Yeah ... they would be a good thing perhaps if the quality of the kernel
> implementation were better.  But there are way too many nasty corner
> cases, at least with the kernel versions people around here have
> experimented with.  You're best off to disable THP and instead manually
> arrange for Postgres' shared memory to use huge pages.  I forget where
> to look for docs about doing that, but I think we have some.

We've been dealing with some other production issues, so my apologies for not replying sooner. I'm seeing now that I have confused huge pages with transparent huge pages. We have a maintenance window coming up this weekend, so we'll disable transparent huge pages and configure huge pages manually. I found the docs here: https://www.postgresql.org/docs/11/kernel-resources.html#LINUX-HUGE-PAGES

Thank you very much!   

Re: Perplexing, regular decline in performance

From: Alvaro Herrera
Date:
On 2019-Jun-26, Justin Pryzby wrote:

> > Also, Should pg_buffercache perhaps be run at the beginning and end of the
> > week, to see if there is a significant difference?
> 
> Yes; buffercache can be pretty volatile, so I'd save it numerous times each at
> beginning and end of week.

Be careful with pg_buffercache though, as it can cause a hiccup in
operation.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Perplexing, regular decline in performance

From: Andres Freund
Date:
Hi

On 2019-07-17 13:55:51 -0400, Alvaro Herrera wrote:
> Be careful with pg_buffercache though, as it can cause a hiccup in
> operation.

I think that's been fixed a few years back:

commit 6e654546fb61f62cc982d0c8f62241b3b30e7ef8
Author: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date:   2016-09-29 13:16:30 +0300

    Don't bother to lock bufmgr partitions in pg_buffercache.

    That makes the view a lot less disruptive to use on a production system.
    Without the locks, you don't get a consistent snapshot across all buffers,
    but that's OK. It wasn't a very useful guarantee in practice.

    Ivan Kartyshov, reviewed by Tomas Vondra and Robert Haas.

    Discusssion: <f9d6cab2-73a7-7a84-55a8-07dcb8516ae5@postgrespro.ru>

so everything from 10 onwards ought to be fine.

Greetings,

Andres Freund



Re: Perplexing, regular decline in performance

From: Hugh Ranalli
Date:



On Tue, 25 Jun 2019 at 12:23, Justin Pryzby <pryzby@telsasoft.com> wrote:
> It's possible that the "administrative" queries are using up lots of your
> shared_buffers, which are (also/more) needed by the customer-facing queries.  I
> would install pg_buffercache to investigate.  Or, just pause the admin queries
> and see if the issue goes away during that interval ?
>
> SELECT 1.0*COUNT(1)/sum(count(1)) OVER(), COUNT(1), COUNT(nullif(isdirty,'f')), datname,
>        COALESCE(c.relname, b.relfilenode::text), d.relname TOAST,
>        1.0*COUNT(nullif(isdirty,'f'))/count(1) dirtyfrac, avg(usagecount)
> FROM pg_buffercache b
> JOIN pg_database db ON b.reldatabase=db.oid
> LEFT JOIN pg_class c ON b.relfilenode=pg_relation_filenode(c.oid)
> LEFT JOIN pg_class d ON c.oid=d.reltoastrelid
> GROUP BY 4,5,6 ORDER BY 1 DESC
> LIMIT 9;

I've been going by a couple of articles I found about interpreting pg_buffercache (https://www.keithf4.com/a-large-database-does-not-mean-large-shared_buffers), and so far shared buffers look okay. Our database is 486 GB, with shared buffers set to 32 GB. The article suggests a query that can provide a guideline for what shared buffers should be:

SELECT
    pg_size_pretty(count(*) * 8192) as ideal_shared_buffers
FROM
    pg_class c
INNER JOIN
pg_buffercache b ON b.relfilenode = c.relfilenode
INNER JOIN
    pg_database d ON (b.reldatabase = d.oid AND d.datname = current_database())
WHERE
    usagecount >= 3;

This comes out to 25 GB, and even dropping the usage count to 1 only raises it to 30 GB. I realise this is only a guideline, and I may bump it to 36 GB, to give a bit more space.

I did run some further queries to look at usage (based on the same article), and most of the tables that have very high usage on all the buffered data are 100% buffered, so, if I understand it correctly, there should be little churn there. The others seem to have sufficient less-accessed space to make room for data that they need to buffer:

 
         relname         | buffered | buffers_percent | percent_of_relation
-------------------------+----------+-----------------+---------------------
 position                | 8301 MB  |            25.3 |                99.2
 stat_position_click     | 7359 MB  |            22.5 |                76.5
 url                     | 2309 MB  |             7.0 |               100.0
 pg_toast_19788          | 1954 MB  |             6.0 |                49.3  (harvested_job)
 stat_sponsored_position | 1585 MB  |             4.8 |                92.3
 location                | 927 MB   |             2.8 |                98.7
 pg_toast_20174          | 866 MB   |             2.6 |                 0.3  (page)
 pg_toast_20257          | 678 MB   |             2.1 |                92.9  (position_index)
 harvested_job           | 656 MB   |             2.0 |               100.0
 stat_employer_click     | 605 MB   |             1.8 |               100.0
    
usagecount >= 5
         relname         | pg_size_pretty
-------------------------+----------------
 harvested_job           | 655 MB
 location                | 924 MB
 pg_toast_19788          | 502 MB
 pg_toast_20174          | 215 MB
 pg_toast_20257          | 677 MB
 position                | 8203 MB
 stat_employer_click     | 605 MB
 stat_position_click     | 79 MB
 stat_sponsored_position | 304 kB
 url                     | 2307 MB
 
usagecount >= 3
         relname         | pg_size_pretty
-------------------------+----------------
 harvested_job           | 656 MB
 location                | 927 MB
 pg_toast_19788          | 1809 MB
 pg_toast_20174          | 589 MB
 pg_toast_20257          | 679 MB
 position                | 8258 MB
 stat_employer_click     | 605 MB
 stat_position_click     | 716 MB
 stat_sponsored_position | 2608 kB
 url                     | 2309 MB

usagecount >= 1
         relname         | pg_size_pretty
-------------------------+----------------
 harvested_job           | 656 MB
 location                | 928 MB
 pg_toast_19788          | 3439 MB
 pg_toast_20174          | 842 MB
 pg_toast_20257          | 680 MB
 position                | 8344 MB
 stat_employer_click     | 605 MB
 stat_position_click     | 4557 MB
 stat_sponsored_position | 86 MB
 url                     | 2309 MB
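
For reference, the per-relation summary above can be produced with a query roughly like this (modeled on the article's approach rather than copied from it; assumes the default 8 kB block size):

SELECT c.relname,
       pg_size_pretty(count(*) * 8192) AS buffered,
       round(100.0 * count(*) /
             (SELECT setting::int FROM pg_settings WHERE name = 'shared_buffers'), 1) AS buffers_percent,
       round(100.0 * count(*) * 8192 / pg_table_size(c.oid), 1) AS percent_of_relation
FROM pg_class c
JOIN pg_buffercache b ON b.relfilenode = c.relfilenode
JOIN pg_database d ON b.reldatabase = d.oid AND d.datname = current_database()
GROUP BY c.oid, c.relname
ORDER BY count(*) DESC
LIMIT 10;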

If I'm misreading this, please let me know. I know people also asked about query plans and schema, which I'm going to look at next; I've just been knocking off one thing at a time.

Thanks,
Hugh

Re: Perplexing, regular decline in performance

From: Andres Freund
Date:
Hi,

On 2019-07-18 16:01:46 -0400, Hugh Ranalli wrote:
> I've been going by a couple of articles I found about interpreting
> pg_buffercache (
> https://www.keithf4.com/a-large-database-does-not-mean-large-shared_buffers),
> and so far shared buffers look okay. Our database is 486 GB, with shared
> buffers set to 32 GB. The article suggests a query that can provide a
> guideline for what shared buffers should be:
> 
> SELECT
>     pg_size_pretty(count(*) * 8192) as ideal_shared_buffers
> FROM
>     pg_class c
> INNER JOIN
> pg_buffercache b ON b.relfilenode = c.relfilenode
> INNER JOIN
>     pg_database d ON (b.reldatabase = d.oid AND d.datname =
> current_database())
> WHERE
>     usagecount >= 3;

IMO that's not a meaningful way to determine the ideal size of shared
buffers. Except for the case where shared buffers is bigger than the
entire working set (not just the hot working set), it's going to give
you completely bogus results.

Pretty much by definition it cannot give you a shared buffers size
bigger than what it's currently set to, given that it starts with the
number of shared buffers.

And there are plenty of scenarios where you'll commonly see many frequently
(but not most frequently) used buffers with a usagecount < 3 or even =
0. If you e.g. have a shared_buffers size that's just a few megabytes
too small, you'll need to throw some buffers out of shared buffers -
that means the buffer replacement search will go through all shared
buffers and decrement the usagecount by one, until it finds a buffer
with a count of 0 (before it has decremented the count). Which means
it's extremely likely that there are moments where a substantial number of
frequently used buffers have a lowered usagecount (perhaps even 0).

Therefore, the above query will commonly give you a lower number than
shared buffers, if your working set size is *bigger* than shared memory.


I think you can assume that shared buffers is too big if a substantial
portion of buffers have relfilenode IS NULL (i.e. are unused); at
least if you don't continually also DROP/TRUNCATE relations.

If there's a large fluctuation in which parts of the buffer cache have a
high usagecount, then that's a good indication that new buffers are
needed very frequently (because that lowers a good portion of buffers to
usagecount 0).

I've had decent success in the past getting insights with a query like:

SELECT
    ceil(bufferid/(nr_buffers/subdivisions::float))::int AS part,
    to_char(SUM((relfilenode IS NOT NULL)::int) / count(*)::float * 100, '999D99')  AS pct_used,
    to_char(AVG(usagecount), '9D9') AS avg_usagecount,
    to_char(SUM((usagecount=0)::int) / SUM((relfilenode IS NOT NULL)::int)::float8 * 100, '999D99') AS pct_0
FROM
    pg_buffercache,
    (SELECT 10) AS x(subdivisions),
    (SELECT setting::int nr_buffers FROM pg_settings WHERE name = 'shared_buffers') s
GROUP BY 1 ORDER BY 1;

which basically subdivides pg_buffercache's output into 10 parts (or as
many as fit comfortably on one screen / terminal).

Here's e.g. the output of a benchmark (pgbench) running against a
database that's considerably larger than shared buffers (15GB database,
1.5GB shared_buffers):

┌──────┬──────────┬────────────────┬─────────┐
│ part │ pct_used │ avg_usagecount │  pct_0  │
├──────┼──────────┼────────────────┼─────────┤
│    1 │  100.00  │  1.0           │   42.75 │
│    2 │  100.00  │   .6           │   47.85 │
│    3 │  100.00  │   .6           │   47.25 │
│    4 │  100.00  │   .6           │   47.52 │
│    5 │  100.00  │   .6           │   47.18 │
│    6 │  100.00  │   .5           │   48.47 │
│    7 │  100.00  │   .5           │   49.00 │
│    8 │  100.00  │   .5           │   48.52 │
│    9 │  100.00  │   .5           │   49.27 │
│   10 │  100.00  │   .5           │   49.58 │
│   11 │   99.98  │   .6           │   46.88 │
│   12 │  100.00  │   .6           │   45.23 │
│   13 │  100.00  │   .6           │   45.03 │
│   14 │  100.00  │   .6           │   44.90 │
│   15 │  100.00  │   .6           │   46.08 │
│   16 │  100.00  │   .6           │   44.84 │
│   17 │  100.00  │   .6           │   45.88 │
│   18 │  100.00  │   .6           │   46.46 │
│   19 │  100.00  │   .6           │   46.64 │
│   20 │  100.00  │   .6           │   47.05 │
└──────┴──────────┴────────────────┴─────────┘

As you can see, usagecounts are pretty low overall. That's because the
buffer replacement rate is so high that the usagecount is very
frequently reduced to 0 (to get new buffers).

You can infer from that that unless you add a lot of shared buffers,
you're not likely to make a huge difference (but if you set it to
16GB, it'd obviously look much better).


In contrast to that, here's pgbench running on a smaller database that
nearly fits into shared buffers (2GB DB, 1.5GB shared_buffers):

┌──────┬──────────┬────────────────┬─────────┐
│ part │ pct_used │ avg_usagecount │  pct_0  │
├──────┼──────────┼────────────────┼─────────┤
│    1 │  100.00  │  3.9           │    1.45 │
│    2 │  100.00  │  3.8           │    1.34 │
│    3 │  100.00  │  3.8           │    1.69 │
│    4 │  100.00  │  3.7           │    1.96 │
│    5 │  100.00  │  3.7           │    2.01 │
│    6 │  100.00  │  3.6           │    2.23 │
│    7 │  100.00  │  3.5           │    2.60 │
│    8 │  100.00  │  3.5           │    2.27 │
│    9 │  100.00  │  3.4           │    2.82 │
│   10 │  100.00  │  3.3           │    2.92 │
│   11 │  100.00  │  3.2           │    3.43 │
│   12 │  100.00  │  3.1           │    3.41 │
│   13 │  100.00  │  3.7           │    1.91 │
│   14 │  100.00  │  4.0           │    1.09 │
│   15 │  100.00  │  3.9           │    1.39 │
│   16 │  100.00  │  4.0           │    1.22 │
│   17 │  100.00  │  4.1           │    1.16 │
│   18 │  100.00  │  4.0           │    1.19 │
│   19 │  100.00  │  4.0           │    1.29 │
│   20 │  100.00  │  4.0           │    1.42 │
└──────┴──────────┴────────────────┴─────────┘

As you can see, there are many fewer buffers that have a usagecount of 0 -
that's because the buffer replacement rate is much lower (as most
buffers are in shared buffers), and thus the usagecount has time to
"increase" regularly.

Here you can guess that even just increasing shared buffers slightly
would increase the cache hit ratio substantially. E.g. the same
workload, but with shared_buffers increased to 1.6GB:
┌──────┬──────────┬────────────────┬─────────┐
│ part │ pct_used │ avg_usagecount │  pct_0  │
├──────┼──────────┼────────────────┼─────────┤
│    1 │  100.00  │  5.0           │     .00 │
│    2 │  100.00  │  5.0           │     .00 │
│    3 │  100.00  │  5.0           │     .00 │
│    4 │  100.00  │  5.0           │     .00 │
│    5 │  100.00  │  5.0           │     .00 │
│    6 │  100.00  │  5.0           │     .00 │
│    7 │  100.00  │  5.0           │     .00 │
│    8 │  100.00  │  5.0           │     .00 │
│    9 │  100.00  │  5.0           │     .00 │
│   10 │  100.00  │  5.0           │     .00 │
│   11 │  100.00  │  5.0           │     .00 │
│   12 │  100.00  │  5.0           │     .00 │
│   13 │  100.00  │  5.0           │     .00 │
│   14 │  100.00  │  5.0           │     .00 │
│   15 │  100.00  │  5.0           │     .00 │
│   16 │  100.00  │  5.0           │     .00 │
│   17 │  100.00  │  5.0           │     .00 │
│   18 │  100.00  │  5.0           │     .00 │
│   19 │   93.27  │  5.0           │     .00 │
│   20 │     .00  │ (null)         │ (null)  │
└──────┴──────────┴────────────────┴─────────┘


Now, in reality things are rarely quite this neat - pgbench has a
uniform access pattern, which isn't that common in the real world.


I also suggest to monitor how the buffer hit ratio develops over
time. E.g. by doing a query like

SELECT datname, (blks_hit - blks_read)::float/NULLIF(blks_hit, 0)::float FROM pg_stat_database;

although that's not perfect, because it gives you the ratio since the
last time the stats were reset, making it hard to see more recent
changes.  So you either need to reset the stats, or compute the
difference from what the values were when you started observing.

E.g.

DROP TABLE IF EXISTS pg_stat_database_snap;
CREATE TEMPORARY TABLE pg_stat_database_snap AS
    SELECT * FROM pg_stat_database;

SELECT datname,
    (blks_hit - blks_read)::float/NULLIF(blks_hit, 0)::float
FROM (
    SELECT datname,
        pd.blks_read - ps.blks_read AS blks_read,
        pd.blks_hit - ps.blks_hit AS blks_hit
    FROM pg_stat_database pd JOIN pg_stat_database_snap ps USING (datname) ) pd_diff;


Greetings,

Andres Freund