Discussion: Autoanalyze CPU usage


Autoanalyze CPU usage

From: Habib Nahas
Date:
Hi,

We operate an RDS postgres 9.5 instance and have periodic CPU spikes to 100%. These spikes appear to be due to autoanalyze kicking in on our larger tables.

Our largest table has 75 million rows and the autoanalyze scale factor is set to 0.05. 

The documentation I've read suggests that analyze always operates on the entire table and is not incremental. Given that supposition, are there ways to control the cost (especially CPU) of the autoanalyze operation? Would a more aggressive autoanalyze scale factor (0.01) help? With the current scale factor we see an autoanalyze about once a week, and query performance has been acceptable so far, which could imply that the scale factor could be increased if necessary.
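For reference, the point at which autoanalyze fires can be estimated directly from these numbers; a rough sketch, assuming the default autovacuum_analyze_threshold of 50:

    -- rows modified before autoanalyze triggers:
    -- autovacuum_analyze_threshold + autovacuum_analyze_scale_factor * reltuples
    SELECT 50 + 0.05 * 75000000 AS rows_before_autoanalyze;  -- about 3.75 million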

Thanks,
Habib Nahas



Re: Autoanalyze CPU usage

From: Justin Pryzby
Date:
On Tue, Dec 19, 2017 at 08:47:52AM -0800, Habib Nahas wrote:
> We operate an RDS postgres 9.5 instance and have periodic CPU spikes to
> 100%. These spikes appear to be due to autoanalyze kicking in on our larger
> tables.

Not sure if it'll help you, but for our large, insert-only tables partitioned
by time, I made several changes from default:
 
 - near the end of the month, report queries for the previous day's data had poor
   statistics: the autoanalyze scale factor defaults to 0.1, so a table
   analyzed on the 24th of the month won't be analyzed again until the 26th,
   and in the meantime the histogram shows zero rows for the previous day,
   causing nested loop joins over thousands of rows.
 - for separate reasons, I increased the statistics target on our key columns (up
   to 3000 for one column); see the sketch just after this list.
 - a large stats target on large tables caused (auto)analyze to use a large amount
   of RAM.  Therefore I changed our largest tables from monthly partition
   granularity (YYYYMM) to daily (YYYYMMDD).  That creates what's
   traditionally considered to be an excessive number of partitions (and very
   large pg_attribute/pg_attrdef and pg_statistic tables), but avoids the huge RAM
   issue, and works for our purposes (and I hope the traditional advice on the
   number of child tables is relaxed in upcoming versions, too).
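A sketch of those per-table changes (the table and column names here are hypothetical):

    -- raise the statistics target for one key column:
    ALTER TABLE stats_20171219 ALTER COLUMN key_col SET STATISTICS 3000;
    -- the autoanalyze scale factor can also be lowered per-table:
    ALTER TABLE stats_20171219 SET (autovacuum_analyze_scale_factor = 0.005);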

One possibility is a cronjob to set the default "scale factor" to a modest value
(0.1) during business hours and an aggressive value (0.005) off-hours.
You could do similar with autovacuum_max_workers ... but beware if they're
causing high RAM use.  I believe autovacuum workers try to "play nice", and the
I/O cost budget is shared between all workers.  But I suspect that's not true for CPU
cost or RAM use, so there's nothing stopping you from having 9 workers each
using 2+GB of RAM and 100% CPU doing MCV/histogram computation.
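For example, the two cron-driven jobs could run something like this (a sketch; note that ALTER SYSTEM is not available on RDS, where the equivalent change is made in the DB parameter group):

    -- evening job: make autoanalyze aggressive off-hours
    ALTER SYSTEM SET autovacuum_analyze_scale_factor = 0.005;
    SELECT pg_reload_conf();
    -- morning job: back to a modest value for business hours
    ALTER SYSTEM SET autovacuum_analyze_scale_factor = 0.1;
    SELECT pg_reload_conf();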

Maybe that's of some use.

Justin


Re: Autoanalyze CPU usage

From: Tomas Vondra
Date:

On 12/19/2017 05:47 PM, Habib Nahas wrote:
> Hi,
> 
> We operate an RDS postgres 9.5 instance and have periodic CPU spikes to
> 100%. These spikes appear to be due to autoanalyze kicking in on our larger
> tables.
> 
> Our largest table has 75 million rows and the autoanalyze scale factor
> is set to 0.05. 
> 
> The documentation I've read suggests that analyze always operates on
> the entire table and is not incremental. Given that supposition, are
> there ways to control the cost (especially CPU) of the autoanalyze operation?
> Would a more aggressive autoanalyze scale factor (0.01) help? With the
> current scale factor we see an autoanalyze about once a week, and query
> performance has been acceptable so far, which could imply that the scale
> factor could be increased if necessary.
> 

No, reducing the scale factor to 0.01 will not help at all, it will
actually make the issue worse. The only thing autoanalyze does is
running ANALYZE, which *always* collects a fixed-size sample. Making it
more frequent will not reduce the amount of work done on each run.

So the first question is whether you are really not using the default (0.1),
i.e. whether you have actually reduced it to 0.05.

The other question is why it's so CPU-intensive. Are you using the
default statistics_target value (100), or have you increased that too?
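For background, ANALYZE samples on the order of 300 * statistics_target rows, so the target drives both the sample size and the per-column sort/MCV work. Both settings are easy to check from psql; a sketch, using 'big_table' as a placeholder name:

    SHOW default_statistics_target;
    -- per-table autovacuum/analyze overrides, if any:
    SELECT relname, reloptions FROM pg_class WHERE relname = 'big_table';
    -- per-column statistics-target overrides (-1 means "use the default"):
    SELECT attname, attstattarget FROM pg_attribute
    WHERE attrelid = 'big_table'::regclass AND attstattarget <> -1;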

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Autoanalyze CPU usage

From: Habib Nahas
Date:
The autoanalyze scale factor is set to 0.05 for the db, and we have not changed the default statistics target.

The CPU spike occurred between 13:05 - 13:15. last_autoanalyze for the table shows a time of 12:49; last_autovacuum does not show any activity around this time for any table. Checkpoint logs are also normal around this time. I'd like to understand if there are any other sources of activity I should be checking for that would account for the spike. 
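(Those timestamps come from pg_stat_user_tables; for reference, a sketch of the check:)

    SELECT relname, last_autoanalyze, last_autovacuum
    FROM pg_stat_user_tables
    ORDER BY last_autoanalyze DESC NULLS LAST;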

User workload is throttled to avoid excess load on the db, so a query is unlikely to have caused the spike. But we can dig deeper if other causes are ruled out. 

Thanks

On Tue, Dec 19, 2017 at 2:03 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:


On 12/19/2017 05:47 PM, Habib Nahas wrote:
> Hi,
>
> We operate an RDS postgres 9.5 instance and have periodic CPU spikes to
> 100%. These spikes appear to be due to autoanalyze kicking in on our larger
> tables.
>
> Our largest table has 75 million rows and the autoanalyze scale factor
> is set to 0.05. 
>
> The documentation I've read suggests that analyze always operates on
> the entire table and is not incremental. Given that supposition, are
> there ways to control the cost (especially CPU) of the autoanalyze operation?
> Would a more aggressive autoanalyze scale factor (0.01) help? With the
> current scale factor we see an autoanalyze about once a week, and query
> performance has been acceptable so far, which could imply that the scale
> factor could be increased if necessary.
>

No, reducing the scale factor to 0.01 will not help at all, it will
actually make the issue worse. The only thing autoanalyze does is
running ANALYZE, which *always* collects a fixed-size sample. Making it
more frequent will not reduce the amount of work done on each run.

So the first question is whether you are really not using the default (0.1),
i.e. whether you have actually reduced it to 0.05.

The other question is why it's so CPU-intensive. Are you using the
default statistics_target value (100), or have you increased that too?

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Autoanalyze CPU usage

From: Habib Nahas
Date:
As it happens, our larger tables operate as a business log and are also insert only.

- There is no partitioning at this time since we expect to have an automated process to delete rows older than a certain date. 
- Analyzing during off-hours sounds like a good idea; if there is no other way to determine the effect on the db, we may end up doing that.
- We have an open schema and heavily depend on jsonb, so I'm not sure if increasing the statistics target will be helpful.

Thanks

On Tue, Dec 19, 2017 at 2:03 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:


On 12/19/2017 05:47 PM, Habib Nahas wrote:
> Hi,
>
> We operate an RDS postgres 9.5 instance and have periodic CPU spikes to
> 100%. These spikes appear to be due to autoanalyze kicking in on our larger
> tables.
>
> Our largest table has 75 million rows and the autoanalyze scale factor
> is set to 0.05. 
>
> The documentation I've read suggests that analyze always operates on
> the entire table and is not incremental. Given that supposition, are
> there ways to control the cost (especially CPU) of the autoanalyze operation?
> Would a more aggressive autoanalyze scale factor (0.01) help? With the
> current scale factor we see an autoanalyze about once a week, and query
> performance has been acceptable so far, which could imply that the scale
> factor could be increased if necessary.
>

No, reducing the scale factor to 0.01 will not help at all, it will
actually make the issue worse. The only thing autoanalyze does is
running ANALYZE, which *always* collects a fixed-size sample. Making it
more frequent will not reduce the amount of work done on each run.

So the first question is whether you are really not using the default (0.1),
i.e. whether you have actually reduced it to 0.05.

The other question is why it's so CPU-intensive. Are you using the
default statistics_target value (100), or have you increased that too?

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Autoanalyze CPU usage

From: Justin Pryzby
Date:
On Tue, Dec 19, 2017 at 02:37:18PM -0800, Habib Nahas wrote:
> As it happens, our larger tables operate as a business log and are also
> insert only.
> 
> - There is no partitioning at this time since we expect to have an
> automated process to delete rows older than a certain date.

This is a primary use case for partitioning: bulk DROP rather than DELETE.
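A sketch, assuming daily child tables named by date:

    -- retention by partition: drop the oldest child instead of DELETEing rows
    DROP TABLE stats_20170101;  -- hypothetical partition name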

> - Analyzing doing off-hours sounds like a good idea; if there is no other
> way to determine effect on db we may end up doing that.

You can also implement a manual analyze job and hope to avoid autoanalyze.
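A minimal sketch of such a job, scheduled off-hours (the table name is hypothetical); a manual ANALYZE resets the changes-since-analyze counter, so autoanalyze should then leave the table alone:

    -- run nightly from cron or another scheduler:
    ANALYZE big_table;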

> - We have an open schema and heavily depend on jsonb, so I'm not sure if
> increasing the statistics target will be helpful.

If the increased stats target isn't useful for that, I would recommend
decreasing it.
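For example (placeholder names; -1 means "fall back to default_statistics_target"):

    ALTER TABLE big_table ALTER COLUMN jsonb_col SET STATISTICS -1;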

-- 
Justin Pryzby
System Administrator
Telsasoft
+1-952-707-8581


Re: Autoanalyze CPU usage

From: "Michaeldba@sqlexec.com"
Date:
Perhaps consider running a manual vacuum analyze daily at low-load times, if you have that opportunity. This may stop
autovacuums from hitting thresholds during high-load times. Or do the normal autovacuum tuning to make it more
aggressive during low-load times and less aggressive during high-load times.
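A sketch of the throttling knobs involved (values are illustrative, not recommendations; the cost-based delay paces page accesses, so it only indirectly limits CPU):

    -- gentler background workers during high-load hours:
    ALTER SYSTEM SET autovacuum_vacuum_cost_delay = '50ms';
    ALTER SYSTEM SET autovacuum_vacuum_cost_limit = 100;
    SELECT pg_reload_conf();
    -- more aggressive off-hours:
    ALTER SYSTEM SET autovacuum_vacuum_cost_delay = '2ms';
    ALTER SYSTEM SET autovacuum_vacuum_cost_limit = 1000;
    SELECT pg_reload_conf();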

Sent from my iPad

> On Dec 19, 2017, at 5:03 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>
>
>
>> On 12/19/2017 05:47 PM, Habib Nahas wrote:
>> Hi,
>>
>> We operate an RDS postgres 9.5 instance and have periodic CPU spikes to
>> 100%. These spikes appear to be due to autoanalyze kicking in on our larger
>> tables.
>>
>> Our largest table has 75 million rows and the autoanalyze scale factor
>> is set to 0.05.
>>
>> The documentation I've read suggests that analyze always operates on
>> the entire table and is not incremental. Given that supposition, are
>> there ways to control the cost (especially CPU) of the autoanalyze operation?
>> Would a more aggressive autoanalyze scale factor (0.01) help? With the
>> current scale factor we see an autoanalyze about once a week, and query
>> performance has been acceptable so far, which could imply that the scale
>> factor could be increased if necessary.
>>
>
> No, reducing the scale factor to 0.01 will not help at all, it will
> actually make the issue worse. The only thing autoanalyze does is
> running ANALYZE, which *always* collects a fixed-size sample. Making it
> more frequent will not reduce the amount of work done on each run.
>
> So the first question is whether you are really not using the default (0.1),
> i.e. whether you have actually reduced it to 0.05.
>
> The other question is why it's so CPU-intensive. Are you using the
> default statistics_target value (100), or have you increased that too?
>
> regards
>
> --
> Tomas Vondra                  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>



Re: Autoanalyze CPU usage

From: Laurenz Albe
Date:
Habib Nahas wrote:
> The CPU spike occurred between 13:05 - 13:15. last_autoanalyze for the table
> shows a time of 12:49; last_autovacuum does not show any activity around
> this time for any table. Checkpoint logs are also normal around this time.
> I'd like to understand if there are any other sources of activity I
> should be checking for that would account for the spike.

last_autoanalyze is set after autoanalyze is done, so that would suggest
that autoanalyze is not the problem.

It can be tough to figure out where the activity is coming from unless
you can catch it in the act.  You could log all statements (though the volume
of logs may be prohibitive and can cripple performance), you could log
just long-running statements in the hope that these are at fault, or you
could log connections and disconnections and hope to find the problem
that way.  Maybe logging in your applications can help too.
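A sketch of those logging settings (the thresholds are illustrative):

    -- log statements that run longer than 1 second:
    ALTER SYSTEM SET log_min_duration_statement = '1s';
    -- log session starts and ends:
    ALTER SYSTEM SET log_connections = on;
    ALTER SYSTEM SET log_disconnections = on;
    SELECT pg_reload_conf();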

Yours,
Laurenz Albe


Re: Autoanalyze CPU usage

From: Nikolay Samokhvalov
Date:
On Tue, Dec 19, 2017 at 7:47 PM, Habib Nahas <habibnahas@gmail.com> wrote:
Hi,

We operate an RDS postgres 9.5 instance and have periodic CPU spikes to 100%. These spikes appear to be due to autoanalyze kicking in on our larger tables.

How did you draw that conclusion? How did you determine that autoanalyze is the cause of the CPU spikes?

Re: Autoanalyze CPU usage

From: Habib Nahas
Date:
Thanks for confirming that it is the end timestamp; the doc wasn't quite clear on whether it was the start or the end.
 
There is a gap in our monitoring that makes diagnosis of such events very difficult after the fact. Something like a 10-sec periodic dump of pg_stat_activity along with a similar dump of pg_top would have been very helpful here. 
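Something close to that is possible from psql alone; a rough sketch (on 9.5, pg_stat_activity has a boolean `waiting` column rather than wait-event columns):

    -- \o redirects query output to a file; \watch re-runs the buffer every 10s
    \o activity_samples.log
    SELECT now(), pid, state, waiting, left(query, 100) AS query
    FROM pg_stat_activity
    WHERE state <> 'idle'
    \watch 10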

-Habib



On Tue, Dec 19, 2017 at 11:15 PM, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
Habib Nahas wrote:
> The CPU spike occurred between 13:05 - 13:15. last_autoanalyze for the table
> shows a time of 12:49; last_autovacuum does not show any activity around
> this time for any table. Checkpoint logs are also normal around this time.
> I'd like to understand if there are any other sources of activity I
> should be checking for that would account for the spike.

last_autoanalyze is set after autoanalyze is done, so that would suggest
that autoanalyze is not the problem.

It can be tough to figure out where the activity is coming from unless
you can catch it in the act.  You could log all statements (though the volume
of logs may be prohibitive and can cripple performance), you could log
just long-running statements in the hope that these are at fault, or you
could log connections and disconnections and hope to find the problem
that way.  Maybe logging in your applications can help too.

Yours,
Laurenz Albe