Thread: AUTOVACUUM after purging a lot of records


AUTOVACUUM after purging a lot of records

From:
Jeni Fifrick
Date:

Dear Postgres gurus,

 

Yesterday, we ran a purging process to delete a large number of records from one table (transaction_log).

The purging process ran several times between 3 PM and 5 PM, and another run is scheduled at 2 AM.

The number of records dropped from 63,350,117 to 6,854,768.

The “autovacuum: VACUUM ANALYZE” started automatically at 16:15.

While the autovacuum was running, several more purging runs took place.

The autovacuum process is still running now (so it has been running for about 23 hours).

This is from pg_stat_activity:

datid            | 16390
datname          | tms
procpid          | 22135
usesysid         | 10
usename          | postgres
application_name |
client_addr      |
client_hostname  |
client_port      |
backend_start    | 2013-12-12 16:15:09.413137-05
xact_start       | 2013-12-12 16:15:09.435742-05
query_start      | 2013-12-12 16:15:09.435742-05
waiting          | f
current_query    | autovacuum: VACUUM ANALYZE public.transaction_log

 

 

My questions:

1.       Is it normal for vacuum to run that long (23 hours)? Is there a way to verify whether the vacuum process is running properly and making any progress?
If NOT, what should I do?

2.       Can I increase the maintenance_work_mem parameter to make the existing process faster? Will it affect the existing process?
(I believe this parameter can be changed with just a reload?)

3.       Do you think it’s better to do VACUUM FULL, considering the number of records deleted?

4.       Any other input/recommendation you can give us?

We’re using Vpostgres version 9.1.3 on CentOS 6.3, with 12 cores and 64 GB of memory.

But we’re still using the default setup.

Thank you in advance for your help. Please let me know if you need any other information.

 

Thank you,

Jeni

I’m new to Postgres..  :)

 

Re: AUTOVACUUM after purging a lot of records

From:
Kevin Grittner
Date:
Jeni Fifrick <jfifrick@incomm.com> wrote:

> Is it normal for vacuum to run that long (23 hours)?

It can be.  The autovacuum workers are "paced" so that their
resource usage is limited.  This spreads out the work, to minimize
impact on foreground processes.  In some cases the default
configuration is not aggressive enough to keep up overall, but that
should be what you look at, not whether a particular worker takes
days on a particular table -- that in itself doesn't indicate a
problem.

> Is there a way to verify whether the vacuum process is running
> properly and making any progress?

Use OS tools (like top or ps) to check whether the process is still
consuming resources.
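
As a rough in-database check (just a sketch, using the 9.1 column
names from the pg_stat_activity output you posted), you can also
watch how long the worker's transaction has been open and whether
it is blocked on a lock:

SELECT procpid, waiting, now() - xact_start AS running_for,
       current_query
FROM pg_stat_activity
WHERE current_query LIKE 'autovacuum:%';

If waiting shows t, the worker is stuck behind a lock rather than
making progress.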

> Can I increase the maintenance_work_mem parameter to make the
> existing process faster?

That is one of many settings which are often tuned.  If you really
want the vacuum of that table to finish quickly, you could just
start a VACUUM ANALYZE command for that table.  The autovacuum run
will cancel itself to make room for it, and the manual command
defaults to running at full speed rather than pacing its work.  You
can set maintenance_work_mem or cost-based pacing in the session
before running the command, to either speed it up or throttle it.
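
For example (just a sketch -- the table name is taken from your
pg_stat_activity output, and the values are only illustrative):

SET maintenance_work_mem = '1GB';
SET vacuum_cost_delay = 0;  -- 0 is the default; a manual VACUUM is unpaced unless you raise this
VACUUM ANALYZE public.transaction_log;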

> Do you think it’s better to do VACUUM FULL, considering the
> number of records deleted?

That depends on how much data is left and whether you can afford to
have the table exclusively locked while its data is copied and
re-indexed.
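
To get a sense of how much space is at stake before deciding
(a quick check; the table name again comes from your output):

SELECT pg_size_pretty(pg_table_size('public.transaction_log')) AS table_size,
       pg_size_pretty(pg_total_relation_size('public.transaction_log')) AS with_indexes;

Keep in mind that plain VACUUM only makes the dead space reusable
within the table; VACUUM FULL is what actually returns it to the
operating system.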

> Any other input/recommendation you can give us?

If you expect to purge sets of data like that, and you know what
purge group a row belongs in at the time it is inserted, you should
probably look at partitioning.  There's no faster way to drop
millions of rows from a table than to drop the partition they are
in.
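
On 9.1 that means inheritance-based partitioning; a minimal sketch
(the child table name and the log_date column are made up for
illustration):

CREATE TABLE transaction_log_2013_12 (
    CHECK (log_date >= DATE '2013-12-01' AND log_date < DATE '2014-01-01')
) INHERITS (transaction_log);

-- purging that month later is just:
DROP TABLE transaction_log_2013_12;

You also need an insert trigger (or application logic) to route new
rows into the right child table, and constraint exclusion enabled
so queries skip children that cannot match.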

> We’re using Vpostgres version 9.1.3

If you want to stay with 9.1 for now, you should upgrade to the
latest minor release.

http://www.postgresql.org/support/versioning/

> on CentOS 6.3, with 12 cores and 64 GB of memory.
> But we’re still using the default setup.

With a machine that size, you definitely should tune several
settings.  The defaults are geared more toward being able to
"install and go" on a low-end laptop.  Serious servers need tuning.

If you really are at the default configuration (including
max_connections), the following would almost certainly be
improvements:

shared_buffers = 8GB
effective_cache_size = 32GB
work_mem = 160MB
maintenance_work_mem = 2GB
wal_buffers = 32MB
checkpoint_completion_target = 0.9
checkpoint_segments = 64
autovacuum_vacuum_cost_limit = 800
cpu_tuple_cost = 0.03

Depending on the size of your active data set, you may also want to
reduce random_page_cost; but you didn't provide enough information
to know that.  That's hardly a complete list, but it should help
with performance on most servers and workloads with the RAM and
cores you describe.
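
To see where each of those stands now, and whether changing it
needs a full restart or just a reload (a quick check, listing only
the settings mentioned above):

SELECT name, setting, unit, context
FROM pg_settings
WHERE name IN ('shared_buffers', 'effective_cache_size', 'work_mem',
               'maintenance_work_mem', 'wal_buffers',
               'checkpoint_completion_target', 'checkpoint_segments',
               'autovacuum_vacuum_cost_limit', 'cpu_tuple_cost',
               'max_connections');

Settings whose context is 'postmaster' (shared_buffers, wal_buffers,
max_connections) require a restart; the rest of that list takes
effect on a reload or even per session.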

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: AUTOVACUUM after purging a lot of records

From:
Jeni Fifrick
Date:

Hi Kevin,

Thank you very much for the input.

 

·         I think I'll let autovacuum run and check it tomorrow.

So, what you're saying is that if I run "VACUUM ANALYZE" while the autovacuum is still running, the autovacuum will be cancelled, right? And I need to set maintenance_work_mem in the session before executing 'VACUUM ANALYZE transaction_log'. Is that the correct syntax?

 

Regarding the configuration, all the memory-related parameters are still at their default values.

We did change max_connections to 2000. BUT, so far, our actual maximum connection count is around 500.

 

We’re planning to do an upgrade early next year. Since this is the holiday season (and our business peak), management is very careful about making any changes, EXCEPT those that are absolutely necessary (or break-fix).

 

Thanks again,

Jeni

 


Re: AUTOVACUUM after purging a lot of records

From:
Kevin Grittner
Date:
Jeni Fifrick <jfifrick@incomm.com> wrote:

> I think I'll let autovacuum run and check it tomorrow.

Seems reasonable.

> So, what you're saying is that if I run "VACUUM ANALYZE" while
> the autovacuum is still running, the autovacuum will be
> cancelled, right?

Yes.

> And I need to set maintenance_work_mem in the session before
> executing 'VACUUM ANALYZE transaction_log'. Is that the correct
> syntax?

You might want to connect with psql and issue these statements:

SET maintenance_work_mem = '2GB';
VACUUM ANALYZE transaction_log;

> Regarding the configuration, all the memory-related parameters
> are still at their default values.

> We did change max_connections to 2000. BUT, so far, our actual
> maximum connection count is around 500.

Well, work_mem is the limit on memory used by each node (processing
step) of every query that is running.  A good rule of thumb is to
assume one such allocation per allowed connection, and keep that to
25% of machine RAM.  So 64GB * 0.25 / 2000 would be 8MB.  If you
could drop max connections, you could boost work_mem proportionally
without risking blowing out all of your OS cache (or even
swapping).
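
(For comparison: at the roughly 500 connections you actually see,
the same rule of thumb gives 64GB * 0.25 / 500, or about 32MB per
allocation.)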

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company