Обсуждение: autovacuum blocks the operations of other manual vacuum

Поиск
Список
Период
Сортировка

autovacuum blocks the operations of other manual vacuum

От
kuopo
Дата:
Hi,

I have a question about the behavior of autovacuum. When I have a big
table A which is being processed by autovacuum, I also manually use
(full) vacuum to clean another table B. Then I found that I always got
something like “found 0 removable, 14283 nonremovable row”. However,
if I stop the autovacuum functionality and use vacuum on that big
table A manually, I can clean table B (ex. found 22615 removable, 2049
nonremovable row).

Is this correct? Why do vacuum and autovacuum have different actions?

Ps. My postgreSQL is 8.4.

Re: autovacuum blocks the operations of other manual vacuum

От
Alvaro Herrera
Дата:
Excerpts from kuopo's message of vie nov 12 05:01:24 -0300 2010:
> Hi,
>
> I have a question about the behavior of autovacuum. When I have a big
> table A which is being processed by autovacuum, I also manually use
> (full) vacuum to clean another table B. Then I found that I always got
> something like “found 0 removable, 14283 nonremovable row”. However,
> if I stop the autovacuum functionality and use vacuum on that big
> table A manually, I can clean table B (ex. found 22615 removable, 2049
> nonremovable row).
>
> Is this correct? Why do vacuum and autovacuum have different actions?

Vacuum full does not assume that it can clean up tuples while other
transactions are running, and that includes the (non full, or "lazy")
vacuum that autovacuum is running.  Autovacuum only runs lazy vacuum;
and that one is aware that other concurrent vacuums can be ignored.

Just don't use vacuum full unless strictly necessary.  It has other
drawbacks.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: autovacuum blocks the operations of other manual vacuum

От
kuopo
Дата:
Hi,

Thanks for your response. I've checked it again and found that the
main cause is the execution of ANALYZE. As I have mentioned, I have
two tables: table A is a big one (around 10M~100M records) for log
data and table B is a small one (around 1k records) for keeping some
current status. There are a lot of update operations and some search
operations on the table B. For the performance issue, I would like to
keep table B as compact as possible. According your suggestion, I try
to invoke standard vacuum (not full) more frequently (e.g., once per
min).

However, when I analyze the table A, the autovacuum or vacuum on the
table B cannot find any removable row version (the number of
nonremoveable row versions and pages keeps increasing). After the
analysis finishes, the search operations on the table B is still
inefficient. If I call full vacuum right now, then I can have quick
response time of the search operations on the table B again.

Any suggestions for this situation?


On Tue, Nov 16, 2010 at 11:26 PM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:
> Excerpts from kuopo's message of vie nov 12 05:01:24 -0300 2010:
>> Hi,
>>
>> I have a question about the behavior of autovacuum. When I have a big
>> table A which is being processed by autovacuum, I also manually use
>> (full) vacuum to clean another table B. Then I found that I always got
>> something like “found 0 removable, 14283 nonremovable row”. However,
>> if I stop the autovacuum functionality and use vacuum on that big
>> table A manually, I can clean table B (ex. found 22615 removable, 2049
>> nonremovable row).
>>
>> Is this correct? Why do vacuum and autovacuum have different actions?
>
> Vacuum full does not assume that it can clean up tuples while other
> transactions are running, and that includes the (non full, or "lazy")
> vacuum that autovacuum is running.  Autovacuum only runs lazy vacuum;
> and that one is aware that other concurrent vacuums can be ignored.
>
> Just don't use vacuum full unless strictly necessary.  It has other
> drawbacks.
>
> --
> Álvaro Herrera <alvherre@commandprompt.com>
> The PostgreSQL Company - Command Prompt, Inc.
> PostgreSQL Replication, Consulting, Custom Development, 24x7 support
>

Re: autovacuum blocks the operations of other manual vacuum

От
Alvaro Herrera
Дата:
Excerpts from kuopo's message of jue nov 18 04:10:36 -0300 2010:
> Hi,
>
> Thanks for your response. I've checked it again and found that the
> main cause is the execution of ANALYZE. As I have mentioned, I have
> two tables: table A is a big one (around 10M~100M records) for log
> data and table B is a small one (around 1k records) for keeping some
> current status. There are a lot of update operations and some search
> operations on the table B. For the performance issue, I would like to
> keep table B as compact as possible. According your suggestion, I try
> to invoke standard vacuum (not full) more frequently (e.g., once per
> min).
>
> However, when I analyze the table A, the autovacuum or vacuum on the
> table B cannot find any removable row version (the number of
> nonremoveable row versions and pages keeps increasing). After the
> analysis finishes, the search operations on the table B is still
> inefficient. If I call full vacuum right now, then I can have quick
> response time of the search operations on the table B again.

Hmm, I don't think we can optimize the analyze-only operation the same
way we optimize vacuum (i.e. allow vacuum to proceed while it's in
progress).  Normally analyze shouldn't take all that long anyway -- why
is it that slow?  Are you calling it in a transaction that also does
other stuff?  Are you analyzing more than one table in a single
transaction, perhaps even the whole database?

Perhaps you could speed it up by lowering vacuum_cost_delay, if it's set
to a nonzero value.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: autovacuum blocks the operations of other manual vacuum

От
tv@fuzzy.cz
Дата:
> Excerpts from kuopo's message of jue nov 18 04:10:36 -0300 2010:
>> However, when I analyze the table A, the autovacuum or vacuum on the
>> table B cannot find any removable row version (the number of
>> nonremoveable row versions and pages keeps increasing). After the
>> analysis finishes, the search operations on the table B is still
>> inefficient. If I call full vacuum right now, then I can have quick
>> response time of the search operations on the table B again.

Hi, I don't know how to fix the long VACUUM/ANALYZE, but have you tried to
minimize the growth using HOT?

HOT means that if you update only columns that are not indexed, and if the
update can fit into the same page (into an update chain), this would not
create a dead row.

Are there any indexes on the small table? How large is it? You've
mentioned there are about 2049 rows - that might be just a few pages so
the indexes would not be very efficient anyway.

Try to remove the indexes, and maybe create the table with a smaller
fillfactor (so that there is more space for the updates).

That should be much more efficient and the table should not grow.

You can see if HOT works through pg_stat_all_tables view (columns
n_tup_upd and n_tup_hot_upd).

regards
Tomas


Re: autovacuum blocks the operations of other manual vacuum

От
Alvaro Herrera
Дата:
Excerpts from kuopo's message of dom nov 21 11:15:52 -0300 2010:
> In my experiment, I need about 1~3 min to finish the analyze operation
> on the big table (which depends on the value of vacuum_cost_delay). I
> am not surprised because this table is a really big one (now, it has
> over 200M records).

Okay.  You may want to consider lowering the statistics size for all the
column in that table; that would reduce analyze time, at the cost of
possibly worsening the plans for that table, depending on how irregular
the distribution is.  See ALTER TABLE / SET STATISTICS in the
documentation, and the default_statistics_target parameter in
postgresql.conf.

> However, the most of my concerns is the behavior of analyze/vacuum.
> You mentioned that the analyze-only operation cannot be optimized as
> the same way on optimizing vacuum. Does that mean the analyze
> operation on a table would unavoidably affect the vacuum proceeded on
> another one?

That's correct.  I think you can run VACUUM ANALYZE, and it would do
both things at once; AFAIK this is also optimized like VACUUM is, but I
admit I'm not 100% sure (and I can't check right now).

> If this is a normal reaction for an analyze operation,
> maybe I should try to lower vacuum_cost_delay or use more powerful
> hardware to minimize the interfered period. So, the pages for the
> small table would not increase quickly.

I think it would make sense to have as low a cost_delay as possible for
this ANALYZE.  (Note you can change it locally with a SET command; no
need to touch postgresql.conf.  So you can change it when you analyze
just this large table).

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: autovacuum blocks the operations of other manual vacuum

От
kuopo
Дата:
In my experiment, I need about 1~3 min to finish the analyze operation
on the big table (which depends on the value of vacuum_cost_delay). I
am not surprised because this table is a really big one (now, it has
over 200M records).

However, the most of my concerns is the behavior of analyze/vacuum.
You mentioned that the analyze-only operation cannot be optimized as
the same way on optimizing vacuum. Does that mean the analyze
operation on a table would unavoidably affect the vacuum proceeded on
another one? If this is a normal reaction for an analyze operation,
maybe I should try to lower vacuum_cost_delay or use more powerful
hardware to minimize the interfered period. So, the pages for the
small table would not increase quickly.

Do you have any suggestion? Thanks!!


On Sat, Nov 20, 2010 at 9:49 AM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:
> Excerpts from kuopo's message of jue nov 18 04:10:36 -0300 2010:
>> Hi,
>>
>> Thanks for your response. I've checked it again and found that the
>> main cause is the execution of ANALYZE. As I have mentioned, I have
>> two tables: table A is a big one (around 10M~100M records) for log
>> data and table B is a small one (around 1k records) for keeping some
>> current status. There are a lot of update operations and some search
>> operations on the table B. For the performance issue, I would like to
>> keep table B as compact as possible. According your suggestion, I try
>> to invoke standard vacuum (not full) more frequently (e.g., once per
>> min).
>>
>> However, when I analyze the table A, the autovacuum or vacuum on the
>> table B cannot find any removable row version (the number of
>> nonremoveable row versions and pages keeps increasing). After the
>> analysis finishes, the search operations on the table B is still
>> inefficient. If I call full vacuum right now, then I can have quick
>> response time of the search operations on the table B again.
>
> Hmm, I don't think we can optimize the analyze-only operation the same
> way we optimize vacuum (i.e. allow vacuum to proceed while it's in
> progress).  Normally analyze shouldn't take all that long anyway -- why
> is it that slow?  Are you calling it in a transaction that also does
> other stuff?  Are you analyzing more than one table in a single
> transaction, perhaps even the whole database?
>
> Perhaps you could speed it up by lowering vacuum_cost_delay, if it's set
> to a nonzero value.
>
> --
> Álvaro Herrera <alvherre@commandprompt.com>
> The PostgreSQL Company - Command Prompt, Inc.
> PostgreSQL Replication, Consulting, Custom Development, 24x7 support
>

Re: autovacuum blocks the operations of other manual vacuum

От
kuopo
Дата:
Thanks for your information. I am using postgresql 8.4 and this
version should have already supported HOT. The frequently updated
columns are not indexed columns. So, the frequent updates should not
create many dead records. I also did a small test. If I don't execute
vacuum, the number of pages of the small table does not increase.

However, analyzing the big table still bothers me. According current
results, if the analyze operation is triggered, vacuum or HOT would
not function as I expect.


On Sat, Nov 20, 2010 at 12:43 PM,  <tv@fuzzy.cz> wrote:
>> Excerpts from kuopo's message of jue nov 18 04:10:36 -0300 2010:
>>> However, when I analyze the table A, the autovacuum or vacuum on the
>>> table B cannot find any removable row version (the number of
>>> nonremoveable row versions and pages keeps increasing). After the
>>> analysis finishes, the search operations on the table B is still
>>> inefficient. If I call full vacuum right now, then I can have quick
>>> response time of the search operations on the table B again.
>
> Hi, I don't know how to fix the long VACUUM/ANALYZE, but have you tried to
> minimize the growth using HOT?
>
> HOT means that if you update only columns that are not indexed, and if the
> update can fit into the same page (into an update chain), this would not
> create a dead row.
>
> Are there any indexes on the small table? How large is it? You've
> mentioned there are about 2049 rows - that might be just a few pages so
> the indexes would not be very efficient anyway.
>
> Try to remove the indexes, and maybe create the table with a smaller
> fillfactor (so that there is more space for the updates).
>
> That should be much more efficient and the table should not grow.
>
> You can see if HOT works through pg_stat_all_tables view (columns
> n_tup_upd and n_tup_hot_upd).
>
> regards
> Tomas
>
>