Excerpts from kuopo's message of Sun Nov 21 11:15:52 -0300 2010:
> In my experiment, I need about 1-3 minutes to finish the analyze
> operation on the big table (the exact time depends on the value of
> vacuum_cost_delay). I am not surprised, because this table is really
> big (it now has over 200M records).
Okay. You may want to consider lowering the statistics target for all
the columns in that table; that would reduce analyze time, at the cost
of possibly worsening the plans for that table, depending on how
irregular the data distribution is. See ALTER TABLE / SET STATISTICS
in the documentation, and the default_statistics_target parameter in
postgresql.conf.
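
For example, something like this (the table and column names here are
placeholders for your own):

    -- Per column: keep fewer histogram/MCV entries for this column,
    -- so ANALYZE samples and sorts less data for it.
    ALTER TABLE big_table ALTER COLUMN some_column SET STATISTICS 10;

    -- Or session-wide, affecting all columns that have no explicit
    -- per-column setting, for the duration of this session only:
    SET default_statistics_target = 10;
    ANALYZE big_table;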
> However, most of my concern is about the behavior of analyze/vacuum.
> You mentioned that the analyze-only operation cannot be optimized in
> the same way vacuum is. Does that mean an analyze operation on one
> table would unavoidably affect a vacuum running on another one?
That's correct. I think you can run VACUUM ANALYZE, and it would do
both things at once; AFAIK this is also optimized like VACUUM is, but I
admit I'm not 100% sure (and I can't check right now).
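
If you go that route, the combined form is just (table name again a
placeholder):

    VACUUM ANALYZE big_table;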
> If this is a normal reaction for an analyze operation, maybe I
> should try to lower vacuum_cost_delay or use more powerful hardware
> to minimize the period of interference. That way, the pages of the
> small table would not grow so quickly.
I think it would make sense to have as low a cost_delay as possible
for this ANALYZE. (Note that you can change it locally with a SET
command; no need to touch postgresql.conf, so you can change it just
for the session that analyzes this large table.)
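
For instance (the table name is a placeholder):

    -- Disable cost-based throttling for this session only, then run
    -- the expensive ANALYZE; other sessions keep the configured delay.
    SET vacuum_cost_delay = 0;
    ANALYZE big_table;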
--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support