vacuumdb vs. max_connections: SELECT waiting

From: "Thomas F. O'Connell"
Date:
Tonight as part of a scheduled maintenance operation, I was going to
perform a VACUUM FULL ANALYZE on a postgres 8.1.3 instance that had
just had its FSM settings increased to account for about 2 years'
worth of growth (particularly in number of relations).
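
For the record, the invocation was essentially of this form (the
database name below is just a placeholder):

    vacuumdb --full --analyze mydb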

Shortly after I kicked it off, I watched the number of connections
trend upward as a result of the aggressive locking of FULL. I didn't
want to let this continue without notifying the developers about a
potential downtime for their application, so I killed the vacuumdb
process I had started, figuring that the existing connections would
go right through.

Unfortunately, my expectations were not met, and I wound up with a
bunch of connections in a waiting state. It seems as though the
connections that were waiting for the VACUUM locks were still waiting
even once the VACUUM was killed. Is that expected behavior?
Eventually, a timeout threshold must've been hit because everything
went back to normal, but it was not immediate. statement_timeout is
set to 0, so it wasn't that.

This postmaster does have autovacuum enabled. Could that have
interfered in any way?

On the one hand I'm curious about the behavior of postgres in this
scenario. On the other hand, though, I'm wondering if this operation
is necessary. The postmaster has been restarted such that the new FSM
settings are in effect. Can autovacuum recover for the months where
the FSM settings were not sufficient to cover the number of relations
in this cluster? Is a vacuumdb/reindexdb cycle necessary to reclaim
disk space?
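
(One thing I do know how to check: as I understand it, a plain
database-wide VACUUM VERBOSE in 8.1 ends with a free-space-map summary,
so I can at least confirm whether the new settings are now large enough:

    -- the closing INFO/DETAIL lines report the page slots and relations
    -- actually needed versus the configured max_fsm_pages / max_fsm_relations
    VACUUM VERBOSE;

What that won't tell me is whether the space already lost is recoverable
without FULL/REINDEX.)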

--
Thomas F. O'Connell
Database Architecture and Programming
Sitening, LLC

http://www.sitening.com/
3004 B Poston Avenue
Nashville, TN 37203-1314
615-260-0005 (cell)
615-469-5150 (office)
615-469-5151 (fax)


Re: vacuumdb vs. max_connections: SELECT waiting

From: Tom Lane
Date:
"Thomas F. O'Connell" <tfo@sitening.com> writes:
> Shortly after I kicked it off, I watched the number of connections
> trend upward as a result of the aggressive locking of FULL. I didn't
> want to let this continue without notifying the developers about a
> potential downtime for their application, so I killed the vacuumdb
> process I had started, figuring that the existing connections would
> go right through.

Killing the vacuumdb client process wouldn't automatically abort the
current operation of the connected backend.  If that's all you did,
it would've finished out the VACUUM command, ie, essentially no effect.
Sending a SIGINT to the backend to cancel the current command would've
been the right thing to do.
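
For example, something along these lines in 8.1 (the LIKE filter is
only illustrative, and current_query is only populated when
stats_command_string is on; make sure you're cancelling the backend you
intend to):

    -- cancel whatever backend is currently running a VACUUM;
    -- equivalent to sending SIGINT (kill -INT) to that backend's PID
    SELECT pg_cancel_backend(procpid)
      FROM pg_stat_activity
     WHERE current_query LIKE 'VACUUM%';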

> Can autovacuum recover for the months where
> the FSM settings were not sufficient to cover the number of relations
> in this cluster?

Probably not.  But VACUUM FULL isn't really a great choice either;
it will certainly do nothing at all for index bloat.  You might try
CLUSTER instead if you need to aggressively recover space.
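
Roughly, with pre-8.3 syntax (table and index names here are made up):

    -- rewrites the table in index order and rebuilds all of its indexes;
    -- note it takes an exclusive lock on the table for the duration
    CLUSTER bigtab_pkey ON bigtab;
    ANALYZE bigtab;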

            regards, tom lane

Re: vacuumdb vs. max_connections: SELECT waiting

From: "Thomas F. O'Connell"
Date:
On May 13, 2006, at 12:35 AM, Tom Lane wrote:

> VACUUM FULL does all right at packing the table (except in pathological
> cases, eg a very large tuple near the end of the table).  It mostly
> bites as far as shrinking indexes goes, however.  If you've got a
> serious index bloat problem then REINDEX is the only solution.  CLUSTER
> does an automatic REINDEX after compacting the table --- it doesn't have
> any special properties as far as the index space goes.  However, if
> you've got serious table bloat then CLUSTER will probably be quicker
> than VACUUM FULL.

So my reading of VACUUM VERBOSE output leads me to believe that I
could actually interpret both table and index bloat for the entire
database from it (and suggests a useful reporting tool...).

This is from the docs:

INFO:  index "onek_unique1" now contains 1000 tuples in 14 pages
DETAIL:  3000 index tuples were removed.
0 index pages have been deleted, 0 are currently reusable.

Does this imply that 75% of the tuples in this index were free space?
Even so, since this is an index, the pages aren't shrunk, per your
note above, and a REINDEX would be required for reclamation, right?

And, then, later:

INFO:  "onek": removed 3000 tuples in 108 pages
DETAIL:  CPU 0.01s/0.06u sec elapsed 0.07 sec.
INFO:  "onek": found 3000 removable, 1000 nonremovable tuples in 143
pages
DETAIL:  0 dead tuples cannot be removed yet.

Which implies that this table was also 75% bloated? And a VACUUM FULL
(or CLUSTER) could improve this even more than the VACUUM VERBOSE
ANALYZE from the example in the docs?
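
(For the reclamation step itself, then, I assume the choices look
roughly like this, using the names from the docs example:

    -- rebuild just the bloated index
    REINDEX INDEX onek_unique1;

    -- or compact the table and rebuild all of its indexes in one pass
    CLUSTER onek_unique1 ON onek;
    ANALYZE onek;

or, heavier still, VACUUM FULL onek followed by REINDEX TABLE onek.)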

>> What is the likely performance impact of having database (table or
>> index) bloat from several months in a > 100 GB database with tens of
>> thousands of relations of wildly varying sizes and insufficient FSM
>> settings? If autovacuum is keeping up with statistics for index
>> usage, is the only potential impact related to disk usage (in basic
>> filesystem terms)?
>
> Unless you have a lot of seqscan-using queries, there's no particular
> reason to panic over file bloat that I can see.  It's just a matter of
> how close you are to running out of disk space ...

Unfortunately, there are several seqscan-using queries and several
large tables (with the largest currently approaching 13% of physical
memory).

I'm actually trying to prioritize administrative operations that
could result in noticeable performance gains. If compacting tables
and indexes turns out to be low on the pole in terms of performance
considerations, then I'm inclined to look elsewhere, especially
considering the administrative headache (from the perspective of the
related application) required to undertake a database-wide CLUSTER or
VACUUM FULL + REINDEX.

--
Thomas F. O'Connell
Database Architecture and Programming
Sitening, LLC

http://www.sitening.com/
3004 B Poston Avenue
Nashville, TN 37203-1314
615-260-0005 (cell)
615-469-5150 (office)
615-469-5151 (fax)

Re: vacuumdb vs. max_connections: SELECT waiting

From: "Jim C. Nasby"
Date:
On Sat, May 13, 2006 at 03:47:00AM -0500, Thomas F. O'Connell wrote:
>
> On May 13, 2006, at 12:35 AM, Tom Lane wrote:
>
> >VACUUM FULL does all right at packing the table (except in pathological
> >cases, eg a very large tuple near the end of the table).  It mostly
> >bites as far as shrinking indexes goes, however.  If you've got a
> >serious index bloat problem then REINDEX is the only solution.  CLUSTER
> >does an automatic REINDEX after compacting the table --- it doesn't have
> >any special properties as far as the index space goes.  However, if
> >you've got serious table bloat then CLUSTER will probably be quicker
> >than VACUUM FULL.
>
> So my reading of VACUUM VERBOSE output leads me to believe that I
> could actually interpret both table and index bloat for the entire
> database from it (and suggests a useful reporting tool...).
>
> This is from the docs:
>
> INFO:  index "onek_unique1" now contains 1000 tuples in 14 pages
> DETAIL:  3000 index tuples were removed.
> 0 index pages have been deleted, 0 are currently reusable.
>
> Does this imply that 75% of the tuples in this index were free space?
> Even so, since this is an index, the pages aren't shrunk, per your
> note above, and a REINDEX would be required for reclamation, right?

Correct.

> And, then, later:
>
> INFO:  "onek": removed 3000 tuples in 108 pages
> DETAIL:  CPU 0.01s/0.06u sec elapsed 0.07 sec.
> INFO:  "onek": found 3000 removable, 1000 nonremovable tuples in 143
> pages
> DETAIL:  0 dead tuples cannot be removed yet.
>
> Which implies that this table was also 75% bloated? And a VACUUM FULL
> (or CLUSTER) could improve this even more than the VACUUM VERBOSE
> ANALYZE from the example in the docs?

Also correct. There are some tricks you can use to shrink the table
without resorting to a VACUUM FULL or CLUSTER, but it's a real PITA.
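
Roughly: force fresh row versions into the free space near the front of
the file with no-op UPDATEs, then let plain VACUUM truncate the emptied
pages at the end. A crude sketch with a hypothetical table bigtab keyed
by id, assuming rows were inserted in roughly key order so the highest
keys sit near the end of the file; repeat in batches until the file
stops shrinking:

    -- no-op update on the highest keys: writes new row versions, which go
    -- wherever the FSM finds room (earlier in the file once the tail pages
    -- are full)
    UPDATE bigtab SET id = id
     WHERE id > (SELECT max(id) - 10000 FROM bigtab);
    -- plain vacuum: marks the superseded versions dead and truncates
    -- trailing pages once they are completely empty
    VACUUM bigtab;

It churns WAL and indexes the whole time, hence the PITA factor.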

Also, remember that if your database is growing it will eventually end
up reusing all that free space.

> Unfortunately, there are several seqscan-using queries and several
> large tables (with the largest currently approaching 13% of physical
> memory).

So are you saying that the entire table fits in memory? Does the entire
database fit in memory? If so, you're unlikely to find huge gains from
vacuuming, unless you're really pushing the system hard.
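
The 8.1 size functions make that easy to check (the table name below is
just an example):

    -- database size, and a single table's size (heap + indexes + toast)
    SELECT pg_size_pretty(pg_database_size(current_database()));
    SELECT pg_size_pretty(pg_total_relation_size('some_big_table'));

and compare against shared_buffers plus whatever the OS is caching.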

> I'm actually trying to prioritize administrative operations that
> could result in noticeable performance gains. If compacting tables
> and indexes turns out to be low on the pole in terms of performance
> considerations, then I'm inclined to look elsewhere, especially
> considering the administrative headache (from the perspective of the
> related application) required to undertake a database-wide CLUSTER or
> VACUUM FULL + REINDEX.

It depends. On badly bloated systems I've seen 20-50% improvements from
setting up an adequate vacuum plan.
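
By "vacuum plan" I mean FSM headroom plus autovacuum tuned to your
churn, roughly along these lines in postgresql.conf on 8.1 (the numbers
are purely illustrative; size the FSM limits from the summary at the end
of a database-wide VACUUM VERBOSE):

    # autovacuum needs row-level stats in 8.1
    stats_start_collector = on
    stats_row_level = on
    autovacuum = on
    autovacuum_naptime = 60                  # seconds between checks
    autovacuum_vacuum_scale_factor = 0.2     # vacuum after ~20% dead rows
    autovacuum_analyze_scale_factor = 0.1
    # FSM must be able to track every relation and its free pages
    max_fsm_relations = 20000
    max_fsm_pages = 2000000
    # throttle vacuum I/O so it doesn't trample foreground queries
    vacuum_cost_delay = 10                   # milliseconds
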
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461