Thread: vacuuming template0

vacuuming template0

From: Jeff Janes
Date:
I have a stress test of WAL replay which panics the
database over and over again to make sure it recovers correctly.

This is in 9.3dev.

The test was eventually freezing up because of wraparound.  The
problem was that, on fast enough hardware, the intentional crashes
were always happening before autovac could do its thing.

So I added a periodic "bin/vacuumdb -a" command in a place where
crashes are inhibited.

It was still freezing eventually with: database is not accepting
commands to avoid wraparound data loss in database "template0"

I thought that template0 did not need vacuuming because everything in
it was frozen.  But it looks like it does need vacuuming, and no one
but autovac can connect in order to do that vacuum.
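
For readers following the arithmetic behind that error message, here is a
minimal Python sketch of the wraparound check. It is a simplified model, not
PostgreSQL's actual code: the function names and the one-million-XID safety
margin are assumptions made for illustration, and the sample XIDs are invented.

```python
# Simplified model of 32-bit transaction ID (XID) wraparound arithmetic.
# All constants and names here are illustrative assumptions, not server code.

XID_SPACE = 2**32          # XIDs are 32-bit counters that wrap around
HALF_SPACE = 2**31         # roughly half the space counts as "in the past"
STOP_MARGIN = 1_000_000    # assumed safety margin before the wrap point


def xid_age(current_xid, frozen_xid):
    """XIDs consumed since frozen_xid, computed modulo the 32-bit space."""
    return (current_xid - frozen_xid) % XID_SPACE


def oldest_database(current_xid, datfrozenxids):
    """Name of the database whose datfrozenxid is furthest in the past."""
    return max(datfrozenxids,
               key=lambda name: xid_age(current_xid, datfrozenxids[name]))


def refuses_commands(current_xid, datfrozenxids, margin=STOP_MARGIN):
    """True once the oldest database is within `margin` XIDs of wraparound."""
    name = oldest_database(current_xid, datfrozenxids)
    return xid_age(current_xid, datfrozenxids[name]) >= HALF_SPACE - margin


if __name__ == "__main__":
    # Invented values: "postgres" was vacuumed recently, template0 never was.
    cluster = {"postgres": 2_000_000_000, "template0": 3}
    print(oldest_database(2_147_000_000, cluster))
    print(refuses_commands(2_147_000_000, cluster))
```

The point of the sketch is that the limit follows the oldest datfrozenxid in
the cluster, so one database that "vacuumdb -a" cannot connect to can be the
one that stops everything, however often the others are vacuumed.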

Is this a real problem?  Presumably no one systematically crashes
their database shortly after start up on a production system; but that
doesn't mean there are not other ways to get into the situation.  (I
can't think of any of them--that is why I'm asking here.)

I guess if it has been like this forever then it must not be a problem
or it would have been noticed.  But if this need to vacuum template0
arose recently, it could be a problem.  (Doing git bisect on
over-night runs is no fun, so if someone happens to know off the top
of their head...)

So, is this a real problem or purely a fantastical one, and does
anyone know how old it would be?

Cheers,

Jeff



Re: vacuuming template0

From: Tom Lane
Date:
Jeff Janes <jeff.janes@gmail.com> writes:
> I thought that template0 did not need vacuuming because everything in
> it was frozen.  But it looks like it does need vacuuming, and no one
> but autovac can connect in order to do that vacuum.

Everything in it *should* be frozen, typically.  My recollection is that
we used to exclude it from autovacuuming, but decided to treat it the
same as every other database on the grounds that somebody might've
connected to it and modified something, then restored the normal
not-datallowconn marking.  (A valid reason for doing that would be to
apply some maintenance correction to the system catalogs, as we've
occasionally had to recommend in update release notes.)

> Is this a real problem?  Presumably no one systematically crashes
> their database shortly after start up on a production system; but that
> doesn't mean there are not other ways to get into the situation.

I can't get excited about this scenario.  Given that template0 should be
(a) small and (b) all-frozen already, it should not take a noticeable
amount of time for autovac to look through it once per freeze cycle.
If your database is under such stress that that can't get done, you've
got *serious* problems, probably much worse than whether template0
itself is getting processed.
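
A rough Python sketch of the "once per freeze cycle" selection described
above, for illustration only. It is not the autovacuum launcher's real logic:
the Database tuple and forced_vacuum_targets are hypothetical names, the sample
XIDs are invented, and 200 million is the documented default of
autovacuum_freeze_max_age.

```python
# Toy model: which databases would get a forced anti-wraparound vacuum?
# Names and sample data are illustrative assumptions, not PostgreSQL code.
from typing import List, NamedTuple

XID_SPACE = 2**32
AUTOVACUUM_FREEZE_MAX_AGE = 200_000_000  # default autovacuum_freeze_max_age


class Database(NamedTuple):
    name: str
    datfrozenxid: int
    datallowconn: bool


def xid_age(current_xid: int, frozen_xid: int) -> int:
    """XIDs consumed since frozen_xid, computed modulo the 32-bit space."""
    return (current_xid - frozen_xid) % XID_SPACE


def forced_vacuum_targets(databases: List[Database], current_xid: int,
                          max_age: int = AUTOVACUUM_FREEZE_MAX_AGE) -> List[str]:
    """Databases whose datfrozenxid age exceeds max_age.

    datallowconn is deliberately ignored: per the message above, template0
    is treated the same as every other database.
    """
    return [db.name for db in databases
            if xid_age(current_xid, db.datfrozenxid) > max_age]


if __name__ == "__main__":
    dbs = [Database("postgres", 900_000_000, True),
           Database("template0", 3, False)]
    print(forced_vacuum_targets(dbs, 1_000_000_000))
```

In this model template0 comes up for a forced pass whenever its age crosses
the threshold, whether or not ordinary clients are allowed to connect to it.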
        regards, tom lane