Обсуждение: Too frequent warnings for wraparound failure

Поиск
Список
Период
Сортировка

Too frequent warnings for wraparound failure

От
"Milen A. Radev"
Дата:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

(This is a repost because the original was send when I was not
subscribed to the list and it is still not approved - sorry if you
received it already.)


I have installed a script that executes vacuumdb for all DBs in my
cluster (run by cron every night):

BEGIN------------------------------------------------
#!/bin/sh
# cron script to vacuum pgsql databases
#

PG_VACUUM_LOG=/var/log/pg_vacuum

echo Starting at `date "+%Y-%m-%d %H:%M:%S"` >> $PG_VACUUM_LOG
exec /usr/local/pgsql/bin/vacuumdb -U postgres -h localhost \
- --all --analyze >>$PG_VACUUM_LOG 2>&1

END--------------------------------------------------


I review the log every morning. In the beginning I got "wraparound
failure" warnings every third day. But from a week I got those warnings
every day. Well we have one table in one database where there are a lot
of inserts, but not that many - around 30-40 thousand per day.


Any ideas what could be cause? And is it an symptom for some kind of
failure coming?



Here is the excerpt from the log (for today and yesterday):

Starting at 2005-03-07 02:25:04
vacuumdb: vacuuming database "xxx1"
WARNING:  some databases have not been vacuumed in 1774832854 transactions
HINT:  Better vacuum them within 372650793 transactions, or you may have
a wraparound failure.
VACUUM
vacuumdb: vacuuming database "xxx2"
WARNING:  some databases have not been vacuumed in 1774828277 transactions
HINT:  Better vacuum them within 372655370 transactions, or you may have
a wraparound failure.
VACUUM
vacuumdb: vacuuming database "xxx3"
WARNING:  some databases have not been vacuumed in 1774893888 transactions
HINT:  Better vacuum them within 372589759 transactions, or you may have
a wraparound failure.
VACUUM
vacuumdb: vacuuming database "template1"
WARNING:  some databases have not been vacuumed in 1774909701 transactions
HINT:  Better vacuum them within 372573946 transactions, or you may have
a wraparound failure.
VACUUM
vacuumdb: vacuuming database "xxx4"
WARNING:  some databases have not been vacuumed in 1774907572 transactions
HINT:  Better vacuum them within 372576075 transactions, or you may have
a wraparound failure.
VACUUM
vacuumdb: vacuuming database "xxx5"
WARNING:  some databases have not been vacuumed in 1774908235 transactions
HINT:  Better vacuum them within 372575412 transactions, or you may have
a wraparound failure.
VACUUM
vacuumdb: vacuuming database "xxx6"
WARNING:  some databases have not been vacuumed in 1774903764 transactions
HINT:  Better vacuum them within 372579883 transactions, or you may have
a wraparound failure.
VACUUM
vacuumdb: vacuuming database "xxx7"
WARNING:  some databases have not been vacuumed in 1774905224 transactions
HINT:  Better vacuum them within 372578423 transactions, or you may have
a wraparound failure.
VACUUM
vacuumdb: vacuuming database "xxx8"
WARNING:  some databases have not been vacuumed in 1774904273 transactions
HINT:  Better vacuum them within 372579374 transactions, or you may have
a wraparound failure.
VACUUM
vacuumdb: vacuuming database "xxx9"
WARNING:  some databases have not been vacuumed in 1774814127 transactions
HINT:  Better vacuum them within 372669520 transactions, or you may have
a wraparound failure.
VACUUM
vacuumdb: vacuuming database "xxx10"
VACUUM

Starting at 2005-03-08 02:25:57
vacuumdb: vacuuming database "xxx1"
WARNING:  some databases have not been vacuumed in 1744400306 transactions
HINT:  Better vacuum them within 403083341 transactions, or you may have
a wraparound failure.
VACUUM
vacuumdb: vacuuming database "xxx2"
WARNING:  some databases have not been vacuumed in 1744361223 transactions
HINT:  Better vacuum them within 403122424 transactions, or you may have
a wraparound failure.
VACUUM
vacuumdb: vacuuming database "xxx3"
WARNING:  some databases have not been vacuumed in 1744427176 transactions
HINT:  Better vacuum them within 403056471 transactions, or you may have
a wraparound failure.
VACUUM
vacuumdb: vacuuming database "template1"
WARNING:  some databases have not been vacuumed in 1744450173 transactions
HINT:  Better vacuum them within 403033474 transactions, or you may have
a wraparound failure.
VACUUM
vacuumdb: vacuuming database "xxx4"
WARNING:  some databases have not been vacuumed in 1744460006 transactions
HINT:  Better vacuum them within 403023641 transactions, or you may have
a wraparound failure.
VACUUM
vacuumdb: vacuuming database "xxx5"
WARNING:  some databases have not been vacuumed in 1744461917 transactions
HINT:  Better vacuum them within 403021730 transactions, or you may have
a wraparound failure.
VACUUM
vacuumdb: vacuuming database "xxx6"
WARNING:  some databases have not been vacuumed in 1744465388 transactions
HINT:  Better vacuum them within 403018259 transactions, or you may have
a wraparound failure.
VACUUM
vacuumdb: vacuuming database "xxx7"
WARNING:  some databases have not been vacuumed in 1744466073 transactions
HINT:  Better vacuum them within 403017574 transactions, or you may have
a wraparound failure.
VACUUM
vacuumdb: vacuuming database "xxx8"
WARNING:  some databases have not been vacuumed in 1744464553 transactions
HINT:  Better vacuum them within 403019094 transactions, or you may have
a wraparound failure.
VACUUM
vacuumdb: vacuuming database "xxx9"
WARNING:  some databases have not been vacuumed in 1744489170 transactions
HINT:  Better vacuum them within 402994477 transactions, or you may have
a wraparound failure.
VACUUM
vacuumdb: vacuuming database "xxx10"
VACUUM



- --
Milen A. Radev

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (MingW32)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFCMBEHqGlhYx0/sboRAmmHAKCWvAXKRuX5k50WINymuvuniFh+VQCeIzw7
Cj0s6C7/2zindectk5/Clvk=
=f+3l
-----END PGP SIGNATURE-----

Re: Too frequent warnings for wraparound failure

От
Tom Lane
Дата:
"Milen A. Radev" <milen@securax.org> writes:
> I review the log every morning. In the beginning I got "wraparound
> failure" warnings every third day. But from a week I got those warnings
> every day. Well we have one table in one database where there are a lot
> of inserts, but not that many - around 30-40 thousand per day.

Are you really doing half a billion transactions a day?  Keep an eye on
age(pg_class.datfrozenxid) as suggested in the manual to see what's
going on.

It's hard to credit getting any useful work done at the rate of 10K
transactions per second, but if you have a client that is issuing a
constant stream of dummy operations (like begin/rollback) it's
conceivable that you could advance the transaction counter that fast.

            regards, tom lane

Re: Too frequent warnings for wraparound failure

От
Michael Fuhr
Дата:
On Thu, Mar 10, 2005 at 11:19:03AM +0200, Milen A. Radev wrote:

> exec /usr/local/pgsql/bin/vacuumdb -U postgres -h localhost \
> - --all --analyze >>$PG_VACUUM_LOG 2>&1

My first thought was that the lone hyphen at the beginning of the
script's last line might be causing an end of option processing,
but your output does show many databases being vacuumed (do all
databases appear in the real output?).  However, I tried running
that script with several versions of PostgreSQL and vacuumdb failed
with a usage error in every case.  Are you sure the script you
posted is the one that's running?  What version of PostgreSQL are
you using?  As Tom mentioned, what's the output of the following
query?

SELECT datname, age(datfrozenxid) FROM pg_database;

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/

Re: Too frequent warnings for wraparound failure

От
Tom Lane
Дата:
I wrote:
> "Milen A. Radev" <milen@securax.org> writes:
>> I review the log every morning. In the beginning I got "wraparound
>> failure" warnings every third day. But from a week I got those warnings
>> every day. Well we have one table in one database where there are a lot
>> of inserts, but not that many - around 30-40 thousand per day.

> Are you really doing half a billion transactions a day?

I thought of another mechanism that wouldn't require such a preposterous
load, only half a billion transactions since initdb.  (How old is this
installation, anyway, and which PG version?)

If you are creating new databases every day and you do it by cloning
template0, then the new databases would come into existence with
datfrozenxid equal to template0's.  Once template0 is more than half a
billion transactions old, you'd start seeing the warning.

This is relatively harmless, but probably we should try to mask it.
We could make CREATE DATABASE set datfrozenxid to current time when
cloning a database that has datallowconn false, on the assumption that
the source DB is entirely frozen and so there's nothing to vacuum yet.

            regards, tom lane

Re: Too frequent warnings for wraparound failure

От
"Milen A. Radev"
Дата:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Michael Fuhr wrote:
> On Thu, Mar 10, 2005 at 11:19:03AM +0200, Milen A. Radev wrote:
>
>
>>exec /usr/local/pgsql/bin/vacuumdb -U postgres -h localhost \
>>- --all --analyze >>$PG_VACUUM_LOG 2>&1
>
>
> My first thought was that the lone hyphen at the beginning of the
> script's last line might be causing an end of option processing,

Well, that hyphen is inserted by Enigmail, so it's not part of the
actual script. Here it is again (hopefully not garbled):

BEGIN-------------------------------------------
#!/bin/sh
# cron script to vacuum pgsql databases
#

PG_VACUUM_LOG=/var/log/pg_vacuum

echo Starting at `date "+%Y-%m-%d %H:%M:%S"` >> $PG_VACUUM_LOG
exec /usr/local/pgsql/bin/vacuumdb -U postgres \
 -h localhost --all --analyze >>$PG_VACUUM_LOG 2>&1

END-------------------------------------------

> but your output does show many databases being vacuumed (do all
> databases appear in the real output?).  However, I tried running

Yes, they all appear in the log.

> that script with several versions of PostgreSQL and vacuumdb failed
> with a usage error in every case.  Are you sure the script you
> posted is the one that's running?  What version of PostgreSQL are
> you using?  As Tom mentioned, what's the output of the following

7.4.7 on Debian GNU/Linux 3.0 (Woody)[Linux xxxx 2.6.10 #1 SMP Thu Feb
10 18:52:12 EET 2005 i686 unknown]

> query?
>
> SELECT datname, age(datfrozenxid) FROM pg_database;
>


Two queries - 2 seconds apart:

template1=# SELECT datname, age(datfrozenxid) FROM pg_database;
  datname  |     age
- -----------+-------------
 xxx1      |  1519763573
 xxx2      |  1519754033
 xxx3      |  1519592503
 template1 |  1517875932
 template0 | -1808458730
 xxx4      |  1517860620
 xxx5      |  1517786431
 xxx6      |  1517780441
 xxx7      |  1517762875
 xxx8      |  1517755876
 xxx9      |  1517749309
 xxx10     |  1516823602
(12 rows)

template1=# SELECT datname, age(datfrozenxid) FROM pg_database;
  datname  |     age
- -----------+-------------
 xxx1      |  1519769620
 xxx2      |  1519760080
 xxx3      |  1519598550
 template1 |  1517881979
 template0 | -1808452683
 xxx4      |  1517866667
 xxx5      |  1517792478
 xxx6      |  1517786488
 xxx7      |  1517768922
 xxx8      |  1517761923
 xxx9      |  1517755356
 xxx10     |  1516829649
(12 rows)


- --
Milen A. Radev
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (MingW32)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFCMHmzqGlhYx0/sboRAsEFAJ0RWe8XVtLWWiOOrt46Nwg9xBD4hgCePhP9
YyYfj2LsKUCIFeheS71iI44=
=TDVO
-----END PGP SIGNATURE-----

Re: Too frequent warnings for wraparound failure

От
"Milen A. Radev"
Дата:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Tom Lane wrote:
> I wrote:
>
>>"Milen A. Radev" <milen@securax.org> writes:
>>
>>>I review the log every morning. In the beginning I got "wraparound
>>>failure" warnings every third day. But from a week I got those warnings
>>>every day. Well we have one table in one database where there are a lot
>>>of inserts, but not that many - around 30-40 thousand per day.
>
>
>>Are you really doing half a billion transactions a day?
>
>
> I thought of another mechanism that wouldn't require such a preposterous
> load, only half a billion transactions since initdb.  (How old is this
> installation, anyway, and which PG version?)

It's created on February 16 and is 7.4.7 on Debian GNU/Linux 3.0 (Woody)
(kernel 2.6.10).

>
> If you are creating new databases every day and you do it by cloning
> template0, then the new databases would come into existence with
> datfrozenxid equal to template0's.  Once template0 is more than half a
> billion transactions old, you'd start seeing the warning.

No, we do not do anything like that. We imported all databases from a
backup created by using pg_dump.

>
> This is relatively harmless, but probably we should try to mask it.
> We could make CREATE DATABASE set datfrozenxid to current time when
> cloning a database that has datallowconn false, on the assumption that
> the source DB is entirely frozen and so there's nothing to vacuum yet.
>
>             regards, tom lane



- --
Milen A. Radev
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (MingW32)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFCMHqzqGlhYx0/sboRAr1dAKCB4vGHvzwsQ9zsM20y3hLrOSfqsQCgqVai
8RrSaVHjsoktDriCwCRWjfc=
=ju16
-----END PGP SIGNATURE-----

Re: Too frequent warnings for wraparound failure

От
Tom Lane
Дата:
"Milen A. Radev" <milen@securax.org> writes:
> Two queries - 2 seconds apart:

> template1=# SELECT datname, age(datfrozenxid) FROM pg_database;
>   datname  |     age
> - -----------+-------------
>  xxx1      |  1519763573

> template1=# SELECT datname, age(datfrozenxid) FROM pg_database;
>   datname  |     age
> - -----------+-------------
>  xxx1      |  1519769620

So you managed to execute about 6000 transactions in those two seconds.
Doing what?

            regards, tom lane

Re: Too frequent warnings for wraparound failure

От
"Milen A. Radev"
Дата:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Tom Lane wrote:
> "Milen A. Radev" <milen@securax.org> writes:
>
>>Two queries - 2 seconds apart:
>
>
>>template1=# SELECT datname, age(datfrozenxid) FROM pg_database;
>>  datname  |     age
>>- -----------+-------------
>> xxx1      |  1519763573
>
>
>>template1=# SELECT datname, age(datfrozenxid) FROM pg_database;
>>  datname  |     age
>>- -----------+-------------
>> xxx1      |  1519769620
>
>
> So you managed to execute about 6000 transactions in those two seconds.
> Doing what?
>
>             regards, tom lane


Well that could be the problem - I'm 99.99% positive we do not execute
so many transactions - may be 30-40 thousand per day and that's only for
one relatively busy DB. The example database ("xxx1") you've chosen is
not used at all. As most of the other databases in that cluster.

And another observation (closely related IMO): when I login on another
development server/cluster and execute the same query I get an increment
 for every database for every query execution. But there the version is
8.0.1.


- --
Milen A. Radev
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (MingW32)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFCMW2YqGlhYx0/sboRAvO3AJ9PAaADiX2Ci9iChyNCpojFb5uvwgCgnJ99
mzTydVpBaMCp/hJjrUvuO5Q=
=OKGe
-----END PGP SIGNATURE-----

Re: Too frequent warnings for wraparound failure

От
Tom Lane
Дата:
"Milen A. Radev" <milen@securax.org> writes:
> Tom Lane wrote:
>> So you managed to execute about 6000 transactions in those two seconds.
>> Doing what?

> Well that could be the problem - I'm 99.99% positive we do not execute
> so many transactions - may be 30-40 thousand per day and that's only for
> one relatively busy DB. The example database ("xxx1") you've chosen is
> not used at all. As most of the other databases in that cluster.

It doesn't matter --- all the age counts are going up at exactly the
same rate.  Somewhere you've got a client that's issuing dummy
transaction starts at an insane rate.  Some attention to "top" output
on the server might help identify it ;-)

            regards, tom lane

Re: Too frequent warnings for wraparound failure

От
Michael Fuhr
Дата:
On Fri, Mar 11, 2005 at 12:06:16PM +0200, Milen A. Radev wrote:

> Tom Lane wrote:
>
> > So you managed to execute about 6000 transactions in those two seconds.
> > Doing what?
>
> Well that could be the problem - I'm 99.99% positive we do not execute
> so many transactions - may be 30-40 thousand per day and that's only for
> one relatively busy DB.

One of your earlier messages said 30-40 thousand *inserts*.  Is the
database also receiving a lot of queries?  Selects increment the
transaction counter as well if they're not part of an explicit
transaction (i.e., one started with BEGIN or its equivalent) -- in
other words if you're in the so-called "autocommit" mode.

> The example database ("xxx1") you've chosen is not used at all. As
> most of the other databases in that cluster.

The transaction counter is global, so "the number of transactions
from the cutoff XID to the current transaction's XID" increments
for every database as the counter increments.

> And another observation (closely related IMO): when I login on another
> development server/cluster and execute the same query I get an increment
>  for every database for every query execution. But there the version is
> 8.0.1.

Right -- see above about selects.  If you're in autocommit mode
then each statement is a separate transaction.

If you really are executing that many transactions, then you might
need to vacuum more than once a day to avoid the warnings.  Another
possibility would be to see if multiple SELECT statements can be
wrapped in a single transaction.  For example, if one application
query requires many database queries, then you might be able to do
them all in one transaction.  Sometimes it's desirable to do so for
other reasons:  if you use the SERIALIZABLE isolation level, for
example, then all queries in a transaction will see the same snapshot
of the database.  That is, two SELECTs will see the same thing even
if another transaction commits changes in between.  That isolation
can be an important part of producing output that doesn't contain
inconsistencies.

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/

Re: Too frequent warnings for wraparound failure

От
"Milen A. Radev"
Дата:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Michael Fuhr wrote:
> On Fri, Mar 11, 2005 at 12:06:16PM +0200, Milen A. Radev wrote:
>
>
>>Tom Lane wrote:
>>
>>
>>>So you managed to execute about 6000 transactions in those two seconds.
>>>Doing what?
>>
>>Well that could be the problem - I'm 99.99% positive we do not execute
>>so many transactions - may be 30-40 thousand per day and that's only for
>>one relatively busy DB.
>
>
> One of your earlier messages said 30-40 thousand *inserts*.  Is the
> database also receiving a lot of queries?  Selects increment the
> transaction counter as well if they're not part of an explicit
> transaction (i.e., one started with BEGIN or its equivalent) -- in
> other words if you're in the so-called "autocommit" mode.
>
>
>>The example database ("xxx1") you've chosen is not used at all. As
>>most of the other databases in that cluster.
>
>
> The transaction counter is global, so "the number of transactions
> from the cutoff XID to the current transaction's XID" increments
> for every database as the counter increments.
>
>
>>And another observation (closely related IMO): when I login on another
>>development server/cluster and execute the same query I get an increment
>> for every database for every query execution. But there the version is
>>8.0.1.
>
>
> Right -- see above about selects.  If you're in autocommit mode
> then each statement is a separate transaction.
>
> If you really are executing that many transactions, then you might
> need to vacuum more than once a day to avoid the warnings.  Another
> possibility would be to see if multiple SELECT statements can be
> wrapped in a single transaction.  For example, if one application
> query requires many database queries, then you might be able to do
> them all in one transaction.  Sometimes it's desirable to do so for
> other reasons:  if you use the SERIALIZABLE isolation level, for
> example, then all queries in a transaction will see the same snapshot
> of the database.  That is, two SELECTs will see the same thing even
> if another transaction commits changes in between.  That isolation
> can be an important part of producing output that doesn't contain
> inconsistencies.
>


Of course you were both right - we found one small brain-dead perl
script that "executed an infinite loop in under 2 seconds" and in every
loop "pings" the DB using an empty query (well that's the implementation
in the perl pgsql driver). The responsible were punished.

- --
Milen A. Radev
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (MingW32)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFCMxoLqGlhYx0/sboRAgLPAJ9/1+ZWhTexWDwvnB1fQe63oczUSQCgnzwQ
m6K1nSwKu+c2QfPKRTnccSM=
=shk1
-----END PGP SIGNATURE-----

Re: Too frequent warnings for wraparound failure

От
Geoffrey
Дата:
Milen A. Radev wrote:

> Of course you were both right - we found one small brain-dead perl
> script that "executed an infinite loop in under 2 seconds" and in every
> loop "pings" the DB using an empty query (well that's the implementation
> in the perl pgsql driver). The responsible were punished.

So, where'd you hide the body?

--
Until later, Geoffrey