Discussion: Database 'template1' vacuum


Database 'template1' vacuum

From:
Natalie Wenz
Date:
Hi!

I've encountered something I've never seen before, and I am hoping someone can help me understand what's going on behind the scenes with this vacuum/xid behavior.

ERROR:  database is not accepting commands to avoid wraparound data loss in database "template1"

This error popped up right after I restarted a database that had just finished running a vacuum freeze in single user mode for the postgres database (the only database we use).
(That was in response to: ERROR:  database is not accepting commands to avoid wraparound data loss in database "postgres")

I'm confused about the need to vacuum template1. All of our activity is in the database named postgres. Do all of the databases share the same set of xids? If they do, why would the single user backend vacuum not handle all of the databases at once? If all databases don't share the same set of xids, then what could be going on internally to cause template1 to run through so many xids?

(This is running on 9.5.3, if that makes a difference.)

Thanks!

Natalie

Re: Database 'template1' vacuum

From:
Kevin Grittner
Date:
On Tue, Jul 26, 2016 at 1:08 PM, Natalie Wenz
<nataliewenz@ebureau.com> wrote:

> ERROR:  database is not accepting commands to avoid wraparound
> data loss in database "template1"
>
> This error popped up right after I restarted a database that had
> just finished running a vacuum freeze in single user mode for the
> postgres database (the only database we use).
> (That was in response to: ERROR:  database is not accepting
> commands to avoid wraparound data loss in database "postgres")
>
> I'm confused about the need to vacuum template1. All of our
> activity is in the database named postgres. Do all of the
> databases share the same set of xids?

Yes.

> If they do, why would the single user backend vacuum not handle
> all of the databases at once?

A connection is always to a particular database.  The big question
is why autovacuum didn't keep you out of trouble.  You didn't turn
that off, did you?  If not, please describe your hardware and
workload, and show your vacuum/autovacuum configuration values.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Database 'template1' vacuum

From:
Natalie Wenz
Date:
> On Jul 26, 2016, at 1:20 PM, Kevin Grittner <kgrittn@gmail.com> wrote:
>
> On Tue, Jul 26, 2016 at 1:08 PM, Natalie Wenz
> <nataliewenz@ebureau.com> wrote:
>
>> ERROR:  database is not accepting commands to avoid wraparound
>> data loss in database "template1"
>>
>> This error popped up right after I restarted a database that had
>> just finished running a vacuum freeze in single user mode for the
>> postgres database (the only database we use).
>> (That was in response to: ERROR:  database is not accepting
>> commands to avoid wraparound data loss in database "postgres")
>>
>> I'm confused about the need to vacuum template1. All of our
>> activity is in the database named postgres. Do all of the
>> databases share the same set of xids?
>
> Yes.
>
>> If they do, why would the single user backend vacuum not handle
>> all of the databases at once?
>
> A connection is always to a particular database.  The big question
> is why autovacuum didn't keep you out of trouble.  You didn't turn
> that off, did you?  If not, please describe your hardware and
> workload, and show your vacuum/autovacuum configuration values.

I can see why you would be wondering that. :) Autovacuum is definitely on. This situation might be considered "maintenance fallout". This database is very large (45 TB, currently, I believe), with most of the space being occupied by one table. (The database is basically a big transaction log, with nearly all traffic being inserts. No deletes or updates on that big table.) I recently migrated it from 9.3 to 9.5, from one FreeBSD host to another. As you can imagine, that took some time. I tried to reduce the time by "dumping" the large table in chunks using "copy" and a date range, thereby increasing the concurrency for both dumping and restoring that data. The problem is, I forgot that, even though I didn't use very many xids importing the data that way, it still meant that alllllll of that data would need to be frozen later.

Once the data was all migrated and we resumed our regular traffic, we quickly got into trouble with the xids, since autovacuum couldn't freeze all 45 TB of that one big table while fielding up to... 500 million inserts per day? (typically around 6k inserts/second) So we disabled external connections to allow the database to vacuum, buffering the new transactions elsewhere. Now that the first enormous vacuum freeze is complete, we are "unspooling" the data that was buffered while the database vacuumed. Between the hardware and postgres software, we easily see this hit 65 thousand inserts per second. (Wooo, modern hardware! Excellent software!)  Of course, that means that we've run out of xids again in the span of about 10 hours; no tuning of autovacuum that we've tried is able to keep pace with that. So. I'm currently suffering through a cycle of: unspool buffered transactions, vacuum freeze while transactions buffer, unspool . . .
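For reference, the per-database distance to wraparound can be watched with a catalog query along these lines (a sketch against the standard pg_database catalog):

```sql
-- age(datfrozenxid) climbs toward autovacuum_freeze_max_age (200 million
-- by default), where autovacuum forces a freeze; ~2 billion is the hard
-- wraparound limit where the database stops accepting commands.
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY xid_age DESC;
```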

At this point, I have just left the autovacuum defaults in place for this database, and it generally keeps up with our typical 6k insert/second workload. Even so, maintenance is problematic. We are looking into batch inserts in the future, or table partitioning (would that help with concurrency for maintenance? dumping, restoring, vacuuming?), or both. We also keep hoping that maybe, just maybe, we'll see a version of postgres with 64-bit xids? :D

Anyway, template1: this is expected behavior? I'm maybe on my third or fourth round of unspool, then single user vacuum freeze. Autovacuum is on and free to run while I'm unspooling. I'm surprised that autovacuum wouldn't already have vacuumed that very small database, or that I haven't been prompted to vacuum template1 prior to this, if the only transactions in this database would have occurred during the initdb for the postgres database. Does it get used for other things during normal database operations?



Re: Database 'template1' vacuum

From:
Kevin Grittner
Date:
On Tue, Jul 26, 2016 at 2:36 PM, Natalie Wenz <nataliewenz@ebureau.com> wrote:

> Now that the first enormous vacuum freeze is complete, we are
> "unspooling" the data that was buffered while the database
> vacuumed. Between the hardware and postgres software, we easily see
> this hit 65 thousand inserts per second. (Wooo, modern hardware!
> Excellent software!)  Of course, that means that we've run out of
> xids again in the span of about 10 hours; no tuning of the auto
> vacuum that we've tried is able to keep pace with that. So. I'm
> currently suffering through a cycle of unspool buffered
> transactions, vacuum freeze while transactions buffer, unspool . . .

There are many reasons to batch inserts, this being one of them.
You might see a big further boost in the insert rate, even with
moderate batch sizes.  If you think about it, a batch size of 100
should get you to somewhere around 99% of the benefit of larger
batch sizes.  Of course, you need some way to handle an error in a
row or two out of the batch....
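The batching idea can be sketched like this. The table and column names are hypothetical, the %s placeholders follow the psycopg2 parameter style, and the per-batch error handling Kevin mentions is left out:

```python
def build_batched_inserts(rows, table, columns, batch_size=100):
    """Group rows into batches and build one multi-row INSERT per batch.

    Returns a list of (sql, flat_params) pairs, so each batch costs a
    single statement -- and, when each statement runs in its own
    transaction, a single xid -- instead of one per row.
    """
    statements = []
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        # One "(%s, %s, ...)" tuple of placeholders per row in the batch
        placeholder = "(" + ", ".join(["%s"] * len(columns)) + ")"
        sql = "INSERT INTO {} ({}) VALUES {}".format(
            table, ", ".join(columns), ", ".join([placeholder] * len(batch)))
        flat_params = [value for row in batch for value in row]
        statements.append((sql, flat_params))
    return statements

# Example: 250 rows in batches of 100 -> 3 statements (100 + 100 + 50 rows)
stmts = build_batched_inserts(
    [(i, "event") for i in range(250)], "traffic_log", ["id", "kind"])
```

At 65 thousand inserts per second, batches of 100 would cut xid consumption by roughly two orders of magnitude, which is exactly the relief the freeze cycle above needs.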

> At this point, I have just left the auto vacuum defaults in place
> for this database, and it generally keeps up with our typical 6k
> insert/second workload.

I would increase autovacuum_max_workers (to maybe 10 or 12) and
autovacuum_vacuum_cost_limit (to several thousand).
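In postgresql.conf terms that would look something like the following (the exact values are illustrative starting points, not tested recommendations):

```
autovacuum_max_workers = 10          # default is 3; changing it requires a restart
autovacuum_vacuum_cost_limit = 5000  # default -1 falls back to vacuum_cost_limit (200)
```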

> Anyway, template1: this is expected behavior?

Yeah.  I would just connect to that database as the database
superuser (in single-user mode only if necessary), and run the
command `VACUUM`.  No options, just the bare command.
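Concretely, that looks something like this (the superuser name and data directory are assumptions for this sketch):

```shell
# Normal case: connect to template1 and run a plain VACUUM
psql -U postgres -d template1 -c "VACUUM"

# Only if the server is already refusing connections: stop it first,
# then run the standalone backend against template1 and enter VACUUM
# at the "backend>" prompt
postgres --single -D "$PGDATA" template1
```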

> I'm surprised that the auto vacuum wouldn't have already have
> vacuumed that very small database, or that haven't been prompted to
> vacuum template1 prior to this, if the only transactions in this
> database would have occurred during the initdb for the postgres
> database.

The logic for picking what to vacuum isn't, shall we say, ideal.
It may keep running into the same tables to vacuum whenever it
scans, and never gets to others.  A higher worker count should help
it get to more (hopefully all) of the tables.  The cost limit is
shared among all the active workers, so raising the worker maximum
doesn't tend to have much direct impact on overall performance.

> Does it get used for other things during normal database
> operations?

No -- it is normally used just for CREATE DATABASE.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Database 'template1' vacuum

From:
Natalie Wenz
Date:
Ok, I have one more question (for now!) about vacuum behavior:

How are static tables handled? Once every row in a table is marked as frozen, and there are no further inserts, deletes, updates, etc. to the table, does the vacuum have to do anything to that table when advancing the relfrozenxid? Is there a way that the vacuum can see "Oh, this whole table is frozen, my work here is done!" or does it still have to scan every page (or worse, every row)?


> On Jul 26, 2016, at 2:50 PM, Kevin Grittner <kgrittn@gmail.com> wrote:
>
> On Tue, Jul 26, 2016 at 2:36 PM, Natalie Wenz <nataliewenz@ebureau.com> wrote:
>
>> Now that the first enormous vacuum freeze is complete, we are
>> "unspooling" the data that was buffered while the database
>> vacuumed. Between the hardware and postgres software, we easily see
>> this hit 65 thousand inserts per second. (Wooo, modern hardware!
>> Excellent software!)  Of course, that means that we've run out of
>> xids again in the span of about 10 hours; no tuning of the auto
>> vacuum that we've tried is able to keep pace with that. So. I'm
>> currently suffering through a cycle of unspool buffered
>> transactions, vacuum freeze while transactions buffer, unspool . . .
>
> There are many reasons to batch inserts, this being one of them.
> You might see a big further boost in the insert rate, even with
> moderate batch sizes.  If you think about it, a batch size of 100
> should get you to somewhere around 99% of the benefit of larger
> batch sizes.  Of course, you need some way to handle an error in a
> row or two out of the batch....
>
>> At this point, I have just left the auto vacuum defaults in place
>> for this database, and it generally keeps up with our typical 6k
>> insert/second workload.
>
> I would increase autovacuum_max_workers (to maybe 10 or 12) and
> autovacuum_vacuum_cost_limit (to several thousand).
>
>> Anyway, template1: this is expected behavior?
>
> Yeah.  I would just connect to that database as the database
> superuser (in single-user mode only if necessary), and run the
> command `VACUUM`.  No options, just the bare command.
>
>> I'm surprised that the auto vacuum wouldn't have already have
>> vacuumed that very small database, or that haven't been prompted to
>> vacuum template1 prior to this, if the only transactions in this
>> database would have occurred during the initdb for the postgres
>> database.
>
> The logic for picking what to vacuum isn't, shall we say, ideal.
> It may keep running into the same tables to vacuum whenever it
> scans, and never gets to others.  A higher worker count should help
> it get to more (hopefully all) of the tables.  The cost limit is
> shared among all the active workers, so raising the worker maximum
> doesn't tend to have much direct impact on overall performance.
>
>> Does it get used for other things during normal database
>> operations?
>
> No -- it is normally used just for CREATE DATABASE.
>
> --
> Kevin Grittner
> EDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company



Re: Database 'template1' vacuum

From:
"David G. Johnston"
Date:
On Mon, Aug 8, 2016 at 10:43 AM, Natalie Wenz <nataliewenz@ebureau.com> wrote:
Ok, I have one more question (for now!) about vacuum behavior:

How are static tables handled? Once every row in a table is marked as frozen, and there are no further inserts, deletes, updates, etc to the table, does the vacuum have to do anything to that table when advancing the relfrozenxid? Is there a way that the vacuum can see "Oh, this whole table is frozen, my work here is done!" or does it still have to scan every page (or worse, every row)?


Seems so:
https://wiki.postgresql.org/wiki/VacuumHeadaches
"""
VACUUM normally only scans pages that have been modified since the last vacuum, but relfrozenxid can only be advanced when the whole table is scanned. The whole table is scanned when relfrozenxid is more than vacuum_freeze_table_age transactions old, when VACUUM's FREEZE option is used, or when all pages happen to require vacuuming to remove dead row versions.
"""

David J.

Re: Database 'template1' vacuum

From:
Tom Lane
Date:
"David G. Johnston" <david.g.johnston@gmail.com> writes:
> On Mon, Aug 8, 2016 at 10:43 AM, Natalie Wenz <nataliewenz@ebureau.com>
> wrote:
>> How are static tables handled? Once every row in a table is marked as
>> frozen, and there are no further inserts, deletes, updates, etc to the
>> table, does the vacuum have to do anything to that table when advancing the
>> relfrozenxid? Is there a way that the vacuum can see "Oh, this whole table
>> is frozen, my work here is done!" or does it still have to scan every page
>> (or worse, every row)?

> Seems so:
> https://wiki.postgresql.org/wiki/VacuumHeadaches

FWIW, this is due to get better in 9.6.  Per release notes:

    Avoid re-vacuuming pages containing only frozen tuples (Masahiko
    Sawada, Robert Haas, Andres Freund)

    Formerly, an anti-wraparound vacuum had to visit every page of a
    table whether or not there was anything to do there. Now, pages
    containing only already-frozen tuples are identified in the
    table's visibility map, and can be skipped by vacuum even when
    it's doing transaction wraparound prevention. This should greatly
    reduce the cost of maintaining large tables containing
    mostly-unchanging data.
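On 9.6 the new all-frozen accounting can also be inspected directly via the pg_visibility contrib module (the table name here is hypothetical):

```sql
CREATE EXTENSION pg_visibility;
-- all_frozen is the number of pages an anti-wraparound VACUUM can now skip
SELECT * FROM pg_visibility_map_summary('big_log_table');
```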

            regards, tom lane