Discussion: recovery from xid wraparound


recovery from xid wraparound

From:
"Shane Wright"
Date:
Hi

I'm running 7.4 on RHAS 4, and I think I've had a transaction id
wraparound issue.  Running the command below gives the suitably
worrying negative number:

emystery=# SELECT datname, age(datfrozenxid) FROM pg_database;
    datname    |     age
---------------+-------------
 [maindbname]  | -2081610471
 [otherdbname] |  1075601025
 [otherdbname] |  1257289757
 [otherdbname] |  1074582099
 [otherdbname] |  1257289757


Which is weird, because I have vacuumed the database quite a lot -
both individual tables and, I thought, a vacuum of the whole database a
month or so ago.
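(For context: the negative value is what 32-bit xid arithmetic produces once datfrozenxid falls more than 2^31 transactions behind - age() is effectively a modulo-2^32 difference read back as a signed 32-bit integer. A rough sketch of that arithmetic, with made-up numbers rather than PostgreSQL internals:)

```python
# Sketch of why age(datfrozenxid) can show a large negative number.
# Assumption: xids behave as 32-bit counters and age() is effectively a
# modulo-2^32 difference interpreted as a signed 32-bit integer.
def xid_age(next_xid: int, frozen_xid: int) -> int:
    diff = (next_xid - frozen_xid) % 2**32
    # differences of 2^31 or more wrap around to negative when read as signed
    return diff - 2**32 if diff >= 2**31 else diff

# hypothetical numbers: a frozenxid left unvacuumed for more than 2^31
# transactions appears to be "in the future", so its age goes negative
print(xid_age(1_000_000, 500_000))            # normal positive age
print(xid_age(2**31 + 2_000_000, 1_000_000))  # negative: the wraparound symptom
```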

Anyway - I've not noticed any data loss yet, and I was hoping that if
all tables had been vacuumed recently (including system catalog
tables), there would be no remaining rows that appear to have a future
xid, and so the database should be OK?

Obviously I'm now doing the right thing with a vacuumdb -a - however
this has been running 9 hours now and it looks like at least 7 hours to
go just on this one monstrous table.

In the interests of risk reduction I've just knocked up a script to run
ahead and quickly vacuum all the other tables.

But my questions are thus...

a) is my assumption about the database being ok correct - assuming all
tables have been vacuumed recently, including catalog tables?

b) is it possible to safely abort my whole table vacuum now so I can
run it at the weekend when there's less traffic?

c) if I have experienced data loss, on the assumption all the table
structure remains (looks like it does), and I have a working backup
from before the xid wraparound (I do), can I just reinsert any
detected-missing data at the application level without needing a
dump/reload?

Any help appreciated in this really not-fun time,

thanks

S


Re: recovery from xid wraparound

From:
"Shane Wright"
Date:
Incidentally, how many passes of a table can vacuum make?  It's currently
on its third trip through the 20GB of indices, meaning another 7 hours
till completion [of this table]!

Should I assume it only does three passes?  (Does it choose based on the
table continuing to be updated while vacuum is running?)

S



-----Original Message-----
From: Martijn van Oosterhout [mailto:kleptog@svana.org]
Sent: 24 October 2006 10:24
To: Shane Wright
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] recovery from xid wraparound


On Tue, Oct 24, 2006 at 07:43:15AM +0100, Shane Wright wrote:
> Anyway - not noticed any data loss yet and was hoping it would be such
> that if all tables had been vacuumed recently (including system
> catalog tables), that there would be no remaining rows that would
> appear to have a future xid and so the database should be ok?

Running vacuum is the right solution, but I think you have to let it
finish. In particular, in that version a database-wide vacuum has to
complete before it will update the datfrozenxid (it's not tracked per
table).

> a) is my assumption about the database being ok correct - assuming all
> tables have been vacuumed recently, including catalog tables?

Should be OK, but apparently you missed one, or didn't do a
database-wide vacuum.

> b) is it possible to safely abort my whole table vacuum now so I can
> run it at the weekend when there's less traffic?

Aborting vacuum is safe, but you have to do a database-wide vacuum at
some point.

> c) if I have experienced data loss, on the assumption all the table
> structure remains (looks like it does), and I have a working backup
> from before the xid wraparound (I do), can I just reinsert any
> detected-missing data at the application level without needing a
> dump/reload?

A VACUUM will recover any data that slipped beyond the horizon less than
1 billion transactions ago, which I think covers you completely. The
only issue is that unique indexes may be confused because new
conflicting data may have been inserted while the old data was
invisible. Only you can say if that's going to be an issue.

Hope this helps,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability
> to litigate.

Re: recovery from xid wraparound

From:
Alvaro Herrera
Date:
Shane Wright wrote:
>
> Incidentally, how many passes of a table can vacuum make!  Its currently
> on its third trip through the 20Gb of indices, meaning another 7 hours
> till completion [of this table]!.
>
> Assume it only does three passes?  (it chooses based on the table
> continuing to be updated while vacuum is running)

As many passes as it needs to.  It is limited by maintenance_work_mem
(in 7.4 I think it was vacuum_mem).

It needs to collect an array of tuple pointers, and it keeps them in
memory.  When the array grows to maintenance_work_mem, it stops scanning
the table and scans the indexes, removing everything that points to
those tuple pointers.  Then it goes back to scanning the table.

So the problem is that it scans the whole of each index many times.  If
you increase maintenance_work_mem way up for this vacuum task, it will
need to scan the indexes fewer times (hopefully only once), making the
process a lot faster.
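As a back-of-the-envelope illustration of the above (assuming roughly 6 bytes per dead-tuple pointer, which is a guess at the size of a 7.4 array entry; the dead-tuple counts are made up), the number of full index scans is the dead-tuple count divided by how many pointers fit in memory:

```python
import math

# Rough model of vacuum's index-scan passes, per the explanation above.
# Assumption: each dead-tuple pointer costs ~6 bytes in the in-memory
# array (a guess for 7.4); the dead-tuple counts below are hypothetical.
def index_scan_passes(dead_tuples: int, mem_bytes: int, ptr_bytes: int = 6) -> int:
    pointers_per_batch = mem_bytes // ptr_bytes   # fills up, then indexes are scanned
    return max(1, math.ceil(dead_tuples / pointers_per_batch))

print(index_scan_passes(4_000_000, 8 * 1024 * 1024))    # default-ish 8MB memory
print(index_scan_passes(4_000_000, 128 * 1024 * 1024))  # more memory, fewer passes
```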

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: recovery from xid wraparound

From:
Tom Lane
Date:
"Shane Wright" <shane.wright@edigitalresearch.com> writes:
> Incidentally, how many passes of a table can vacuum make!

Lots, especially if the table hasn't been vacuumed in a long time...
Perhaps you should be using a higher maintenance_work_mem?
(Um, in 7.4 make that vacuum_mem.)  Larger work memory translates
directly to fewer passes over the indexes.

            regards, tom lane

Re: recovery from xid wraparound

From:
Martijn van Oosterhout
Date:
On Tue, Oct 24, 2006 at 03:47:52PM +0100, Shane Wright wrote:
>
> Incidentally, how many passes of a table can vacuum make!  Its currently
> on its third trip through the 20Gb of indices, meaning another 7 hours
> till completion [of this table]!.
>
> Assume it only does three passes?  (it chooses based on the table
> continuing to be updated while vacuum is running)

It depends on how many tuples it needs to process and how much memory
you gave it (the maintenance_work_mem setting).  The more memory you
give it, the fewer passes it needs to do...

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.


Re: recovery from xid wraparound

From:
"Shane Wright"
Date:
Aw :(

It's at the default of 8MB.  The table contains 220 million rows and 6 indices.  It has a few deleted rows...


If I change vacuum_mem I'll need to at least 'pg_ctl reload' - will it apply straight away with the next vacuum query or
does it need a full restart?

Does vacuum_mem need shared memory? (i.e. is it subject to the OS's limit) - I have looked in the docs and googled but
can't see detail on this.



If I have managed to vacuum all the catalog tables, and my script has ensured all user tables other than this one have
been vacuumed, then...   will the first pass of vacuum on this table have set the xid to FrozenXID for all rows - i.e. is
the table safe?


What's the relative safety of restarting this vacuum with a bigger vacuum_mem, say at the end of the week when traffic
is quieter?


Basically, if it's just datfrozenxid that's not updated, I can live with delaying the vacuum a few days.  But if things
are more serious then obviously I can't wait.

Is it safe to say that if the catalog tables are OK and an individual table has been vacuumed, then its data is safe?


S



-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: 24 October 2006 15:52
To: Shane Wright
Cc: pgsql-general@postgresql.org; Martijn van Oosterhout
Subject: Re: [GENERAL] recovery from xid wraparound


"Shane Wright" <shane.wright@edigitalresearch.com> writes:
> Incidentally, how many passes of a table can vacuum make!

Lots, especially if the table hasn't been vacuumed in a long time...
Perhaps you should be using a higher maintenance_work_mem?
(Um, in 7.4 make that vacuum_mem.)  Larger work memory translates
directly to fewer passes over the indexes.

            regards, tom lane


Re: recovery from xid wraparound

From:
Martijn van Oosterhout
Date:
On Tue, Oct 24, 2006 at 04:18:09PM +0100, Shane Wright wrote:
> If I change vacuum_mem I'll need to at least 'pg_ctl reload' - will
> it apply straightaway with the next vacuum query or does it need a
> full restart?

You can control it per session I think. So you can start psql and type:

# set vacuum_mem=<huge>;
SET
# VACUUM VERBOSE;
<blah>

You don't have to change the main config, unless you want it to apply
forever.  Although, 8MB is small in general, so you might want to up it
anyway.  But for this one-off vacuum of this large table you could give
it a much larger amount of memory.

> Does vacuum_mem need shared memory? (i.e. is it subject to the OS's
> limit) - have looked in the docs and googled but can't see detail on
> this

It's just ordinary memory. If you have a few gig to spare, you can give
it all to the vacuum.

> If I have managed to vacuum all the catalog tables, and my script has
> ensured all user tables other than this one have been vacuumed,
> then...  will the first pass of vacuum on this have set the xid to
> FrozenXID for all rows - i.e. is the table safe?

Pass.

Although I think the point is that it hasn't scanned the whole table yet
because it ran out of memory...

> Is it safe to say that if the catalog tables are ok and an individual
> tables has been vacuumed then its data is safe?

Yes...

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.


Re: recovery from xid wraparound

From:
Tom Lane
Date:
"Shane Wright" <shane.wright@edigitalresearch.com> writes:
> If I change vacuum_mem I'll need to at least 'pg_ctl reload' - will it
> apply straightaway with the next vacuum query or does it need a full restart?

reload is enough.

> Basically if its just datfrozenxid that's not updated I can live with
> delaying the vacuum a few days.  But if things are more serious then
> obviously I can't wait.

The question is how close to the wraparound horizon is any of your data.
We don't really know that --- the datfrozenxid provides a lower bound
but we don't know where things are in reality.  Also, are you prepared
to tolerate wrong answers (missing rows) for awhile?  As Martijn
mentioned, the vacuum will retrieve rows that have slid past the wrap
horizon, but they'll have been invisible to your queries meanwhile.

            regards, tom lane