WAL archiving disaster

From Mike Atkin
Subject WAL archiving disaster
Date
Msg-id BANLkTimqdfk3RxWUAUmqhWtDd+71gDx2gg@mail.gmail.com
Replies Re: WAL archiving disaster  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-admin
Hi all,

One of my postgres 8.2 databases had been getting critically low on
disk space and with no extra suitable hardware available to
accommodate it I decided I would use amazon S3 to store the blob data
which was taking up most of the room.  The plan was that I could then
do a vacuumlo and then a full vacuum on the database and reclaim the
space.  The S3 part went fine, but when I came to run the vacuumlo the
disk was so full that it didn't have space for the temporary tables
that it created.
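For the record, the reclaim I had in mind was just the stock tools ("mydb" stands in for the real database name).  Printed here rather than executed, since both steps need exactly the scratch disk space I didn't have:

```shell
# Sketch of the planned cleanup ("mydb" is a placeholder, not the real name).
reclaim_plan() {
  db=$1
  echo "vacuumlo -v $db"                # remove large objects no longer referenced
  echo "vacuumdb --full --verbose $db"  # VACUUM FULL to hand freed space back to the OS
}
reclaim_plan mydb
```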

The database was so big that pg_dumping onto another machine would
take an age so I decided I could get around this by using the
continuous WAL backup to restore the cluster onto my desktop which had
plenty of room and was local to the backup.  So off I went - restored,
vacuumed, dumped, verified, uploaded, dropped the original and finally
restored again where it came from only much smaller.  Fantastic.
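For context, the restore on the desktop was driven by the usual 8.x-style recovery.conf, along these lines (the archive path and the commented target are illustrative placeholders, not my actual config):

```
# recovery.conf -- minimal PITR sketch for PostgreSQL 8.x (paths illustrative)
restore_command = 'cp /mnt/backup/wal/%f "%p"'
# optionally stop replay at a known-good point:
# recovery_target_time = 'YYYY-MM-DD HH:MM:SS'
```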

This is the point where I discovered that after years of sterling
service my WAL archiving script had failed to transport a single
segment during a brief ISP outage a week ago, despite having coped
with such outages before.  The gap caused the restore to terminate
prematurely, leaving me minus the last week's worth of data.  In my rush I had
failed to properly check that the last restored segment matched the
last one in the archive and I had now dropped the original db so
performing another base backup would get me nowhere ;-(
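For anyone in a similar spot, the sanity check I should have run first is a scan of the archive directory for holes in the segment sequence before dropping anything.  A rough sketch (the directory layout is assumed, it presumes a single timeline, and my understanding is that pre-9.3 segment names skip the 0xFF slot, so verify that detail against your version):

```shell
# Given a 24-hex-digit WAL file name (timeline + log + seg), print the
# name of the next segment in sequence.  Assumes pre-9.3 naming, where
# the seg field rolls over to the next log file after 0xFE.
next_segment() {
  name=$1
  tli=$(printf '%s' "$name" | cut -c1-8)
  log=$(printf '%s' "$name" | cut -c9-16)
  seg=$(printf '%s' "$name" | cut -c17-24)
  segnum=$(printf '%d' "0x$seg")
  if [ "$segnum" -ge 254 ]; then
    # roll over to the next log file (0xFF is never used pre-9.3)
    lognum=$(( $(printf '%d' "0x$log") + 1 ))
    printf '%s%08X%08X\n' "$tli" "$lognum" 0
  else
    printf '%s%s%08X\n' "$tli" "$log" $((segnum + 1))
  fi
}

# List every segment name missing from the archive directory given as $1.
# Assumes one timeline; sorted lexical order matches sequence order.
find_gaps() {
  prev=""
  for f in $(ls "$1" | grep -E '^[0-9A-F]{24}$' | sort); do
    if [ -n "$prev" ]; then
      expect=$(next_segment "$prev")
      while [ "$expect" != "$f" ]; do
        echo "missing: $expect"
        expect=$(next_segment "$expect")
      done
    fi
    prev="$f"
  done
}
```

Had I run something like `find_gaps /mnt/backup/wal` and compared the last listed segment against the last one restored, the hole would have been obvious before I dropped the original.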

So... normally this would be curtains for the database but the missing
segment is from a 4-hour period of practically zero activity.  It was
only archived because it hit the archive_timeout.  Is there any way I
can force postgres to ignore this segment and attempt to redo the rest
of the logs?  I realise this may leave the database terminally
inconsistent but my focus is on retrieving as much of the data as
possible for re-entry rather than getting a functional system out the
other end.

The other option is parsing the rest of the logs to produce a list of
human readable changes that could then be manually entered back into
the week-old system.  I've seen various references to tools that
attempted this in the past but are now unmaintained.  Is anyone aware
of any current tools or of the amount of work required to implement
such a tool?

Any help gratefully received,

Mike
