Thread: PG_XLOG 27028 files running out of space

PG_XLOG 27028 files running out of space

From
Tory M Blue
Date:
My postgres db ran out of space. I have 27028 files in the pg_xlog directory. I'm unclear what happened; this has been running flawlessly for years. I do have archiving turned on and run an archive command every 10 minutes.

I'm not sure how to go about cleaning this up. I got the DB back up, but I've only got 6GB free on this drive and it's going to blow up if I can't relieve some of the stress from this directory, which is over 220GB.

What are my options?

Thanks

Postgres 9.1.6
slon 2.1.2

Tory

Re: PG_XLOG 27028 files running out of space

From
Ian Lawrence Barwick
Date:
2013/2/14 Tory M Blue <tmblue@gmail.com>
> My postgres db ran out of space. I have 27028 files in the pg_xlog directory. I'm unclear what happened this has been running flawless for years. I do have archiving turned on and run an archive command every 10 minutes.
>
> I'm not sure how to go about cleaning this up, I got the DB back up, but I've only got 6gb free on this drive and it's going to blow up, if I can't relieve some of the stress from this directory over 220gb.
>
> What are my options?
>
> Thanks
>
> Postgres 9.1.6
> slon 2.1.2

I can't give any advice right now, but I'd suggest posting more details of your
setup, including as much of your postgresql.conf file as possible (especially
the checkpoint_* and archive_* settings) and also the output of pg_controldata.

Ian Barwick

Re: PG_XLOG 27028 files running out of space

From
Tory M Blue
Date:


On Thu, Feb 14, 2013 at 3:01 AM, Ian Lawrence Barwick <barwick@gmail.com> wrote:
> 2013/2/14 Tory M Blue <tmblue@gmail.com>
>> My postgres db ran out of space. I have 27028 files in the pg_xlog directory. I'm unclear what happened this has been running flawless for years. I do have archiving turned on and run an archive command every 10 minutes.
>>
>> I'm not sure how to go about cleaning this up, I got the DB back up, but I've only got 6gb free on this drive and it's going to blow up, if I can't relieve some of the stress from this directory over 220gb.
>>
>> What are my options?
>>
>> Thanks
>>
>> Postgres 9.1.6
>> slon 2.1.2
>
> I can't give any advice right now, but I'd suggest posting more details of your
> setup, including as much of your postgresql.conf file as possible  (especially
> the checkpoint_* and archive_* settings) and also the output of pg_controldata.
>
> Ian Barwick

Thanks Ian

I figured it out and figured out a way around it for now.

My archive destination had its ownership changed, so the archive command could not write to the directory. I didn't catch this until it was too late: 225GB and 27,000 files later.
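A failure like this one can be spotted early by watching for a growing archive backlog. The following is only a minimal sketch, not something from this thread: the function names are hypothetical, and it assumes the usual layout in which the server marks each segment still waiting for archive_command with a "<segment>.ready" file under pg_xlog/archive_status.

```python
import os


def archive_backlog(xlog_dir):
    """Return the names of WAL segments still waiting to be archived.

    Assumes the standard layout: a segment awaiting archive_command has a
    '<segment>.ready' marker file in the archive_status subdirectory.
    """
    status_dir = os.path.join(xlog_dir, "archive_status")
    suffix = ".ready"
    return sorted(
        name[: -len(suffix)]
        for name in os.listdir(status_dir)
        if name.endswith(suffix)
    )


def archive_destination_writable(archive_dir):
    """Return True if the current OS user can create files in archive_dir."""
    return os.path.isdir(archive_dir) and os.access(archive_dir, os.W_OK | os.X_OK)
```

Run as the postgres OS user from a monitoring job, a backlog that keeps growing, or a destination that is no longer writable, would have flagged the ownership change long before the disk filled up.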

I found a few writeups on how to clear this up: use the command true as the archive command, so that a bunch of WAL files get removed from the pg_xlog directory in short order. That worked, and now that I know what the cause was, I should be able to restore my pg_archive PITR configs and be good to go.

This is definitely one of those bullets I would rather not have taken, but the damage appears to be minimal (thank you, postgres).

Thanks again
Tory

Re: PG_XLOG 27028 files running out of space

From
Heikki Linnakangas
Date:
On 14.02.2013 12:49, Tory M Blue wrote:
> My postgres db ran out of space. I have 27028 files in the pg_xlog
> directory. I'm unclear what happened this has been running flawless for
> years. I do have archiving turned on and run an archive command every 10
> minutes.
>
> I'm not sure how to go about cleaning this up, I got the DB back up, but
> I've only got 6gb free on this drive and it's going to blow up, if I can't
> relieve some of the stress from this directory over 220gb.
>
> What are my options?

You'll need to delete some of the oldest xlog files to release disk
space. But first you need to make sure you don't delete any files that
are still needed, and find out what got you into this situation in the
first place.

You say that you "run an archive command every 10 minutes". What do you
mean by that? archive_command specified in postgresql.conf is executed
automatically by the system, so you don't need to and should not run
that manually. After archive_command has run successfully, and the
system doesn't need the WAL file for recovery anymore (i.e. after the
next checkpoint), the system will delete the archived file to release
disk space. Clearly that hasn't been working in your system for some
reason. If archive_command doesn't succeed, i.e. it returns a non-zero
return code, the system will keep retrying forever until it succeeds,
without deleting the file. Have you checked the logs for any
archive_command errors?
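To illustrate the exit-code contract described above, here is a minimal sketch of what an archive script could look like. It is not the poster's actual command: the function name and the temporary-file suffix are assumptions made for illustration. The only point is that the script reports success (0) only when the segment really landed in the archive, and non-zero otherwise, so the server keeps the file and retries.

```python
import os
import shutil
import sys


def archive_segment(src_path, archive_dir):
    """Copy one WAL segment into archive_dir and return a process exit code.

    0 means the segment is safely in the archive; any non-zero value tells
    the server the attempt failed, so it keeps the segment and retries.
    """
    dest = os.path.join(archive_dir, os.path.basename(src_path))
    if os.path.exists(dest):
        # Never overwrite a file that is already in the archive.
        return 1
    tmp = dest + ".tmp"  # hypothetical suffix for the partial copy
    try:
        shutil.copyfile(src_path, tmp)
        os.rename(tmp, dest)  # publish under the final name only when complete
    except OSError as exc:
        # Covers a missing or unwritable destination (e.g. wrong ownership).
        print(f"archive failed: {exc}", file=sys.stderr)
        return 1
    return 0
```

A wrapper script would pass this return value to sys.exit() and be referenced from archive_command with the %p placeholder (path of the segment to archive). With a destination directory the server's user cannot write to, every call returns 1, which is exactly the situation in which segments pile up in pg_xlog.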

To get out of the immediate trouble, run "pg_controldata", and make note
of this line:

Latest checkpoint's REDO WAL file:    000000010000000000000001

Anything older than that file is not needed for recovery. You can delete
those, if you have them safely archived.
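As a rough sketch of that procedure (the helper names are hypothetical, and nothing here deletes anything), the REDO WAL file can be read from the pg_controldata output and used to list the segments that sort before it. The sketch assumes the standard 24-hex-digit segment names and, to stay on the safe side, only considers segments on the same timeline as the REDO file.

```python
import re

# A WAL segment name is 24 hex digits: timeline, log and segment, 8 each.
SEGMENT_NAME = re.compile(r"[0-9A-F]{24}")


def redo_wal_file(controldata_output):
    """Extract the REDO WAL file name from pg_controldata's text output."""
    for line in controldata_output.splitlines():
        label, _, value = line.partition(":")
        if label.strip() == "Latest checkpoint's REDO WAL file":
            return value.strip()
    raise ValueError("REDO WAL file line not found in pg_controldata output")


def segments_older_than(file_names, redo_file):
    """Return segment names older than redo_file, oldest first.

    History files, backup labels and the archive_status directory are
    skipped because they do not match the plain segment-name pattern.
    """
    timeline = redo_file[:8]
    return sorted(
        name
        for name in file_names
        if SEGMENT_NAME.fullmatch(name)
        and name[:8] == timeline
        and name < redo_file
    )
```

The result is only a list of candidates: each one still has to be confirmed as safely archived before it is removed.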

- Heikki


Re: PG_XLOG 27028 files running out of space

From
Albe Laurenz
Date:
Tory M Blue wrote:
> My postgres db ran out of space. I have 27028 files in the pg_xlog directory. I'm unclear what
> happened this has been running flawless for years. I do have archiving turned on and run an archive
> command every 10 minutes.
>
> I'm not sure how to go about cleaning this up, I got the DB back up, but I've only got 6gb free on
> this drive and it's going to blow up, if I can't relieve some of the stress from this directory over
> 220gb.

> Postgres 9.1.6
> slon 2.1.2

Are there any messages in the log file?
Are you sure that archiving works, i.e. do WAL files
show up in your archive location?

The most likely explanation for what you observe is that
archive_command returns a non-zero result (fails).
That would lead to a message in the log.
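A quick way to check for such messages is to scan the server log for them. This is only a sketch: the function name is hypothetical, and the default marker text is an assumption about how the failure is worded in the log, so adjust it to whatever your server actually writes.

```python
def archive_failures(log_lines, marker="archive command failed"):
    """Return the log lines that contain the given failure marker.

    The default marker is an assumed wording; pass a different string if
    your log reports archive_command failures differently.
    """
    return [line.rstrip("\n") for line in log_lines if marker in line]
```

It accepts any iterable of lines, so an open log file can be passed in directly.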

Yours,
Laurenz Albe


Re: PG_XLOG 27028 files running out of space

From
Tory M Blue
Date:


On Thu, Feb 14, 2013 at 3:08 AM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
> On 14.02.2013 12:49, Tory M Blue wrote:
>> My postgres db ran out of space. I have 27028 files in the pg_xlog
>> directory. I'm unclear what happened this has been running flawless for
>> years. I do have archiving turned on and run an archive command every 10
>> minutes.
>>
>> I'm not sure how to go about cleaning this up, I got the DB back up, but
>> I've only got 6gb free on this drive and it's going to blow up, if I can't
>> relieve some of the stress from this directory over 220gb.
>>
>> What are my options?
>
> You'll need to delete some of the oldest xlog files to release disk space. But first you need to make sure you don't delete any files that are still needed, and what got you into this situation in the first place.
>
> You say that you "run an archive command every 10 minutes". What do you mean by that? archive_command specified in postgresql.conf is executed automatically by the system, so you don't need to and should not run that manually. After archive_command has run successfully, and the system doesn't need the WAL file for recovery anymore (ie. after the next checkpoint), the system will delete the archived file to release disk space. Clearly that hasn't been working in your system for some reason. If archive_command doesn't succeed, ie. it returns a non-zero return code, the system will keep retrying forever until it succeeds, without deleting the file. Have you checked the logs for any archive_command errors?
>
> To get out of the immediate trouble, run "pg_controldata", and make note of this line:
>
> Latest checkpoint's REDO WAL file:    000000010000000000000001
>
> Anything older than that file is not needed for recovery. You can delete those, if you have them safely archived.
>
> - Heikki

Thanks, Heikki,

Yes, I misspoke about the archive command, sorry. That was a timeout, and in my haste and disorientation I misread and misspoke. So I'm clear on that.

I'm also past my issue now that I've discovered the problem, but pg_controldata is something I could have used initially in my panic, so I've added that command to my toolbox and appreciate the response!

Thanks
Tory