Discussion: Backing up postgresql database


Backing up postgresql database

From:
Jakov Sosic
Date:
Hi!

I have a problem with WAL files... I have a fairly busy server, with
lots of transactions every second. The database is the backend for a Zabbix
monitoring application, and it writes performance counters for 500+
hosts every few seconds. The DB is approx 150 GB.

I'm trying to implement a "Continuous Archiving" backup solution, but I
have a problem. In 1 hour of copying WAL files to another hard drive,
PostgreSQL generated 304 WAL files of 16 MB each. I mean, that is really
absurd. Here is my config related to WAL:

fsync = on
synchronous_commit = on
wal_sync_method = fsync
wal_buffers = 512kB
wal_writer_delay = 3000ms
commit_delay = 10
commit_siblings = 5
checkpoint_segments = 32
checkpoint_warning = 300s
archive_mode = on
archive_command = 'test ! -f /data/backup/WAL/%f && cp %p /data/backup/WAL/%f'



It is absolutely unacceptable to back up this volume of information.
A quick calculation gives approx 292 GB of data per day, on a database
that is only 150 GB on disk.
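
A minimal sketch of how a figure like this can be checked directly against the
archive directory (assuming the /data/backup/WAL path from the archive_command
above; adjust if the archive lives elsewhere):

  # total size and number of archived segments so far
  du -sh /data/backup/WAL
  ls /data/backup/WAL | wc -l

  # segments archived in the last 24 hours (each one is 16 MB)
  find /data/backup/WAL -type f -mmin -1440 | wc -l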

Where am I making a mistake? Why is PostgreSQL generating so many WAL files?

I'm using 8.3.4 from the official PostgreSQL yum repository on CentOS 5.2.
I've looked for the pg_clearxlogtail program, as suggested in some threads
on this list, but I can't find that program on my hard disk, and Google
isn't helping either.

If you need more information, I will post it.

Thank you.

--
|    Jakov Sosic    |    ICQ: 28410271    |   PGP: 0x965CAE2D   |
=================================================================
| start fighting cancer -> http://www.worldcommunitygrid.org/   |

Re: Backing up postgresql database

From:
"Kevin Grittner"
Date:
>>> Jakov Sosic <jakov.sosic@srce.hr> wrote:
> I have a problem with WAL files... I have a fairly busy server, with
> lots of transactions every second. The database is the backend for a Zabbix
> monitoring application, and it writes performance counters for 500+
> hosts every few seconds.
>
> I'm trying to implement a "Continuous Archiving" backup solution, but
> I have a problem. In 1 hour of copying WAL files to another hard
> drive, PostgreSQL generated 304 WAL files of 16 MB each.

What you are copying is all the information required to restore the
database to the state it was in after the commit of any one of these
transactions.  Out of curiosity, how much space would you have thought
that would take?

> It is absolutely unacceptable to back up this volume of
> information.

Then you need to choose a different backup strategy.  You probably
don't need to keep a lot of these for a long time.  We keep enough to
restore from the latest two base backups, plus the minimum set
required to restore a "snapshot" for the first base backup of each
month.

> A quick calculation gives approx 292 GB of data per day, on a
> database that is only 150 GB on disk.

The WAL file volume will have nothing to do with the size of the
database; rather it reflects all activity which modified the database
over some period of time.

> Where am I making a mistake? Why is PostgreSQL generating so many
> WAL files?

It records each change made to every row in the database.  Do fewer
updates or keep the WAL files for less time?

> I've looked for the pg_clearxlogtail program, as suggested in some
> threads on this list, but I can't find that program on my hard disk,
> and Google isn't helping either.

It's on pgfoundry.org, but no need to go after it.  What it's designed
to help with is improving the compression of WAL files which are sent
before they're full.  It sounds like yours are full.  You should
probably be compressing the WAL files.  They compress down to about 4
MB to 8 MB each when full if you pipe them through gzip.
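
A minimal sketch of what a compressing archive_command could look like, assuming
the same /data/backup/WAL directory as in the original config (illustrative only,
not the exact command being recommended); the matching restore_command then has
to decompress:

  # postgresql.conf -- compress each 16 MB segment as it is archived
  archive_command = 'test ! -f /data/backup/WAL/%f.gz && gzip -c %p > /data/backup/WAL/%f.gz'

  # recovery.conf -- decompress segments on the way back during a restore
  restore_command = 'gunzip -c /data/backup/WAL/%f.gz > %p'

  # note: if the copy fails partway, a truncated .gz can be left behind and the
  # "test ! -f" guard will then block retries; writing to a temporary name and
  # renaming into place avoids that.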

I hope this helps.

-Kevin

Re: Backing up postgresql database

From:
Jakov Sosic
Date:

Kevin Grittner wrote:

> What you are copying is all the information required to restore the
> database to the state it was in after the commit of any one of these
> transactions.  Out of curiosity, how much space would you have thought
> that would take?

Well, I surely didn't think it would take 4.8 TB per day on a 150 GB
database! I thought it would take less than the DB itself. Wouldn't it be
better to simply store SQL commands? :)


> Then you need to choose a different backup strategy.  You probably
> don't need to keep a lot of these for a long time.  We keep enough to
> restore from the latest two base backups, plus the minimum set
> required to restore a "snapshot" for the first base backup of each
> month.

Yes, but if the base backup is every 7 days, that means 40 TB of backups for
two weeks.
Please don't get angry because of my attitude, but I'm new to backup
strategies and that's why I'm puzzled a lot by the volume of
information... I mean, we do have LTO4 tapes, but still :)


> The WAL file volume will have nothing to do with the size of the
> database; rather it reflects all activity which modified the database
> over some period of time.

As I said, almost 500 hosts send about 25,000 values every few
minutes... So that's a lot of inserts/updates...


> It records each change made to every row in the database.  Do fewer
> updates or keep the WAL files for less time?

What do you mean, do fewer updates? DB transactions can't be influenced -
the cause is at the application level, not at the DB level.

And what do you mean by keeping WALs for less time? The daily volume is
still the same... I can decide to keep only a two-week backup, and
not 90 days as is the policy in my company, but as I've mentioned, that's
still lots and lots of data :) I thought that the WALs would be a lot smaller
than the whole DB, because only a small part of the DB's information is
inserted on a daily basis compared to the volume of data gathered through
months of the DB running. Obviously - as you say - I was completely misled
in my assumptions...


> It's on pgfoundry.org, but no need to go after it.  What it's designed
> to help with is improving the compression of WAL files which are sent
> before they're full.  It sounds like yours are full.  You should
> probably be compressing the WAL files.  They compress down to about 4
> MB to 8 MB each when full if you pipe them through gzip.

I've read about gzipping WALs, and I will do it of course, but that only
makes the problem a little smaller, it doesn't solve it :)



--
|    Jakov Sosic    |    ICQ: 28410271    |   PGP: 0x965CAE2D   |
=================================================================
| start fighting cancer -> http://www.worldcommunitygrid.org/   |

Re: Backing up postgresql database

From:
Scott Marlowe
Date:
On Tue, Mar 3, 2009 at 4:49 PM, Jakov Sosic <jakov.sosic@srce.hr> wrote:
> Kevin Grittner wrote:
>
>> What you are copying is all the information required to restore the
>> database to the state it was in after the commit of any one of these
>> transactions.  Out of curiosity, how much space would you have thought
>> that would take?
>
> Well, I surely didn't think it would take 4.8 TB per day on a 150 GB
> database! I thought it would take less than the DB itself. Wouldn't it be
> better to simply store SQL commands? :)

That might not be much smaller if you're doing a lot of updates /
inserts / deletes.  Also, SQL ain't the same as what was committed. If
you insert now() it won't be the same when you run it in two days.

The key here is that it seems like you have a very high churn rate, and if
your backup solution involves saving all those changes and applying
them later, it's gonna incur rather large storage use.

I.e. PITR, as you have it implemented, might not be your best solution.

I do believe there's a warm standby method where you are constantly in
recovery on the target database, but I'm not sure it would work right
here.
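
For reference, on 8.3 that warm-standby approach is usually built with the
contrib pg_standby program on a second machine: the standby stays in recovery
and keeps replaying archived segments as they arrive. A rough sketch of the
standby's recovery.conf, assuming the archive is pushed to /data/backup/WAL on
the standby host:

  # recovery.conf on the standby (8.3 + contrib/pg_standby)
  # waits for each archived segment and replays it; %r lets pg_standby remove
  # segments that are no longer needed for the last restartpoint
  restore_command = 'pg_standby /data/backup/WAL %f %p %r'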

You might be better off with something like Slony, Bucardo, or one of
the other live replication setups that don't have the same
issues you're seeing here.

> Please don't get angry because of my attitude, but I'm new to backup
> strategies and that's why I'm puzzled a lot by the volume of
> information... I mean, we do have LTO4 tapes, but still :)

I don't think anyone's getting angry.  :)

>> It records each change made to every row in the database.  Do fewer
>> updates or keep the WAL files for less time?
>
> What do you mean, do fewer updates? DB transactions can't be influenced -
> the cause is at the application level, not at the DB level.

I think you missed his point.  If you can't do fewer updates, then
your PITR implementation isn't going to work unless you've got 40TB to
dedicate to the job.

> And what do you mean by keeping WALs for less time? The daily volume is
> still the same... I can decide to keep only a two-week backup, and

Reduce the time for keeping old log files to one or two days and do
nightly rsyncs?
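
On 8.3 there is no built-in archive-cleanup tool, so pruning old segments is
typically a cron job. A naive sketch, which is only safe if a verified base
backup newer than the cutoff exists (deleting segments older than your newest
base backup makes it unrestorable):

  # drop archived segments older than two days
  find /data/backup/WAL -type f -mtime +2 -exec rm -f {} \;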

> not 90 days as is the policy in my company, but as I've mentioned, that's
> still lots and lots of data :)

It's probably better to keep pg_dump-type backups for those types of
things.  PITR is a poor challenger to the easily compressed output of
pg_dump for offsite backups, especially over a wire.
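
A minimal sketch of that kind of dump ("zabbix" here is a placeholder for the
real database name; -Fc writes a compressed custom-format archive that
pg_restore can restore selectively, and piping plain pg_dump output through
gzip works just as well):

  # compressed logical backup, taken with the server online
  pg_dump -Fc zabbix > /data/backup/zabbix-$(date +%F).dump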

> I thought that the WALs would be a lot smaller
> than the whole DB, because only a small part of the DB's information is inserted on

If I update a single row in a database 1,000,000,000 times, how many
WAL files will you need?
If you update every row in a 1,000,000,000 row database, how many WAL
files will you need?

> I've read about gzipping WALs, and I will do it of course, but that only
> makes the problem a little smaller, it doesn't solve it :)

I'm wondering what your objectives are in your backup strategy; it
might be that you're coming at it one way when we'd all approach it from
another way.  Or not.

Re: Backing up postgresql database

From:
Jakov Sosic
Date:
Scott Marlowe wrote:

> That might not be much smaller if you're doing a lot of updates /
> inserts / deletes.  Also, SQL ain't the same as what was committed. If
> you insert now() it won't be the same when you run it in two days.

Well, I think I miscalculated a little bit... It's more like 120 GB per
day, not 5 TB... I'll post the correct results tomorrow.
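
(A back-of-the-envelope check, assuming the 304-segments-per-hour rate from the
first message holds all day, lines up with that corrected figure:)

  # 304 segments/hour x 16 MB x 24 hours, expressed in GB
  echo "$((304 * 16 * 24 / 1024)) GB/day"    # prints "114 GB/day"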


> The key here is that it seems like you have a very high churn rate, and if
> your backup solution involves saving all those changes and applying
> them later, it's gonna incur rather large storage use.
>
> I.e. PITR, as you have it implemented, might not be your best solution.



> You might be better off with something like Slony, Bucardo, or one of
> the other live replication setups that don't have the same
> issues you're seeing here.

Replication involves another machine, and more load on the current one, and
that's not what I want.


> It's probably better to keep pg_dump-type backups for those types of
> things.  PITR is a poor challenger to the easily compressed output of
> pg_dump for offsite backups, especially over a wire.

But how long would pg_dump take to run on a DB with this volume of traffic?
And would it kill the machine in that period? I can try that as well
tomorrow...


> If I update a single row in a database 1,000,000,000 times, how many
> WAL files will you need?
> If you update every row in a 1,000,000,000 row database, how many WAL
> files will you need?

Obviously the application is not just appending new data, as I thought
before... Because appends (aka INSERTs) wouldn't cause this volume of
WALs, or would they?


> I'm wondering what your objectives are in your backup strategy; it
> might be that you're coming at it one way when we'd all approach it from
> another way.  Or not.

You are correct on this one. The DB I'm talking about is not important enough
to need a rigorous backup schedule and recovery policies. Rather, I have
hardware limitations, and I have to find the best solution that doesn't
overutilize the hardware and at the same time gives backups as frequently
as possible.



--
|    Jakov Sosic    |    ICQ: 28410271    |   PGP: 0x965CAE2D   |
=================================================================
| start fighting cancer -> http://www.worldcommunitygrid.org/   |

Re: Backing up postgresql database

From:
Scott Marlowe
Date:
On Tue, Mar 3, 2009 at 5:51 PM, Jakov Sosic <jakov.sosic@srce.hr> wrote:
> Scott Marlowe wrote:
>
>> That might not be much smaller if you're doing a lot of updates /
>> inserts / deletes.  Also, SQL ain't the same as what was committed. If
>> you insert now() it won't be the same when you run it in two days.
>
> Well, I think I miscalculated a little bit... It's more like 120 GB per
> day, not 5 TB... I'll post the correct results tomorrow.

Oh, that'll make a big, big difference.  That's like 1/40th the amount
of data you thought you had.  At that rate you can keep a week's data
on a 1 TB drive.

>> The key here is that it seems like you have a very high churn rate, and if
>> your backup solution involves saving all those changes and applying
>> them later, it's gonna incur rather large storage use.
>>
>> I.e. PITR, as you have it implemented, might not be your best solution.
>
>> You might be better off with something like Slony, Bucardo, or one of
>> the other live replication setups that don't have the same
>> issues you're seeing here.
>
> Replication involves another machine, and more load on the current one, and
> that's not what I want.

What you're doing right now is replication, so I'd expect it to involve
another machine, since if this one catches fire or the PSU burns out all
the hard drives or something, you'll need the logs on another machine
to ensure you can bring your DB back up.  As long as the replication
is asynchronous and recovers well from falling behind, the secondary
machine need not be a monster.  I would assume your DB usage has
some kind of pattern that is lower at some points than at others, but
if it's always inserting 24/7 at high rates then yeah, you'd need a
secondary machine as fast as your primary to keep up.

>> It's probably better to keep pg_dump-type backups for those types of
>> things.  PITR is a poor challenger to the easily compressed output of
>> pg_dump for offsite backups, especially over a wire.
>
> But how long would pg_dump take to run on a DB with this volume of traffic?
> And would it kill the machine in that period? I can try that as well
> tomorrow...

We run pg_dump during lower volume hours, but it generally doesn't
kill the machine.  OTOH, there's only so much IO in a server, and if
you're using nearly all of yours then pg_dump can easily make the
machine bog down.  Always dump to a separate array (and machine) from
the one holding the DB if you can.
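
A sketch of what scheduling that during a quiet window might look like, run
from the backup host rather than the DB server (the host name, database name,
time and paths are all placeholders):

  # crontab on the backup host: compressed dump at 03:30, when traffic is lowest
  # (% must be escaped as \% inside a crontab entry)
  30 3 * * *  pg_dump -h dbserver zabbix | gzip > /backups/zabbix-$(date +\%F).sql.gz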

Note you can also use a recovered PITR machine to make a backup.  If
you had them alternating you could take nightly backups and not affect
your primary DB.  But that depends on how long your rsyncs take.
Given a fast RAID array on the DB server, and GigE, you should be able
to run a backup off-hours in the several-hundred-megs-a-second
range.

>> If I update a single row in a database 1,000,000,000 times, how many
>> WAL files will you need?
>> If you update every row in a 1,000,000,000 row database, how many WAL
>> files will you need?
>
> Obviously the application is not just appending new data, as I thought
> before... Because appends (aka INSERTs) wouldn't cause this volume of
> WALs, or would they?

Sure it would, but it would also cause your main DB's on-disk storage to
grow as well.  In pgsql, update = delete + insert.

>> I'm wondering what your objectives are in your backup strategy; it
>> might be that you're coming at it one way when we'd all approach it from
>> another way.  Or not.
>
> You are correct on this one. The DB I'm talking about is not important enough
> to need a rigorous backup schedule and recovery policies. Rather, I have
> hardware limitations, and I have to find the best solution that doesn't
> overutilize the hardware and at the same time gives backups as frequently
> as possible.

Well, one of the objectives of PITR is to have as close to up-to-the-second
replication as possible with minimal overhead.  After all, the
WAL files have to be created anyway, so we may as well use them.
Understood on the hardware.  It's one thing to be asked to stretch the
hardware budget, but it can only stretch so far.  Make sure you know what
your objectives are in the project, and see if they're realistic in light
of your hardware restrictions.  You say the DB isn't important, but
how expensive would the data on it be to replace if you had no
backups, or week-old backups, or month-old backups, etc.?  Data has a
value, and if you save $2,000 on a server and lose $25,000 worth of
data, someone somewhere made a bad choice.

Re: Backing up postgresql database

From:
"Kevin Grittner"
Date:
>>> Jakov Sosic <jakov.sosic@srce.hr> wrote:
> What do you mean, do fewer updates?

Well, without knowing the application it's hard for me to say; but, as
a guess, perhaps the application could be modified to accumulate, say,
a minute's worth of data and update it in summary, rather than
updating each sample once per second.  That would leave you vulnerable
to losing up to a minute's worth of data, but since one summary write per
minute replaces sixty per-sample updates, it would cut the rate of WAL
generation to less than 2% of its current rate.

> And what do you mean by keeping WALs for less time?

Do base backups more frequently.  If you need to keep more than two
base backups' worth, only keep "snapshots" of older backups -- just
the base and enough WAL files to cover the range specified in the
*.backup file generated by the start and stop functions run during the
base backup.
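
A rough sketch of that kind of base backup on 8.3 (the label is arbitrary, and
the paths here are only illustrative; /var/lib/pgsql/data is the usual data
directory on CentOS):

  # 1. mark the start of the base backup (forces a checkpoint, writes a backup label)
  psql -c "SELECT pg_start_backup('weekly_base');"

  # 2. copy the data directory while the server keeps running
  rsync -a /var/lib/pgsql/data/ /data/backup/base/$(date +%F)/

  # 3. finish; this creates the *.backup history file in the WAL archive that
  #    records exactly which segments this snapshot needs
  psql -c "SELECT pg_stop_backup();"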

> I've read about gzipping WALs, and I will do it of course, but that
> only makes the problem a little smaller, it doesn't solve it :)

Well, cutting the scope of a problem by two thirds is sometimes
enough.  If you combine that with the summary updates mentioned above,
you might reduce the space needed to store WAL files to less than 1%
of current requirements.  Judicious use of the snapshot technique
could let you keep a few snapshots going back 90 days and only require
a small fraction of 1% of what you're thinking it will take based on
current numbers.

-Kevin