Re: Using pg_start_backup() and pg_stop_backup()

Поиск
Список
Период
Сортировка
От David B Harris
Тема Re: Using pg_start_backup() and pg_stop_backup()
Дата
Msg-id 20130717012109.GK24198@valiant.tachsys.net
обсуждение исходный текст
Ответ на Re: Using pg_start_backup() and pg_stop_backup()  (David B Harris <dbharris@eelf.ddts.net>)
Ответы Re: Using pg_start_backup() and pg_stop_backup()  (John R Pierce <pierce@hogranch.com>)
Список pgsql-general
Yep, my tests were broken :)

Thanks :)

Though it's still possible that the backup would be fine if
wal_keep_segments is high enough - might be good if section 24.3
explicitly mentioned as much.

Actually though (if any PostgreSQL developers are paying attention), it
might be useful to have a new WAL segment-managing behaviour. With the
advent of the replication functionality (which is amazing stuff, thanks
so much), I'd expect fewer and fewer installations to use WAL archiving.
If WAL archiving is disabled, it might make sense for pg_start_backup()
to postpone the deletion of WAL segments until pg_stop_backup().

(Hm I might file that as a feature request. Won't really help me today,
but it would have made this backup stuff I'm doing about half as complex
as it's going to have to be.)

Thanks again,

David

On Tue Jul 16, 08:35pm -0400, David B Harris wrote:
> On Wed Jul 17, 09:16am +0900, Michael Paquier wrote:
> > On Wed, Jul 17, 2013 at 8:24 AM, David B Harris <dbharris@eelf.ddts.net> wrote:
> > > On Wed Jul 17, 08:12am +0900, Michael Paquier wrote:
> > >      4. In parallel, WAL archiving has copied all the logs from the
> > >         start of pg_start_backup() through to the end of pg_stop_backup()
> > Yes exactly. This avoids to have to maintain your own scripts and rely
> > on the server features... What do you actually do to copy the
> > necessary WAL files. Do you fetch them directly from the master's
> > pg_xlog folder?
>
> Due to bandwidth constraints separate scripts are going to need to be
> maintained anyways. I'd love to use pg_basebackup, but since it
> transfers the entire cluster each time, it's not suitable - something
> like rsync is going to be used instead. Actually the use of
> pg_basebackup would require a separate script anyway, so regardless. If
> the WAL files do need to be copied, I'll be copying them separately
> yeah.
>
> (I'll use archive_command that only runs when backup_label is in place
> [pg_startup() has been called], put them somewhere safe during the
> cluster rsync [probably outside of the cluster directory], transfer them
> after pg_stop_backup(), and then delete them on the source/master
> server.)
>
> > > A configuration like this is provided like this in the documentation
> > > (again section 24.3), though some assembly is required. I'm hoping to
> > > avoid it since it'll be introducing a second channel to the backup which
> > > I can almost guarantee will be forgotten in time. (BTW, we tend to
> > > consider the lifespan of our installations in terms of decades.)
> > >
> > > I mean, if it's necessary, the so be it of course, but the question
> > > still remains: I've tested a fair bit under fairly adverse conditions
> > > and not had a single failure. Luck?
> >
> > It looks so, and wal_keep_segments is set to a value high enough on
> > the master side so as all the necessary WAL files are kept intact
> > somewhere even if they are not archived.
>
> I'm going to re-run my tests again, but here's what I did:
>
>     1. Set wal_keep_segments to a small value (I think I used 8)
>     2. Create a new database with a small canary table
>     2. pg_start_backup()
>     3. Run a bunch of database drops/loads and table drops/loads (with
>        pg_restore and COPY .. FROM, respectively) ... for hours and
>        hours (writing tens of GBs of data, without question cycling
>        through wal_keep_segments)
>     4. Take a tarball of the cluster (including pg_xlog/)
>     5. pg_stop_backup()
>     6. Restore from tarball, start cluster
>     7. Check for canary table (successfully)
>
> Separately, I also prototyped the backup and I've run it dozens of times
> occasionally with a heavy write load on the source/master server without
> errors. Now the write load was likely never enough to saturate all
> allocatable WAL segments, but given the documentation (which says that
> the _last_ WAL segment which is synced by pg_stop_backup() must be
> copied as well), I'd have expected a consistency error or similar.
>
> Since somebody has now said "no you're crazy" (paraphrased :), I'm going
> to double-check again.
>
> I didn't mention in the original email, but of course one possibility is
> that the documentation might be incomplete simply in that if
> wal_keep_segments is set high enough such that the earliest segment
> isn't overwritten while the rsync is running, everything would be fine.
> But (again if I'm reading the documentation correctly), I think there
> should be consistency errors (again unless I'm just lucky). Maybe if
> there's an incomplete WAL transaction/segment/whatever, it just silently
> ignores it. (This would of course make sense.)
>
> I'll run the tests again and this time not use a canary table. It occurs
> to me that what I might be seeing is data files in the cluster not being
> deleted. If I'm dropping databases and/or tables, my base backup may be
> getting the old ones but the new ones might be inconsistent/broken.
> (Though I'd still expect a "cannot replay log" error of some sort at
> cluster startup.)

--
     Arguing with an engineer is like wrestling with a pig in mud.
           After a while, you realise the pig is enjoying it.

                   OpenPGP v4 key ID: 4096R/59DDCB9F
    Fingerprint: CC53 F124 35C0 7BC2 58FE  7A3C 157D DFD9 59DD CB9F
                     Retrieve from subkeys.pgp.net


В списке pgsql-general по дате отправления:

Предыдущее
От: David B Harris
Дата:
Сообщение: Re: Using pg_start_backup() and pg_stop_backup()
Следующее
От: John R Pierce
Дата:
Сообщение: Re: Using pg_start_backup() and pg_stop_backup()