Обсуждение: pgBackRest backup from standby

Поиск
Список
Период
Сортировка

pgBackRest backup from standby

От
Don Seiler
Дата:
Evening all.

Looking to use pgBackRest to take a backup from a hot standby. I'm reading that pgBackRest still needs to connect to the primary and copy some files. My questions are:

  1. What files does it need to copy? Config files? WAL files?
  2. How does it connect? SSH?
  3. Does pgBackRest need to be installed and configured on the primary as well?
Thanks,
Don.

--
Don Seiler
www.seiler.us

Re: pgBackRest backup from standby

От
Michael Paquier
Дата:
On Sun, Feb 18, 2018 at 06:34:46PM -0600, Don Seiler wrote:
> Looking to use pgBackRest to take a backup from a hot standby. I'm reading
> that pgBackRest still needs to connect to the primary and copy some files.
> My questions are:
>
>
>    1. What files does it need to copy? Config files? WAL files?
>    2. How does it connect? SSH?
>    3. Does pgBackRest need to be installed and configured on the primary as
>    well?

I am adding in CC: Stephen Frost and David Steele who work on the took.
You may want to contact the maintainers directly through github where
the project is maintained:
https://github.com/pgbackrest/pgbackrest
--
Michael

Вложения

Re: pgBackRest backup from standby

От
Don Seiler
Дата:
On Sun, Feb 18, 2018 at 6:43 PM, Michael Paquier <michael@paquier.xyz> wrote:
I am adding in CC: Stephen Frost and David Steele who work on the took.

I assumed Stephen was already on this list, and I communicate with him regularly on Slack as well but just throwing this out there on a Sunday night.
 
You may want to contact the maintainers directly through github where
the project is maintained:
https://github.com/pgbackrest/pgbackrest

Is that the place to just ask questions? I wasn't sure that "Issues" meant just questions versus problems or bugs. I didn't see any mention of a forum or list on their website, and there have been lots of pgBackRest questions on this list in the past so I settled on this one.

Don. 

--
Don Seiler
www.seiler.us

Re: pgBackRest backup from standby

От
Michael Paquier
Дата:
On Sun, Feb 18, 2018 at 06:48:30PM -0600, Don Seiler wrote:
> On Sun, Feb 18, 2018 at 6:43 PM, Michael Paquier <michael@paquier.xyz> wrote:
>> You may want to contact the maintainers directly through github where
>> the project is maintained:
>> https://github.com/pgbackrest/pgbackrest
>
> Is that the place to just ask questions? I wasn't sure that "Issues" meant
> just questions versus problems or bugs. I didn't see any mention of a forum
> or list on their website, and there have been lots of pgBackRest questions
> on this list in the past so I settled on this one.

Stephen and David are around all the time, so it is not really an issue
to discuss things related to pgBackRest on this list I guess :)
Attaching related folks directly in CC: also usually helps.

You may get faster feedback by opening an issue directly on github
though, as there are a lot of emails on -general so it is easy to get
things lost.  My 2c.
--
Michael

Вложения

Re: pgBackRest backup from standby

От
David Steele
Дата:
Hi Don,

On 2/18/18 7:34 PM, Don Seiler wrote:
> 
> Looking to use pgBackRest to take a backup from a hot standby. I'm
> reading that pgBackRest still needs to connect to the primary and copy
> some files. My questions are:
> 
>  1. What files does it need to copy? Config files? WAL files?

It copies files that are not replicated from the primary so that a
primary-style backup is created. Anything that is replicated (which is
by far the bulk of the data) is copied from the standby.

>  2. How does it connect? SSH?

Yes.

>  3. Does pgBackRest need to be installed and configured on the primary
>     as well?

Yes.  Anyway, it's best to archive from the primary so a replication
failure does not affect your archiving.

Configuring pgBackRest, SSH, standby, and backup from standby are all
covered in the user guide.

http://pgbackrest.org/user-guide.html

In particular:

http://pgbackrest.org/user-guide.html#backup-host
http://pgbackrest.org/user-guide.html#replication
http://pgbackrest.org/user-guide.html#standby-backup

Regards,
-- 
-David
david@pgmasters.net


Re: pgBackRest backup from standby

От
David Steele
Дата:
On 2/18/18 10:20 PM, Michael Paquier wrote:
> On Sun, Feb 18, 2018 at 06:48:30PM -0600, Don Seiler wrote:
>> On Sun, Feb 18, 2018 at 6:43 PM, Michael Paquier <michael@paquier.xyz> wrote:
>>> You may want to contact the maintainers directly through github where
>>> the project is maintained:
>>> https://github.com/pgbackrest/pgbackrest
>>
>> Is that the place to just ask questions? I wasn't sure that "Issues" meant
>> just questions versus problems or bugs. I didn't see any mention of a forum
>> or list on their website, and there have been lots of pgBackRest questions
>> on this list in the past so I settled on this one.
>
> Stephen and David are around all the time, so it is not really an issue
> to discuss things related to pgBackRest on this list I guess :)
> Attaching related folks directly in CC: also usually helps.
>
> You may get faster feedback by opening an issue directly on github
> though, as there are a lot of emails on -general so it is easy to get
> things lost.  My 2c.

Either is fine with me, but as Michael says I might miss postings to
-general.  I'm sure somebody else would catch it, though.

--
-David
david@pgmasters.net


Вложения

Re: pgBackRest backup from standby

От
Don Seiler
Дата:
On Mon, Feb 19, 2018 at 8:18 AM, David Steele <david@pgmasters.net> wrote:
It copies files that are not replicated from the primary so that a
primary-style backup is created. Anything that is replicated (which is
by far the bulk of the data) is copied from the standby.

OK so all data files would be copied from standby. Can you give me an example of the types of files that need to be copied from primary?
 
it's best to archive from the primary so a replication
failure does not affect your archiving.

Understood, just not something I can change in production primary at the moment. Hence looking to see about a quick one-off backup from standby.
 
Configuring pgBackRest, SSH, standby, and backup from standby are all
covered in the user guide.

Thanks, I've been through it a few times and played with some test backups from primary clones. I just ditched my master/replica clone setup but I'll test there as well. I just had a couple questions about the mechanics.



--
Don Seiler
www.seiler.us

Re: pgBackRest backup from standby

От
Don Seiler
Дата:
On Mon, Feb 19, 2018 at 8:23 AM, David Steele <david@pgmasters.net> wrote:

Either is fine with me, but as Michael says I might miss postings to
-general.  I'm sure somebody else would catch it, though.

OK, I'll make use of the issues tracker going forward.

--
Don Seiler
www.seiler.us

Re: pgBackRest backup from standby

От
David Steele
Дата:
Hi Don,

On 2/19/18 9:25 AM, Don Seiler wrote:
> On Mon, Feb 19, 2018 at 8:18 AM, David Steele <david@pgmasters.net
> <mailto:david@pgmasters.net>> wrote:
> 
>     It copies files that are not replicated from the primary so that a
>     primary-style backup is created. Anything that is replicated (which is
>     by far the bulk of the data) is copied from the standby.
> 
> OK so all data files would be copied from standby. Can you give me an
> example of the types of files that need to be copied from primary?
>  

Anything *not* in global (except pg_control), base, pg_tblspc,
pg_xact/pg_clog, and pg_multixact are copied from the primary.

For example, pg_stat is copied from the primary so these stats are
preserved on a standby backup.

pgBackRest uses all the same exclusions as pg_basebackup, so many
dirs/files are not copied at all: pg_dynshmem, pg_notify, pg_replslot,
pg_serial, pg_snapshots, pg_stat_tmp, pg_subtrans, etc.

Full list here
https://www.postgresql.org/docs/10/static/protocol-replication.html.

>     it's best to archive from the primary so a replication
>     failure does not affect your archiving.
> 
> Understood, just not something I can change in production primary at the
> moment. Hence looking to see about a quick one-off backup from standby.

For a quick one-off, pg_basebackup is your friend.

Regards,
-- 
-David
david@pgmasters.net


Re: pgBackRest backup from standby

От
Don Seiler
Дата:
On Mon, Feb 19, 2018 at 8:53 AM, David Steele <david@pgmasters.net> wrote:
Anything *not* in global (except pg_control), base, pg_tblspc,
pg_xact/pg_clog, and pg_multixact are copied from the primary.

For example, pg_stat is copied from the primary so these stats are
preserved on a standby backup.

So if I have tablespaces outside of $PGDATA (but symlinked from within pg_tblspc, of course), those will still be backed up from the standby, right?

Is it right to say that the files that would be copied from primary are very small, typically? So it isn't a huge transfer over the WAN (in my case)?

 
pgBackRest uses all the same exclusions as pg_basebackup, so many
dirs/files are not copied at all: pg_dynshmem, pg_notify, pg_replslot,
pg_serial, pg_snapshots, pg_stat_tmp, pg_subtrans, etc.

Full list here
https://www.postgresql.org/docs/10/static/protocol-replication.html.

>     it's best to archive from the primary so a replication
>     failure does not affect your archiving.
>
> Understood, just not something I can change in production primary at the
> moment. Hence looking to see about a quick one-off backup from standby.

For a quick one-off, pg_basebackup is your friend.

One of the requirements of this backup is encryption, which I don't see any notes for with pg_basebackup. Also due to the size, parallel workers. pgBackRest gives me both of these. I need compression as well but that pg_basebackup does do.

I did come up with a sort of Rube Goldberg-esque workaround for now involving using a clone of the prod standby VM from Veeam backup to use as the backup source (after stopping recovery and opening it as a standalone DB).

Don. 

--
Don Seiler
www.seiler.us

Re: pgBackRest backup from standby

От
David Steele
Дата:
Hi Don,

On 2/19/18 10:01 AM, Don Seiler wrote:
> On Mon, Feb 19, 2018 at 8:53 AM, David Steele <david@pgmasters.net
> <mailto:david@pgmasters.net>> wrote:
> 
>     Anything *not* in global (except pg_control), base, pg_tblspc,
>     pg_xact/pg_clog, and pg_multixact are copied from the primary.
> 
>     For example, pg_stat is copied from the primary so these stats are
>     preserved on a standby backup.
> 
> So if I have tablespaces outside of $PGDATA (but symlinked from within
> pg_tblspc, of course), those will still be backed up from the standby,
> right?

Correct.

> Is it right to say that the files that would be copied from primary are
> very small, typically? So it isn't a huge transfer over the WAN (in my
> case)?

Yes, they are typically very small.  The general exception to this rule
is if logs are stored in pg_log.  I recommend storing logs out of the
PGDATA dir as they can be quite large and don't really make sense to
restore to another server.

Files copied from the master will be marked as such in backup.manifest
(master:true) so you can check for yourself.

> I did come up with a sort of Rube Goldberg-esque workaround for now
> involving using a clone of the prod standby VM from Veeam backup to use
> as the backup source (after stopping recovery and opening it as a
> standalone DB).

You don't get PITR that way, of course, but at least it's a backup.  As
long as your clone is consistent.

-- 
-David
david@pgmasters.net


Re: pgBackRest backup from standby

От
Don Seiler
Дата:
On Mon, Feb 19, 2018 at 9:21 AM, David Steele <david@pgmasters.net> wrote:

Yes, they are typically very small.  The general exception to this rule
is if logs are stored in pg_log.  I recommend storing logs out of the
PGDATA dir as they can be quite large and don't really make sense to
restore to another server.

Files copied from the master will be marked as such in backup.manifest
(master:true) so you can check for yourself.

Good to know. And fortunately for this DB we do have pg_log (and pg_xlog) symlinked to different volumes outside of $PGDATA.

> I did come up with a sort of Rube Goldberg-esque workaround for now
> involving using a clone of the prod standby VM from Veeam backup to use
> as the backup source (after stopping recovery and opening it as a
> standalone DB).

You don't get PITR that way, of course, but at least it's a backup.  As
long as your clone is consistent.

Yes it's a crash-consistent snapshot-based backup. I've done quite a few restores from it and it works great. It can do PITR as well since I would have all the WAL files from prod needed to keep recovering. But for these cases I just recover it to the first consistent point and open it for testing (or backups in this case). 

Thanks for all your help!

Don.

--
Don Seiler
www.seiler.us

Re: pgBackRest backup from standby

От
David Steele
Дата:
On 2/19/18 10:32 AM, Don Seiler wrote:
> On Mon, Feb 19, 2018 at 9:21 AM, David Steele <david@pgmasters.net
> <mailto:david@pgmasters.net>> wrote:
> 
> 
>     Yes, they are typically very small.  The general exception to this rule
>     is if logs are stored in pg_log.  I recommend storing logs out of the
>     PGDATA dir as they can be quite large and don't really make sense to
>     restore to another server.
> 
>     Files copied from the master will be marked as such in backup.manifest
>     (master:true) so you can check for yourself.
> 
> 
> Good to know. And fortunately for this DB we do have pg_log (and
> pg_xlog) symlinked to different volumes outside of $PGDATA.

If pg_log is symlinked to PGDATA it will be copied.  pg_xlog is not
copied in any backup.

> 
>     > I did come up with a sort of Rube Goldberg-esque workaround for now
>     > involving using a clone of the prod standby VM from Veeam backup to use
>     > as the backup source (after stopping recovery and opening it as a
>     > standalone DB).
> 
>     You don't get PITR that way, of course, but at least it's a backup.  As
>     long as your clone is consistent.
> 
> 
> Yes it's a crash-consistent snapshot-based backup. I've done quite a few
> restores from it and it works great. It can do PITR as well since I
> would have all the WAL files from prod needed to keep recovering. But
> for these cases I just recover it to the first consistent point and open
> it for testing (or backups in this case). 

I don't think it would be safe to do PITR on a backup taken in this way.
 The WAL diverges even if you suppress a timeline switch.

-- 
-David
david@pgmasters.net


Re: pgBackRest backup from standby

От
Simon Riggs
Дата:
On 19 February 2018 at 16:17, David Steele <david@pgmasters.net> wrote:
>>     > I did come up with a sort of Rube Goldberg-esque workaround for now
>>     > involving using a clone of the prod standby VM from Veeam backup to use
>>     > as the backup source (after stopping recovery and opening it as a
>>     > standalone DB).
>>
>>     You don't get PITR that way, of course, but at least it's a backup.  As
>>     long as your clone is consistent.
>>
>>
>> Yes it's a crash-consistent snapshot-based backup. I've done quite a few
>> restores from it and it works great. It can do PITR as well since I
>> would have all the WAL files from prod needed to keep recovering. But
>> for these cases I just recover it to the first consistent point and open
>> it for testing (or backups in this case).
>
> I don't think it would be safe to do PITR on a backup taken in this way.

If you have all the WAL files, then it would be safe.

>  The WAL diverges even if you suppress a timeline switch.

Which is exactly why we have timelines.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: pgBackRest backup from standby

От
David Steele
Дата:
On 2/19/18 11:29 AM, Simon Riggs wrote:
> On 19 February 2018 at 16:17, David Steele <david@pgmasters.net> wrote:
>>>     > I did come up with a sort of Rube Goldberg-esque workaround for now
>>>     > involving using a clone of the prod standby VM from Veeam backup to use
>>>     > as the backup source (after stopping recovery and opening it as a
>>>     > standalone DB).
>>>
>>>     You don't get PITR that way, of course, but at least it's a backup.  As
>>>     long as your clone is consistent.
>>>
>>>
>>> Yes it's a crash-consistent snapshot-based backup. I've done quite a few
>>> restores from it and it works great. It can do PITR as well since I
>>> would have all the WAL files from prod needed to keep recovering. But
>>> for these cases I just recover it to the first consistent point and open
>>> it for testing (or backups in this case).
>>
>> I don't think it would be safe to do PITR on a backup taken in this way.
> 
> If you have all the WAL files, then it would be safe.

I read "open it for testing (or backups in this case)" as letting
recovery complete and promoting the cluster to a master before taking
the backup.

Don, is that the case?  If it is, I think there's a problem with or
without a timeline switch.  If you confirm the backup is being taken as
above then I'll detail my concerns.

-- 
-David
david@pgmasters.net


Re: pgBackRest backup from standby

От
Don Seiler
Дата:
On Mon, Feb 19, 2018 at 12:39 PM, David Steele <david@pgmasters.net> wrote:

I read "open it for testing (or backups in this case)" as letting
recovery complete and promoting the cluster to a master before taking
the backup.

Don, is that the case?  If it is, I think there's a problem with or
without a timeline switch.  If you confirm the backup is being taken as
above then I'll detail my concerns.

Note that this is just for creating a couple of one-off backups to restore for our dev and pre-prod environments. Given that, I was going to open a new clone as its own cluster and take backups from that. The data would be the same though and suit purposes of the dev and pre-prod refreshes.

If I were taking backups for the purpose of production backups, I would not do things this way. That is the eventual plan but right now we aren't ready to make the changes necessary in the production environment.

--
Don Seiler
www.seiler.us

Re: pgBackRest backup from standby

От
David Steele
Дата:
On 2/19/18 2:05 PM, Don Seiler wrote:
> On Mon, Feb 19, 2018 at 12:39 PM, David Steele <david@pgmasters.net
> <mailto:david@pgmasters.net>> wrote:
> 
> 
>     I read "open it for testing (or backups in this case)" as letting
>     recovery complete and promoting the cluster to a master before taking
>     the backup.
> 
>     Don, is that the case?  If it is, I think there's a problem with or
>     without a timeline switch.  If you confirm the backup is being taken as
>     above then I'll detail my concerns.
> 
> 
> Note that this is just for creating a couple of one-off backups to
> restore for our dev and pre-prod environments. Given that, I was going
> to open a new clone as its own cluster and take backups from that. The
> data would be the same though and suit purposes of the dev and pre-prod
> refreshes.
> 
> If I were taking backups for the purpose of production backups, I would
> not do things this way. That is the eventual plan but right now we
> aren't ready to make the changes necessary in the production environment.

OK, that's fine then.  You can play these to consistency and they'll be
fine.  I just wouldn't try to do any PITR using the production WAL archive.

-- 
-David
david@pgmasters.net


Re: pgBackRest backup from standby

От
Don Seiler
Дата:
On Mon, Feb 19, 2018 at 10:17 AM, David Steele <david@pgmasters.net> wrote:
If pg_log is symlinked to PGDATA it will be copied.  pg_xlog is not
copied in any backup.

So an external pg_log directory symlinked into $PGDATA will have its log contents copied?

I'm curious, why even copy the pg_log logs? They aren't needed for database restore or recovery.

Don. 
--
Don Seiler
www.seiler.us

Re: pgBackRest backup from standby

От
David Steele
Дата:
On 2/19/18 3:41 PM, Don Seiler wrote:
> On Mon, Feb 19, 2018 at 10:17 AM, David Steele <david@pgmasters.net
> <mailto:david@pgmasters.net>> wrote:
> 
>     If pg_log is symlinked to PGDATA it will be copied.  pg_xlog is not
>     copied in any backup.
> 
> 
> So an external pg_log directory symlinked into $PGDATA will have its log
> contents copied?

Yes.

> I'm curious, why even copy the pg_log logs? They aren't needed for
> database restore or recovery.

The general philosophy is to copy everything except what we know for
sure can be excluded.  In practice, this means sticking to what
pg_basebackup excludes because that list is vetted by the community.

Also, relocating the log directory is easy using the log_directory
setting, so that's what I recommend if it's an issue.  Some users do
want to backup their logs.

-- 
-David
david@pgmasters.net


Re: pgBackRest backup from standby

От
Don Seiler
Дата:
On Mon, Feb 19, 2018 at 2:53 PM, David Steele <david@pgmasters.net> wrote:

Also, relocating the log directory is easy using the log_directory
setting, so that's what I recommend if it's an issue.  Some users do
want to backup their logs.


That's probably a lot better idea than symlinking anyway. I'll look to do that in my next round of config changes.



--
Don Seiler
www.seiler.us

Re: pgBackRest backup from standby

От
Martin Marques
Дата:
El 19/02/18 a las 17:53, David Steele escribió:
> On 2/19/18 3:41 PM, Don Seiler wrote:
> 
>> I'm curious, why even copy the pg_log logs? They aren't needed for
>> database restore or recovery.
> 
> Also, relocating the log directory is easy using the log_directory
> setting, so that's what I recommend if it's an issue.  Some users do
> want to backup their logs.

I was about to reply to Don about this, but David's reply was very clear.

If you don't want the logs inside pgdata (a very wise idea, specially if
you place them on a different partition), just change the configuration GUC.

Regards,

-- 
Martín Marqués                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services