Re: [HACKERS] pg_stop_backup(wait_for_archive := true) on standbyserver
От | David Steele |
---|---|
Тема | Re: [HACKERS] pg_stop_backup(wait_for_archive := true) on standbyserver |
Дата | |
Msg-id | bdd9cf10-879b-7066-dbc7-1f3983627bf2@pgmasters.net обсуждение исходный текст |
Ответ на | Re: [HACKERS] pg_stop_backup(wait_for_archive := true) on standbyserver (Stephen Frost <sfrost@snowman.net>) |
Ответы |
Re: [HACKERS] pg_stop_backup(wait_for_archive := true) on standbyserver
(David Steele <david@pgmasters.net>)
|
Список | pgsql-hackers |
On 7/24/17 12:28 PM, Stephen Frost wrote: > > * David Steele (david@pgmasters.net) wrote: > >> While this patch brings pg_stop_backup() in line with the >> documentation, it also introduces a behavioral change compared to >> 9.6. Currently, the default 9.6 behavior on a standby is to return >> immediately from pg_stop_backup(), but this patch will cause it to >> block by default instead. Since action on the master may be >> required to unblock the process, I see this as a pretty significant >> change. I still think we should fix and backpatch, but I wanted to >> point this out. > > This will need to be highlighted in the release notes for 9.6.4 also, > assuming there is agreement to back-patch this, and we'll need to update > the documentation in 9.6 also to be clearer about what happens on a > standby. Agreed. >> "If the WAL segments required for this backup have not been archived >> then it might be necessary to force a segment switch on the >> primary." > > I'm a bit on the fence regarding the distinction here for the > backup-from-standby and this errhint(). The issue is that the WAL for > the backup hasn't been archived yet and that may be because of either: > > archive_command is failing > OR > No WAL is getting generated > > In either case, both of these apply to the primary and the standby. As > such, I'm inclined to try to mention both in the original errhint() > instead of making two different ones. I'm open to suggestions on this, > of course, but my thinking would be more like: > > ----- > Either archive_command is failing or not enough WAL has been generated > to require a segment switch. Run pg_switch_wal() to request a WAL > switch and monitor your logs to check that your archive_command is > executing properly. > ----- Yes, that seems more concise. I like the idea of not having to maintain two separate hints. > And then change pg_switch_wal()'s errhint for when it's run on a replica > to be: > > ---- > HINT: WAL control functions cannont be executed during recovery; they > should be executed on the primary instead. > ---- Looks good to me. This explanation is useful in general. >> 2) At backup.sgml L1015 it says that pg_stop_backup() will >> automatically switch the WAL segment. There should be a caveat here >> for standby backups, like: >> >> When called on a master it terminates the backup mode and performs >> an automatic switch to the next WAL segment. The reason for the >> switch is to arrange for the last WAL segment written during the >> backup interval to be ready to archive. When called on a standby >> the WAL segment switch must be performed manually on the master if >> it does not happen due to normal write activity. > > s/master/primary/g Yes. >> 3) The fact that this fix requires "archive_mode = always" seems >> like a pretty big caveat, thought I don't have any ideas on how to >> make it better without big code changes. Maybe it would be help to >> change: >> >> + the backup is taken on a standby, <function>pg_stop_backup</> waits >> + for WAL to be archived when <varname>archive_mode</> >> >> To: >> >> + the backup is taken on a standby, <function>pg_stop_backup</> waits >> + for WAL to be archived *only* when <varname>archive_mode</> > > I'm thinking of rewording that a bit to say "When archive_mode = always" > instead, as I think that might be a little clearer. Sure. >> Perhaps this should be noted in the pg_basebackup docs as well? > > That seems like it's probably a good idea too, as people might wonder > why pg_basebackup hasn't finished yet if it's waiting for WAL to be > archived. Yes, and this is another behavioral change to consider -- one that is more likely to impact users than the change to pg_stop_backup(). If pg_basebackup is not currently waiting for WAL on a standby (but does on a primary) then that's pretty serious. Any thoughts on this, Magnus? -- -David david@pgmasters.net
В списке pgsql-hackers по дате отправления:
Предыдущее
От: Thomas MunroДата:
Сообщение: Re: [HACKERS] Buildfarm failure and dubious coding in predicate.c
Следующее
От: Michael PaquierДата:
Сообщение: Re: [HACKERS] pg_stop_backup(wait_for_archive := true) on standby server