Re: [BUG] pg_basebackup from disconnected standby fails
От | Kyotaro HORIGUCHI |
---|---|
Тема | Re: [BUG] pg_basebackup from disconnected standby fails |
Дата | |
Msg-id | 20160721.171950.15026313.horiguchi.kyotaro@lab.ntt.co.jp обсуждение исходный текст |
Ответ на | Re: [BUG] pg_basebackup from disconnected standby fails (Michael Paquier <michael.paquier@gmail.com>) |
Ответы |
Re: [BUG] pg_basebackup from disconnected standby fails
(Michael Paquier <michael.paquier@gmail.com>)
|
Список | pgsql-hackers |
Sorry for the absense. I've reached here. At Thu, 21 Jul 2016 12:20:30 +0900, Michael Paquier <michael.paquier@gmail.com> wrote in <CAB7nPqR+krKK9fqwquSJnA2oB-RcQ6Mi8ybz3WL0kCOB5SSZEg@mail.gmail.com> > On Thu, Jul 21, 2016 at 11:56 AM, Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Jul 21, 2016 at 7:28 AM, Michael Paquier > > <michael.paquier@gmail.com> wrote: > >> On Wed, Jul 20, 2016 at 8:56 PM, Michael Paquier > >> <michael.paquier@gmail.com> wrote: > >>>> > >>>> Yeah, I think that is totally different angle to fix this issue, so > >>>> don't you think it is better to start a separate thread to discuss > >>>> about it for 10.0 and mark this patch as ready for committer. > >>> > >>> I'd like to tackle this problem in 10.0, but that will strongly depend > >>> on how my patches move on in CF1 and CF2. > >> > >> By the way, thank you for taking the time to provide input. I think > >> we're in good shape here now. > >> > > > > So, if I understand correctly, then we can mark the version posted by > > you upthread [1] which includes a test along with Kyotaro's fix can be > > marked as Ready for committer. If so, then please change the status > > of patch accordingly. > > Oops. I thought you did it already. So done. Thank you very much for the intensive discussion on this. After some additional consideration, I attached two patches about comment and documentation. 0001- is a patch related to the previous patch that adds description on why we use replay end point instead of minRecoveryPoint. And 0002- is a fix and an additional mention on what would happen when pg_control is not copied last during backup. ==== About the first patch. Related to the previous patch, I found the following comment just above where it applies. xlog.c:10412 - * We return the current minimum recovery point as the backup end - * location. Note that it can be greater than the exact backup end - * location if the minimum recovery point is updated after the backup of - * pg_control. This is harmless for current uses. I haven't gave a notice on this but this seems to be necessary to edit so as to mention this fix and the gap like the following. + * minRecoveryPoint can go behind the last checkpoint's redo location when + * the checkpoint writes out no buffer. This does no harm to performing a + * recovery but such inversion seems inconsistent from the view of the + * callers and prevents them from knowing WAL segments needed by sane + * calcuation. For the reason we return the last replayed point as the + * backup end location. Note that it can be greater than the exact backup + * end location or even the minimum recovery point of pg_control at the + * time. This is harmless for current uses. ==== About the second patch. By the way, related to the following discussion in the upthread, At Tue, 19 Jul 2016 14:13:36 +0900, Michael Paquier <michael.paquier@gmail.com> wrote in <CAB7nPqRXMkR2rq7_Wa1aZZqBFVRUwU_5QTBr_kEQqWGUEjAaAQ@mail.gmail.com> > The thing that is really annoying btw is that there will be always a > gap between minRecoveryPoint and the actual moment where a backup > finishes because there is no way to rely on the XLOG_BACKUP_END > record. On top of that we can not be sure if pg_control has been > backed up last or not. Which is why it would be cool to document that > gap. Even with out my previous patch, the gap between minRecoveryPoint *when pg_control is backed up* and that (or the replay end) at the end of backup is crucial and should be back-patched. Addition to that, while I examined the documentation I found the following description about pg_stop_backup in continuous-archiving.html, which contradicts to its definition. > In the same connection as before, issue the command: > > SELECT * FROM pg_stop_backup(false); ... > The pg_stop_backup will return one row with three values. AFAICS pg_stop_backup returns a single value of LSN. I don't know where this comes from, but the attached patch fixes this and adds a mention on the "gap". The original description mentioned backup_label and tablespace_map but it seems not necessary. The following is the new content rewritten by the patch. | The pg_stop_backup will return the LSN when it is called. This | LSN informs upto where the backup needs WAL segment files to | complete. For a backup taken from a master, this function leaves | a backup history file to inform the segment files needed and a | server starts from the backup performes a recovery up to where a | backup-end WAL record to reach a consistent state. But things are | different for a backup taken from a standby. For the case, backup | history file won't be created and recvoery is perfomed up to the | Minimum-recovery-ending-location field in the pg_control file | included in the backup instead. The location is properly stored | in the backup if you copy the the pg_control file after all the | other files as pg_basebackup does. Note that a hot standby | started from a backup not taken in that manner will consider to | have reached a consistent state mistakenly earlier. | | Once the WAL segment files active during the backup are archived, | you are done. The file contains the LSN returned by | pg_stop_backup is the last segment that is required to form a | complete set of backup files. The "LSN" in the first line is intentionally ambiguated. It may seems too much to write here. > Another crazy idea would be to return pg_control as an extra > return field of pg_stop_backup() and encourage users to write that > back in the backup itself. This would allow closing any hole in the > current logic for backups taken from live standbys: minRecoveryPoint > would be updated directly to the last replayed LSN/TLI in the control > file. It seems to be doable at do_pg_*_backup level. non-exclusive backup code does the same thing to backup_label but no usage seen. regards, -- Kyotaro Horiguchi NTT Open Source Software Center
В списке pgsql-hackers по дате отправления:
Предыдущее
От: Amit KapilaДата:
Сообщение: Re: PoC: Make it possible to disallow WHERE-less UPDATE and DELETE