Re: BUG #13368: standby cluster immediately promotes after pg_basebackup from previously promoted master
From | Michael Paquier |
---|---|
Subject | Re: BUG #13368: standby cluster immediately promotes after pg_basebackup from previously promoted master |
Date | |
Msg-id | CAB7nPqRFfP_sVDKWfSkg9rywFZs+Kq6D2hRVK2iOWaZn8FYjTw@mail.gmail.com |
In response to | Re: BUG #13368: standby cluster immediately promotes after pg_basebackup from previously promoted master (Fujii Masao <masao.fujii@gmail.com>) |
Responses | Re: BUG #13368: standby cluster immediately promotes after pg_basebackup from previously promoted master (Fujii Masao <masao.fujii@gmail.com>) |
List | pgsql-bugs |
On Fri, Jun 5, 2015 at 11:06 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Fri, Jun 5, 2015 at 3:01 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>>
>> On Wed, Jun 3, 2015 at 1:04 AM, Fujii Masao wrote:
>>>
>>> On Mon, Jun 1, 2015 at 5:19 PM, Michael Paquier
>>> >> Some testing shows us that in some cases, when pg_ctl promote is
>>> >> called multiple times, a promote file is left in the PGDATA
>>> >> directory, even though the cluster has been successfully promoted
>>> >> and is accepting read/write queries.
>>> >
>>> > This is not surprising, pg_ctl bases its analysis that a node needs to
>>> > be promoted if recovery.conf exists or not, and there is an interval
>>> > of time between which recovery.conf is removed and the promotion is
>>> > actually triggered, so you can create a promote file even after
>>> > sending SIGUSR1 to the standby's postmaster.
>>> >
>>> >> We will try to work around this issue by ensuring we do not send
>>> >> multiple promote requests using pg_ctl to the same cluster.
>>> >
>>> > Well, we could for example have the server switch promote to
>>> > promote_done in CheckForStandbyTrigger() and then unlink it when
>>> > recovery.conf is switched to .done. Opinions are welcome on the
>>> > matter.
>>>
>>> Or we can just always remove the signal file at the end of recovery.
>>> That filename switch seems unnecessary.
>>
>> Well, by doing so, in the event of a crash during recovery the promote
>> signal file would be present in PGDATA, and this would enforce a promotion
>> at the next startup of the node. I don't think that this is a good idea. In
>> the case of a promoted node crash a user may want to look at his node back
>> in a recovery state.
>
> You meant the case of a crash which occurs before CheckForStandbyTrigger()
> removes the signal file after pg_ctl promote is executed? If yes, even if
> we rename the file to the intermediate one, the signal file would remain.
>
> If we want to address the above corner case, we can additionally remove
> the file always at the beginning of recovery. This idea can completely avoid
> an unexpected promotion by the surviving signal file.

Then what about the case where a promote file is left by the user on purpose
to trigger a promotion on restart?

>> Also, this intermediate promote file, let's say promote.detected, would be
>> useful for external tools to let them know that the promotion has been
>> acknowledged (you can already know it if your tool knows that a promote has
>> been triggered, that promote has been removed by the server and if
>> recovery.conf is still present). That's not something you would want on back
>> branches btw as this changes how promotion behaves seen from an external
>> point of view. But that would be a patch simple enough (got a WIP for people
>> wondering).
>>
>> An open question would be what to do with pg_ctl promote if a promote file
>> already exists. I think that we should ignore the creation of the promote
>> file but still kick the signal SIGUSR1.
>>
>>>
>>> In addition to that change, we should make pg_basebackup skip
>>> the signal file?
>>
>> Well, yes, and we would be just fine for the case reported by Feike to
>> just ignore promote and fallback_promote in a base backup, as the problem
>> reported was about a standby that contained the signal promote file after
>> pg_basebackup. And I think that we would be fine by doing that as well in
>> the back-branches. trigger_file is not exposed out of xlog.c in the startup
>> process, but I can live with the fact that it is not ignored.
>>
>> In short, I guess that the patch attached would be fine.
>> Opinions?
>
> I have no strong objection to that change, but it seems half-baked.
> That is, that idea doesn't address the case where a base backup is
> taken by other than pg_basebackup at all.

That's the same problem as with, for example, postmaster.pid, postmaster.opts
or similar when taking a FS-level backup.
-- Michael
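
For readers taking a filesystem-level copy with their own tooling, the point about promote, fallback_promote, postmaster.pid and postmaster.opts can be illustrated with a minimal Python sketch. It is not the patch discussed in this thread and not how pg_basebackup is implemented. The names `SKIPPED_FILES` and `copy_data_directory` are hypothetical, the skip list only contains the files named in this message, and the other steps a consistent base backup needs are left out.

```python
import shutil
from pathlib import Path

# Assumption: an illustrative skip list built only from the files named in
# this thread; a real backup tool may need to exclude more than this.
SKIPPED_FILES = {"promote", "fallback_promote", "postmaster.pid", "postmaster.opts"}


def _ignore_top_level(pgdata: Path):
    """Build a copytree 'ignore' callback that filters only the top level of pgdata."""
    root = pgdata.resolve()

    def ignore(directory, names):
        # Files with the same names deeper in the tree are left alone.
        if Path(directory).resolve() != root:
            return []
        return [name for name in names if name in SKIPPED_FILES]

    return ignore


def copy_data_directory(pgdata, dest):
    """Copy pgdata to dest (which must not exist yet), skipping SKIPPED_FILES."""
    shutil.copytree(pgdata, dest, ignore=_ignore_top_level(Path(pgdata)))
```

With such a filter, a leftover promote file in the source data directory would not be carried into the copy, so a standby built from it would not see a stale promotion request at startup.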