Re: [PoC] pg_upgrade: allow to upgrade publisher node
From | Julien Rouhaud |
---|---|
Subject | Re: [PoC] pg_upgrade: allow to upgrade publisher node |
Date | |
Msg-id | 20230407152944.j3rek4zyrzggcij7@jrouhaud |
In reply to | RE: [PoC] pg_upgrade: allow to upgrade publisher node ("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>) |
Responses | RE: [PoC] pg_upgrade: allow to upgrade publisher node |
List | pgsql-hackers |
On Fri, Apr 07, 2023 at 09:40:14AM +0000, Hayato Kuroda (Fujitsu) wrote:
>
> > As I mentioned in my original thread, I'm not very familiar with that code, but
> > I'm a bit worried about "all the changes generated on publisher must be send
> > and applied". Is that a hard requirement for the feature to work reliably?
>
> I think the requirement is needed because the existing WALs on old node cannot be
> transported on new instance. The WAL hole from confirmed_flush to current position
> could not be filled by newer instance.

I see, that was also the first blocker I could think of when Amit mentioned
that feature weeks ago, and I don't see how that hole could be filled either.

> > If
> > yes, how does this work if some subscriber node isn't connected when the
> > publisher node is stopped? I guess you could add a check in pg_upgrade to make
> > sure that all logical slot are indeed caught up and fail if that's not the case
> > rather than assuming that a clean shutdown implies it. It would be good to
> > cover that in the TAP test, and also cover some corner cases, like any new row
> > added on the publisher node after the pg_upgrade but before the subscriber is
> > reconnected is also replicated as expected.
>
> Hmm, good point. Current patch could not be handled the case because walsenders
> for the such slots do not exist. I have tested your approach, however, I found that
> CHECKPOINT_SHUTDOWN record were generated twice when publisher was
> shutted down and started. It led that the confirmed_lsn of slots always was behind
> from WAL insert location and failed to upgrade every time.
> Now I do not have good idea to solve it... Do anyone have for this?

I'm wondering if we could just check that each slot's LSN is exactly
sizeof(CHECKPOINT_SHUTDOWN) behind the current WAL position, or something like
that? That's hackish, but if pg_upgrade can run it means there was a clean
shutdown, so it should be safe to assume that the shutdown checkpoint is the
last record in the WAL.

For the double shutdown checkpoint, I'm not sure that I get the problem. The check should only be done at the very beginning of pg_upgrade, so there should have been only one shutdown checkpoint done, right?
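
To make the slot check suggested above a bit more concrete, here is a minimal sketch in Python rather than pg_upgrade's C. It is only an illustration, not proposed patch code. The function names and the sample LSNs are made up, and the size of the shutdown checkpoint record is passed in by the caller as an assumption instead of being taken from the PostgreSQL headers. The sketch also ignores WAL page headers, so it would give a wrong answer if the record crossed a page boundary.

```python
# Sketch only: all names here are hypothetical, and the record size is an
# input supplied by the caller, not a value read from PostgreSQL.

def parse_lsn(text):
    """Convert an LSN in the 'hi/lo' hexadecimal text form to an integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def slots_not_caught_up(slots, wal_end, shutdown_record_size):
    """Return the names of slots that are not exactly one shutdown
    checkpoint record behind the end of WAL.

    slots: dict mapping slot name -> confirmed_flush_lsn (text form)
    wal_end: end of WAL after the clean shutdown (text form)
    shutdown_record_size: assumed size in bytes of the shutdown
        checkpoint record (hypothetical input)
    """
    expected = parse_lsn(wal_end) - shutdown_record_size
    return [name for name, lsn in sorted(slots.items())
            if parse_lsn(lsn) != expected]


# Made-up example: sub_a stopped right before the shutdown checkpoint,
# sub_b is further behind and should block the upgrade.
slots = {"sub_a": "0/3000028", "sub_b": "0/2FFFF00"}
print(slots_not_caught_up(slots, "0/3000098", 0x70))  # prints ['sub_b']
```

An exact-distance comparison like this is deliberately strict: any slot that is behind by more than the one expected record is reported, which matches the idea of failing the upgrade rather than assuming a clean shutdown implies that every slot is caught up.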