Обсуждение: walsender.c comment with no context is hard to understand
Hi, I was looking at this code comment and wondered what it meant. AFAICT over time code has been moved around causing comments to lose their original context, so now it is hard to understand what they are saying. ~~~ After a 2017 patch [1] the code in walsender.c function logical_read_xlog_page() looked like this: /* make sure we have enough WAL available */ flushptr = WalSndWaitForWal(targetPagePtr + reqLen); /* fail if not (implies we are going to shut down) */ if (flushptr < targetPagePtr + reqLen) return -1; ~~~ The same code in HEAD now looks like this: /* * Make sure we have enough WAL available before retrieving the current * timeline. This is needed to determine am_cascading_walsender accurately * which is needed to determine the current timeline. */ flushptr = WalSndWaitForWal(targetPagePtr + reqLen); /* * Since logical decoding is also permitted on a standby server, we need * to check if the server is in recovery to decide how to get the current * timeline ID (so that it also cover the promotion or timeline change * cases). */ am_cascading_walsender = RecoveryInProgress(); if (am_cascading_walsender) GetXLogReplayRecPtr(&currTLI); else currTLI = GetWALInsertionTimeLine(); XLogReadDetermineTimeline(state, targetPagePtr, reqLen, currTLI); sendTimeLineIsHistoric = (state->currTLI != currTLI); sendTimeLine = state->currTLI; sendTimeLineValidUpto = state->currTLIValidUntil; sendTimeLineNextTLI = state->nextTLI; /* fail if not (implies we are going to shut down) */ if (flushptr < targetPagePtr + reqLen) return -1; ~~~ Notice how the "fail if not" comment has become distantly separated from the flushptr assignment it was once adjacent to, so that comment hardly makes sense anymore -- e.g. "fail if not" WHAT? Perhaps the comment should say something like it used to: /* Fail if there is not enough WAL available. This can happen during shutdown. */ ====== [1] https://github.com/postgres/postgres/commit/fca85f8ef157d4d58dea1fdc8e1f1f957b74ee78 Kind Regards, Peter Smith. Fujitsu Australia
On Mon, 3 Jun 2024 at 11:21, Peter Smith <smithpb2250@gmail.com> wrote: > > Hi, > > I was looking at this code comment and wondered what it meant. AFAICT > over time code has been moved around causing comments to lose their > original context, so now it is hard to understand what they are > saying. > > ~~~ > > After a 2017 patch [1] the code in walsender.c function > logical_read_xlog_page() looked like this: > > /* make sure we have enough WAL available */ > flushptr = WalSndWaitForWal(targetPagePtr + reqLen); > > /* fail if not (implies we are going to shut down) */ > if (flushptr < targetPagePtr + reqLen) > return -1; > > ~~~ > > The same code in HEAD now looks like this: > > /* > * Make sure we have enough WAL available before retrieving the current > * timeline. This is needed to determine am_cascading_walsender accurately > * which is needed to determine the current timeline. > */ > flushptr = WalSndWaitForWal(targetPagePtr + reqLen); > > /* > * Since logical decoding is also permitted on a standby server, we need > * to check if the server is in recovery to decide how to get the current > * timeline ID (so that it also cover the promotion or timeline change > * cases). > */ > am_cascading_walsender = RecoveryInProgress(); > > if (am_cascading_walsender) > GetXLogReplayRecPtr(&currTLI); > else > currTLI = GetWALInsertionTimeLine(); > > XLogReadDetermineTimeline(state, targetPagePtr, reqLen, currTLI); > sendTimeLineIsHistoric = (state->currTLI != currTLI); > sendTimeLine = state->currTLI; > sendTimeLineValidUpto = state->currTLIValidUntil; > sendTimeLineNextTLI = state->nextTLI; > > /* fail if not (implies we are going to shut down) */ > if (flushptr < targetPagePtr + reqLen) > return -1; > > ~~~ > > Notice how the "fail if not" comment has become distantly separated > from the flushptr assignment it was once adjacent to, so that comment > hardly makes sense anymore -- e.g. "fail if not" WHAT? > > Perhaps the comment should say something like it used to: > /* Fail if there is not enough WAL available. This can happen during > shutdown. */ Agree with this, +1 for this change. Regards, Vignesh
On Wed, Jun 26, 2024 at 02:30:26PM +0530, vignesh C wrote: > On Mon, 3 Jun 2024 at 11:21, Peter Smith <smithpb2250@gmail.com> wrote: >> Perhaps the comment should say something like it used to: >> /* Fail if there is not enough WAL available. This can happen during >> shutdown. */ > > Agree with this, +1 for this change. That would be an improvement. Would you like to send a patch with all the areas you think could stand for improvements? -- Michael
Вложения
On Thu, Jun 27, 2024 at 3:44 PM Michael Paquier <michael@paquier.xyz> wrote: > > On Wed, Jun 26, 2024 at 02:30:26PM +0530, vignesh C wrote: > > On Mon, 3 Jun 2024 at 11:21, Peter Smith <smithpb2250@gmail.com> wrote: > >> Perhaps the comment should say something like it used to: > >> /* Fail if there is not enough WAL available. This can happen during > >> shutdown. */ > > > > Agree with this, +1 for this change. > > That would be an improvement. Would you like to send a patch with all > the areas you think could stand for improvements? > -- OK, I attached a patch equivalent of the suggestion in this thread. ====== Kind Regards, Peter Smith. Fujitsu Australia
Вложения
On Fri, Jun 28, 2024 at 5:15 AM Peter Smith <smithpb2250@gmail.com> wrote: > > On Thu, Jun 27, 2024 at 3:44 PM Michael Paquier <michael@paquier.xyz> wrote: > > > > On Wed, Jun 26, 2024 at 02:30:26PM +0530, vignesh C wrote: > > > On Mon, 3 Jun 2024 at 11:21, Peter Smith <smithpb2250@gmail.com> wrote: > > >> Perhaps the comment should say something like it used to: > > >> /* Fail if there is not enough WAL available. This can happen during > > >> shutdown. */ > > > > > > Agree with this, +1 for this change. > > > > That would be an improvement. Would you like to send a patch with all > > the areas you think could stand for improvements? > > -- > > OK, I attached a patch equivalent of the suggestion in this thread. > Shouldn't the check for flushptr (if (flushptr < targetPagePtr + reqLen)) be moved immediately after the call to WalSndWaitForWal(). The comment seems to suggests the same: "Make sure we have enough WAL available before retrieving the current timeline. .." -- With Regards, Amit Kapila.
On Fri, Jun 28, 2024 at 4:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Jun 28, 2024 at 5:15 AM Peter Smith <smithpb2250@gmail.com> wrote: > > > > On Thu, Jun 27, 2024 at 3:44 PM Michael Paquier <michael@paquier.xyz> wrote: > > > > > > On Wed, Jun 26, 2024 at 02:30:26PM +0530, vignesh C wrote: > > > > On Mon, 3 Jun 2024 at 11:21, Peter Smith <smithpb2250@gmail.com> wrote: > > > >> Perhaps the comment should say something like it used to: > > > >> /* Fail if there is not enough WAL available. This can happen during > > > >> shutdown. */ > > > > > > > > Agree with this, +1 for this change. > > > > > > That would be an improvement. Would you like to send a patch with all > > > the areas you think could stand for improvements? > > > -- > > > > OK, I attached a patch equivalent of the suggestion in this thread. > > > > Shouldn't the check for flushptr (if (flushptr < targetPagePtr + > reqLen)) be moved immediately after the call to WalSndWaitForWal(). > The comment seems to suggests the same: "Make sure we have enough WAL > available before retrieving the current timeline. .." > Yes, as I wrote in the first post, those lines did once used to be adjacent in logical_read_xlog_page. I also wondered if they still belonged together, but I opted for the safest option of fixing only the comment instead of refactoring old code when no problem had been reported. AFAICT these lines became separated due to a 2023 patch [1], and you were one of the reviewers of that patch, so I assumed the code separation was deemed OK at that time. Unless it was some mistake that slipped past multiple reviewers? ====== [1] https://github.com/postgres/postgres/commit/0fdab27ad68a059a1663fa5ce48d76333f1bd74c#diff-da7052ce339ec037f5c721e08a9532b1fcfb4405966cf6a78d0c811550fd7b43 Kind Regards, Peter Smith. Fujitsu Australia
On Fri, Jun 28, 2024 at 12:55 PM Peter Smith <smithpb2250@gmail.com> wrote: > > On Fri, Jun 28, 2024 at 4:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Fri, Jun 28, 2024 at 5:15 AM Peter Smith <smithpb2250@gmail.com> wrote: > > > > > > On Thu, Jun 27, 2024 at 3:44 PM Michael Paquier <michael@paquier.xyz> wrote: > > > > > > > > On Wed, Jun 26, 2024 at 02:30:26PM +0530, vignesh C wrote: > > > > > On Mon, 3 Jun 2024 at 11:21, Peter Smith <smithpb2250@gmail.com> wrote: > > > > >> Perhaps the comment should say something like it used to: > > > > >> /* Fail if there is not enough WAL available. This can happen during > > > > >> shutdown. */ > > > > > > > > > > Agree with this, +1 for this change. > > > > > > > > That would be an improvement. Would you like to send a patch with all > > > > the areas you think could stand for improvements? > > > > -- > > > > > > OK, I attached a patch equivalent of the suggestion in this thread. > > > > > > > Shouldn't the check for flushptr (if (flushptr < targetPagePtr + > > reqLen)) be moved immediately after the call to WalSndWaitForWal(). > > The comment seems to suggests the same: "Make sure we have enough WAL > > available before retrieving the current timeline. .." > > > > Yes, as I wrote in the first post, those lines did once used to be > adjacent in logical_read_xlog_page. > > I also wondered if they still belonged together, but I opted for the > safest option of fixing only the comment instead of refactoring old > code when no problem had been reported. > > AFAICT these lines became separated due to a 2023 patch [1], and you > were one of the reviewers of that patch, so I assumed the code > separation was deemed OK at that time. Unless it was some mistake that > slipped past multiple reviewers? > I don't know whether your assumption is correct. AFAICS, those two lines should be together. Let us ee if Bertrand remembers anything? -- With Regards, Amit Kapila.
Hi, On Fri, Jun 28, 2024 at 03:15:22PM +0530, Amit Kapila wrote: > On Fri, Jun 28, 2024 at 12:55 PM Peter Smith <smithpb2250@gmail.com> wrote: > > > > On Fri, Jun 28, 2024 at 4:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Fri, Jun 28, 2024 at 5:15 AM Peter Smith <smithpb2250@gmail.com> wrote: > > > > > > > > On Thu, Jun 27, 2024 at 3:44 PM Michael Paquier <michael@paquier.xyz> wrote: > > > > > > > > > > On Wed, Jun 26, 2024 at 02:30:26PM +0530, vignesh C wrote: > > > > > > On Mon, 3 Jun 2024 at 11:21, Peter Smith <smithpb2250@gmail.com> wrote: > > > > > >> Perhaps the comment should say something like it used to: > > > > > >> /* Fail if there is not enough WAL available. This can happen during > > > > > >> shutdown. */ > > > > > > > > > > > > Agree with this, +1 for this change. > > > > > > > > > > That would be an improvement. Would you like to send a patch with all > > > > > the areas you think could stand for improvements? > > > > > -- > > > > > > > > OK, I attached a patch equivalent of the suggestion in this thread. > > > > > > > > > > Shouldn't the check for flushptr (if (flushptr < targetPagePtr + > > > reqLen)) be moved immediately after the call to WalSndWaitForWal(). > > > The comment seems to suggests the same: "Make sure we have enough WAL > > > available before retrieving the current timeline. .." > > > > > > > Yes, as I wrote in the first post, those lines did once used to be > > adjacent in logical_read_xlog_page. > > > > I also wondered if they still belonged together, but I opted for the > > safest option of fixing only the comment instead of refactoring old > > code when no problem had been reported. > > > > AFAICT these lines became separated due to a 2023 patch [1], and you > > were one of the reviewers of that patch, so I assumed the code > > separation was deemed OK at that time. Unless it was some mistake that > > slipped past multiple reviewers? > > > > I don't know whether your assumption is correct. AFAICS, those two > lines should be together. Let us ee if Bertrand remembers anything? > IIRC the WalSndWaitForWal() call has been moved to ensure that we can determine the timeline accurately. I agree with Amit that it would make more sense to move the (flushptr < targetPagePtr + reqLen) check just after the flushptr assignement. I don't recall that we discussed any reason of not doing so. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com