Discussion: test: avoid redundant standby catchup in 049_wait_for_lsn
Hi Alexander, Hackers,

While working on adding more edge-case tests and fixing the timeline handling for WAIT FOR LSN, I noticed that the overall runtime of the test had increased by about 7 seconds since a8b61c23c5ff. I looked into the slowdown and found a potential source.

Currently, the test creates the function, waits for the standby to catch up, tests it, then creates the procedure and waits for the standby to catch up again. Since both objects are only used by the same block of top-level statement checks, we can create them together in a single primary-side transaction and perform just one wait_for_catchup() before running both standby-side calls.

This small TAP cleanup merges the creation of the PL/pgSQL wrapper function and procedure used for the top-level WAIT FOR checks in 049_wait_for_lsn.pl. The change preserves the same coverage while removing one redundant replay catch-up on the delayed standby. It appears to reduce the test runtime by about 7 seconds, though I have not yet looked into why so much of the improvement comes from this change alone.

Patch attached.

Thanks.

--
Best,
Xuneng
On Fri, Apr 17, 2026 at 08:25:35PM +0800, Xuneng Zhou wrote:
> The change preserves the same coverage while removing one redundant
> replay catch-up on the delayed standby. It appears to reduce the test
> runtime by about 7 seconds, though I have not yet looked into why so
> much of the improvement comes from this change alone.

Alexander may think differently and remove that, but I disagree. The test is clearly written so that two wait checks happen, one for CREATE FUNCTION and one for CREATE PROCEDURE. Removing the first check to keep only the second one removes its meaning. In short, I see nothing here that needs to be dealt with.

--
Michael
Hi Michael,
> On Fri, Apr 17, 2026 at 08:25:35PM +0800, Xuneng Zhou wrote:
>> The change preserves the same coverage while removing one redundant
>> replay catch-up on the delayed standby. It appears to reduce the test
>> runtime by about 7 seconds, though I have not yet looked into why so
>> much of the improvement comes from this change alone.
>
> Alexander may think differently and remove that, but I disagree. The
> test is clearly written so that two wait checks happen, one for
> CREATE FUNCTION and one for CREATE PROCEDURE. Removing the first
> check to keep only the second one removes its meaning. In short, I
> see nothing here that needs to be dealt with.
Thank you for the review. I agree that the two wait checks serve distinct purposes and are not redundant. The main motivation for this patch was efficiency. In my testing, the new test added approximately 7 seconds to the runtime, while the creation of the procedure and function completed quickly. I suspect the latency stems from the wait-for-catch-up step: when I removed it, the test runtime dropped by about 7 seconds. I haven't yet investigated why the wait is so costly in this case. I should probably look into that before proposing this change.

Best,
Xuneng
Hi, Xuneng.

On Sat, Apr 18, 2026 at 7:20 AM Xuneng Zhou <xunengzhou@gmail.com> wrote:
>> On Fri, Apr 17, 2026 at 08:25:35PM +0800, Xuneng Zhou wrote:
>> > The change preserves the same coverage while removing one redundant
>> > replay catch-up on the delayed standby. It appears to reduce the test
>> > runtime by about 7 seconds, though I have not yet looked into why so
>> > much of the improvement comes from this change alone.
>>
>> Alexander may think differently and remove that, but I disagree. The
>> test is clearly written so that two wait checks happen, one for
>> CREATE FUNCTION and one for CREATE PROCEDURE. Removing the first
>> check to keep only the second one removes its meaning. In short, I
>> see nothing here that needs to be dealt with.
>
> Thank you for the review. I agree that the two wait checks serve
> distinct purposes and are not redundant. The main motivation for this
> patch was efficiency. In my testing, the new test added approximately
> 7 seconds to the runtime, while the creation of the procedure and
> function completed quickly. I suspect the latency stems from the
> wait-for-catch-up step: when I removed it, the test runtime dropped by
> about 7 seconds. I haven't yet investigated why the wait is so costly
> in this case. I should probably look into that before proposing this
> change.

On my laptop the time needed to run t/049_wait_for_lsn.pl also drops from 20 secs to 12 secs. The effect on the runtime of the whole test suite run in parallel would not be that big, as CPU time only drops from 2.16 sec to 2.07 sec. But anyway, that's pretty significant. I've revised the commit message a bit, as well as the surrounding comments. I'm going to push this if there are no objections.

------
Regards,
Alexander Korotkov
Supabase
Hi, Michael!

On Sat, Apr 18, 2026 at 12:47 AM Michael Paquier <michael@paquier.xyz> wrote:
> On Fri, Apr 17, 2026 at 08:25:35PM +0800, Xuneng Zhou wrote:
> > The change preserves the same coverage while removing one redundant
> > replay catch-up on the delayed standby. It appears to reduce the test
> > runtime by about 7 seconds, though I have not yet looked into why so
> > much of the improvement comes from this change alone.
>
> Alexander may think differently and remove that, but I disagree. The
> test is clearly written so that two wait checks happen, one for
> CREATE FUNCTION and one for CREATE PROCEDURE. Removing the first
> check to keep only the second one removes its meaning. In short, I
> see nothing here that needs to be dealt with.

Thank you for your observation. The intention of this test is to check explicit calls to WAIT FOR LSN. Yes, wait_for_catchup() now also calls WAIT FOR LSN internally. But checking wait_for_catchup() is not the intention of this test; it's used in an awful lot of other places.

------
Regards,
Alexander Korotkov
Supabase
On Sat, Apr 18, 2026 at 10:58 AM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> On Sat, Apr 18, 2026 at 7:20 AM Xuneng Zhou <xunengzhou@gmail.com> wrote:
> >> On Fri, Apr 17, 2026 at 08:25:35PM +0800, Xuneng Zhou wrote:
> >> > The change preserves the same coverage while removing one redundant
> >> > replay catch-up on the delayed standby. It appears to reduce the test
> >> > runtime by about 7 seconds, though I have not yet looked into why so
> >> > much of the improvement comes from this change alone.
> >>
> >> Alexander may think differently and remove that, but I disagree. The
> >> test is clearly written so that two wait checks happen, one for
> >> CREATE FUNCTION and one for CREATE PROCEDURE. Removing the first
> >> check to keep only the second one removes its meaning. In short, I
> >> see nothing here that needs to be dealt with.
> >
> > Thank you for the review. I agree that the two wait checks serve
> > distinct purposes and are not redundant. The main motivation for this
> > patch was efficiency. In my testing, the new test added approximately
> > 7 seconds to the runtime, while the creation of the procedure and
> > function completed quickly. I suspect the latency stems from the
> > wait-for-catch-up step: when I removed it, the test runtime dropped by
> > about 7 seconds. I haven't yet investigated why the wait is so costly
> > in this case. I should probably look into that before proposing this
> > change.
>
> On my laptop the time needed to run t/049_wait_for_lsn.pl also drops
> from 20 secs to 12 secs. The effect on the runtime of the whole
> test suite run in parallel would not be that big, as CPU time only
> drops from 2.16 sec to 2.07 sec. But anyway, that's pretty significant.
> I've revised the commit message a bit, as well as the surrounding
> comments. I'm going to push this if there are no objections.

Pushed.

------
Regards,
Alexander Korotkov
Supabase
On Mon, Apr 20, 2026 at 6:21 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> On Sat, Apr 18, 2026 at 10:58 AM Alexander Korotkov
> <aekorotkov@gmail.com> wrote:
> > On Sat, Apr 18, 2026 at 7:20 AM Xuneng Zhou <xunengzhou@gmail.com> wrote:
> > >> On Fri, Apr 17, 2026 at 08:25:35PM +0800, Xuneng Zhou wrote:
> > >> > The change preserves the same coverage while removing one redundant
> > >> > replay catch-up on the delayed standby. It appears to reduce the test
> > >> > runtime by about 7 seconds, though I have not yet looked into why so
> > >> > much of the improvement comes from this change alone.
> > >>
> > >> Alexander may think differently and remove that, but I disagree. The
> > >> test is clearly written so that two wait checks happen, one for
> > >> CREATE FUNCTION and one for CREATE PROCEDURE. Removing the first
> > >> check to keep only the second one removes its meaning. In short, I
> > >> see nothing here that needs to be dealt with.
> > >
> > > Thank you for the review. I agree that the two wait checks serve
> > > distinct purposes and are not redundant. The main motivation for this
> > > patch was efficiency. In my testing, the new test added approximately
> > > 7 seconds to the runtime, while the creation of the procedure and
> > > function completed quickly. I suspect the latency stems from the
> > > wait-for-catch-up step: when I removed it, the test runtime dropped by
> > > about 7 seconds. I haven't yet investigated why the wait is so costly
> > > in this case. I should probably look into that before proposing this
> > > change.
> >
> > On my laptop the time needed to run t/049_wait_for_lsn.pl also drops
> > from 20 secs to 12 secs. The effect on the runtime of the whole
> > test suite run in parallel would not be that big, as CPU time only
> > drops from 2.16 sec to 2.07 sec. But anyway, that's pretty significant.
> > I've revised the commit message a bit, as well as the surrounding
> > comments. I'm going to push this if there are no objections.
>
> Pushed.

Thanks for pushing it. I haven't had time to investigate the latency yet, but I will do so later.

Best,
Xuneng