Обсуждение: pg_createsubscriber --dry-run logging concerns
Lately, I've been reviewing some pg_createsubscriber patches and have been tricked by some of the logging. The pg_createsubscriber has a '--dry-run' option/mode [1] ---------- --dry-run Do everything except actually modifying the target directory. ---------- I've noticed that the logging in '--dry-run' mode is indistinguishable from the logging of "normal" run, although functions like create_publication(), drop_publication(), etc, are NOP in '--dry-run' mode, because the actual command execution code is skipped. e.g. if (!dry_run) { res = PQexec(conn, str->data); ... } ~~~ IMO, it's not good to fool people into thinking something has happened when in fact nothing happened at all. I think the logging of this tool should be much more explicit wrt when it is/isn't in dry-run mode. Perhaps like this: NORMAL pg_log_info("creating publication \"%s\" in database \"%s\"", ...) DRY-RUN pg_log_info("[dry-run] would create publication \"%s\" in database \"%s\"", ...) ~~~ Thoughts? ====== [1] https://www.postgresql.org/docs/current/app-pgcreatesubscriber.html Kind Regards, Peter Smith. Fujitsu Australia
Dear Peter, > IMO, it's not good to fool people into thinking something has happened > when in fact nothing happened at all. I think the logging of this tool > should be much more explicit wrt when it is/isn't in dry-run mode. > Perhaps like this: > > NORMAL > pg_log_info("creating publication \"%s\" in database \"%s\"", ...) > > DRY-RUN > pg_log_info("[dry-run] would create publication \"%s\" in database \"%s\"", ...) Per my understanding, almost all the output must be adjusted based on the mode, right? I feel it introduces a burden. Can we solve the issue if we print additional message at the beginning if the command runs with dry-run mode? Best regards, Hayato Kuroda FUJITSU LIMITED
On Wed, Oct 1, 2025 at 1:09 PM Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com> wrote: > > Dear Peter, > > > IMO, it's not good to fool people into thinking something has happened > > when in fact nothing happened at all. I think the logging of this tool > > should be much more explicit wrt when it is/isn't in dry-run mode. > > Perhaps like this: > > > > NORMAL > > pg_log_info("creating publication \"%s\" in database \"%s\"", ...) > > > > DRY-RUN > > pg_log_info("[dry-run] would create publication \"%s\" in database \"%s\"", ...) > > Per my understanding, almost all the output must be adjusted based on the mode, right? > I feel it introduces a burden. > Can we solve the issue if we print additional message at the beginning if the > command runs with dry-run mode? > Hi Kuroda-san, Yes, that is one way. Something is better than nothing, at least... I think that not *everything* in dry mode is fake; some of the logged operations are real. So, it might be good if we can show fake ones differently. e.g. It may not take much effort to introduce a wrapper that inserts a prefix. Use this as a replacement for just a few specific info logs. (code below may not work; it's just for illustrative purposes) #define pg_log_info_checkdry(...) do {\ if (dry_run)\ pg_log_generic(PG_LOG_INFO, PG_LOG_PRIMARY, "[dry-run NOP]" __VA_ARGS__);\ else;\ pg_log_generic(PG_LOG_INFO, PG_LOG_PRIMARY, __VA_ARGS__);\ } while (0); ====== Kind Regards, Peter Smith. Fujitsu Australia
On 2025-Oct-01, Peter Smith wrote: > (code below may not work; it's just for illustrative purposes) > > #define pg_log_info_checkdry(...) do {\ > if (dry_run)\ > pg_log_generic(PG_LOG_INFO, PG_LOG_PRIMARY, "[dry-run NOP]" __VA_ARGS__);\ > else;\ > pg_log_generic(PG_LOG_INFO, PG_LOG_PRIMARY, __VA_ARGS__);\ > } while (0); I like this kind of idea best. However I think it might be better to do it the other way around: have the normal pg_log_info() check dry_run, and have a special one for the messages that are to be identical in either mode. I'm not sure how difficult this is to implement, though. pg_subscriber is not the only program with a dry-run mode; it looks like pg_archiveclean, pg_combinebackup, pg_resetwal, pg_rewind have one. Is it worth maybe doing something at the common/logging.c level rather than specifically pg_createsubscriber? -- Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/ […] indem ich in meinem Leben oft an euch gedacht, euch glücklich zu machen. Seyd es! A menudo he pensado en vosotros, en haceros felices. ¡Sedlo, pues! Heiligenstädter Testament, L. v. Beethoven, 1802 https://de.wikisource.org/wiki/Heiligenstädter_Testament
On Wed, Oct 1, 2025 at 8:37 PM Álvaro Herrera <alvherre@kurilemu.de> wrote: > > On 2025-Oct-01, Peter Smith wrote: > > > (code below may not work; it's just for illustrative purposes) > > > > #define pg_log_info_checkdry(...) do {\ > > if (dry_run)\ > > pg_log_generic(PG_LOG_INFO, PG_LOG_PRIMARY, "[dry-run NOP]" __VA_ARGS__);\ > > else;\ > > pg_log_generic(PG_LOG_INFO, PG_LOG_PRIMARY, __VA_ARGS__);\ > > } while (0); > > I like this kind of idea best. However I think it might be better to do > it the other way around: have the normal pg_log_info() check dry_run, > and have a special one for the messages that are to be identical in > either mode. I'm not sure how difficult this is to implement, though. > > pg_subscriber is not the only program with a dry-run mode; it looks like > pg_archiveclean, pg_combinebackup, pg_resetwal, pg_rewind have one. Is > it worth maybe doing something at the common/logging.c level rather than > specifically pg_createsubscriber? > Hi Alvaro. Thanks for the feedback and for pointing out the other tools that also have a dr-run mode. I did some quick back-of-the-envelope calculations to see what might be involved. ====== For pg_create_subscriber, there are 38 info logs. Of those 38, I'd be tempted to change only logs that say they are modifying/executing something. There's ~15 of those: pg_log_info("modifying system identifier of subscriber"); pg_log_info("running pg_resetwal on the subscriber"); pg_log_info("subscriber successfully changed the system identifier"); pg_log_info("use existing publication \"%s\" in database \"%s\"", pg_log_info("create publication \"%s\" in database \"%s\"", pg_log_info("create replication slot \"%s\" on publisher", pg_log_info("dropping subscription \"%s\" in database \"%s\"", pg_log_info("creating the replication slot \"%s\" in database \"%s\"", pg_log_info("dropping the replication slot \"%s\" in database \"%s\"", pg_log_info("creating publication \"%s\" in database \"%s\"", pg_log_info("dropping publication \"%s\" in database \"%s\"", pg_log_info("dropping all existing publications in database \"%s\"", pg_log_info("creating subscription \"%s\" in database \"%s\"", pg_log_info("setting the replication progress (node name \"%s\", LSN %s) in database \"%s\"", pg_log_info("enabling subscription \"%s\" in database \"%s\"", ====== For pg_archiveclean: There only seems to be one logging for this dry run. Currently, it is using debug logging like: if (dry_run) pg_log_debug("would do XXX") else pg_log_debug("doing XXX"); ====== For pg_combinebackup: This is the same as above; the differences for dry run logging are already handled by having separate messages, and it uses debug logging in many places with this pattern: if (dry_run) pg_log_debug("would do XXX") else pg_log_debug("doing XXX"); ====== For pg_resetwal: This one is using a 'noupdate' flag instead of a 'dry_run' boolean. And there doesn't seem to be any logging for this. ====== For pg_rewind: Has 13 logs. I'd be tempted to change maybe only a couple of these: pg_rewind/pg_rewind.c: pg_log_info("creating backup label and updating control file"); pg_rewind/pg_rewind.c: pg_log_info("executing \"%s\" for target server to complete crash recovery", ====== I also found dry-run logs using the "would do XXX" pattern elsewhere, in code like contrib/vacuumIo.c ////// Summary The idea to change the pg_log_info macro globally seems risky. There are 400+ usages of this in the PG code, way beyond the scope of these few tools that have a dry-run. It seems that all the other --dry-run capable tools are using the pattern if (dry_run) pg_log_debug("would do XXX") else pg_log_debug("doing XXX"); So, that makes pg_createsubscriber the odd man out. Instead of introducing a new logging macro, perhaps it's better (for code consistency) just to change pg_createsubscriber to use that same logging pattern. ~~~ Kurdoa-san [1] was concerned it might be a big change burden to change the pg_createsubscriber logs, but AFAICT there are only ~15/38 places that I'd be tempted to change in pg_createsubscriber.c; that's not really such a burden. OTOH, I do think Kuroda-san's idea [1] of giving some up-front logging to indicate --dry-run might be useful. Even when run-time gives different log messages, I think those other tools only show differences when using DEBUG logging. Anybody not debugging might otherwise have no idea that nothing actually happened. ~~ So, below is now my favoured solution: 1. Add an up-front info log to say when running in dry-run (add for all tools that have --dry-run mode) 2. Modify ~15 places in pg_createsubscriber.c to use the code pattern consistent with all the other tools. if (dry_run) pg_log_info("would do XXX") else pg_log_info("doing XXX"); Thoughts? ====== [1] https://www.postgresql.org/message-id/OSCPR01MB149668DD062C1457FFAA44A28F5E6A%40OSCPR01MB14966.jpnprd01.prod.outlook.com Kind Regards, Peter Smith. Fujitsu Australia
Hello Peter,
Your suggestion makes sense to me, if it's fine with you then I can submit a patch for this change :).
[
So, below is now my favoured solution:
1. Add an up-front info log to say when running in dry-run (add for
all tools that have --dry-run mode)
2. Modify ~15 places in pg_createsubscriber.c to use the code pattern
consistent with all the other tools.
if (dry_run)
pg_log_info("would do XXX")
else
pg_log_info("doing XXX");
1. Add an up-front info log to say when running in dry-run (add for
all tools that have --dry-run mode)
2. Modify ~15 places in pg_createsubscriber.c to use the code pattern
consistent with all the other tools.
if (dry_run)
pg_log_info("would do XXX")
else
pg_log_info("doing XXX");
]
Thanks,
Neeraj Shah
Nutanix Inc.
Nutanix Inc.
On Fri, Oct 3, 2025 at 5:34 AM Peter Smith <smithpb2250@gmail.com> wrote:
On Wed, Oct 1, 2025 at 8:37 PM Álvaro Herrera <alvherre@kurilemu.de> wrote:
>
> On 2025-Oct-01, Peter Smith wrote:
>
> > (code below may not work; it's just for illustrative purposes)
> >
> > #define pg_log_info_checkdry(...) do {\
> > if (dry_run)\
> > pg_log_generic(PG_LOG_INFO, PG_LOG_PRIMARY, "[dry-run NOP]" __VA_ARGS__);\
> > else;\
> > pg_log_generic(PG_LOG_INFO, PG_LOG_PRIMARY, __VA_ARGS__);\
> > } while (0);
>
> I like this kind of idea best. However I think it might be better to do
> it the other way around: have the normal pg_log_info() check dry_run,
> and have a special one for the messages that are to be identical in
> either mode. I'm not sure how difficult this is to implement, though.
>
> pg_subscriber is not the only program with a dry-run mode; it looks like
> pg_archiveclean, pg_combinebackup, pg_resetwal, pg_rewind have one. Is
> it worth maybe doing something at the common/logging.c level rather than
> specifically pg_createsubscriber?
>
Hi Alvaro.
Thanks for the feedback and for pointing out the other tools that also
have a dr-run mode.
I did some quick back-of-the-envelope calculations to see what might
be involved.
======
For pg_create_subscriber, there are 38 info logs. Of those 38, I'd be
tempted to change only logs that say they are modifying/executing
something. There's ~15 of those:
pg_log_info("modifying system identifier of subscriber");
pg_log_info("running pg_resetwal on the subscriber");
pg_log_info("subscriber successfully changed the system identifier");
pg_log_info("use existing publication \"%s\" in database \"%s\"",
pg_log_info("create publication \"%s\" in database \"%s\"",
pg_log_info("create replication slot \"%s\" on publisher",
pg_log_info("dropping subscription \"%s\" in database \"%s\"",
pg_log_info("creating the replication slot \"%s\" in database \"%s\"",
pg_log_info("dropping the replication slot \"%s\" in database \"%s\"",
pg_log_info("creating publication \"%s\" in database \"%s\"",
pg_log_info("dropping publication \"%s\" in database \"%s\"",
pg_log_info("dropping all existing publications in database \"%s\"",
pg_log_info("creating subscription \"%s\" in database \"%s\"",
pg_log_info("setting the replication progress (node name \"%s\", LSN
%s) in database \"%s\"",
pg_log_info("enabling subscription \"%s\" in database \"%s\"",
======
For pg_archiveclean:
There only seems to be one logging for this dry run. Currently, it is
using debug logging like:
if (dry_run)
pg_log_debug("would do XXX")
else
pg_log_debug("doing XXX");
======
For pg_combinebackup:
This is the same as above; the differences for dry run logging are
already handled by having separate messages, and it uses debug logging
in many places with this pattern:
if (dry_run)
pg_log_debug("would do XXX")
else
pg_log_debug("doing XXX");
======
For pg_resetwal:
This one is using a 'noupdate' flag instead of a 'dry_run' boolean.
And there doesn't seem to be any logging for this.
======
For pg_rewind:
Has 13 logs. I'd be tempted to change maybe only a couple of these:
pg_rewind/pg_rewind.c: pg_log_info("creating backup label and updating
control file");
pg_rewind/pg_rewind.c: pg_log_info("executing \"%s\" for target server
to complete crash recovery",
======
I also found dry-run logs using the "would do XXX" pattern elsewhere,
in code like contrib/vacuumIo.c
//////
Summary
The idea to change the pg_log_info macro globally seems risky. There
are 400+ usages of this in the PG code, way beyond the scope of these
few tools that have a dry-run.
It seems that all the other --dry-run capable tools are using the pattern
if (dry_run)
pg_log_debug("would do XXX")
else
pg_log_debug("doing XXX");
So, that makes pg_createsubscriber the odd man out. Instead of
introducing a new logging macro, perhaps it's better (for code
consistency) just to change pg_createsubscriber to use that same
logging pattern.
~~~
Kurdoa-san [1] was concerned it might be a big change burden to change
the pg_createsubscriber logs, but AFAICT there are only ~15/38 places
that I'd be tempted to change in pg_createsubscriber.c; that's not
really such a burden.
OTOH, I do think Kuroda-san's idea [1] of giving some up-front logging
to indicate --dry-run might be useful. Even when run-time gives
different log messages, I think those other tools only show
differences when using DEBUG logging. Anybody not debugging might
otherwise have no idea that nothing actually happened.
~~
So, below is now my favoured solution:
1. Add an up-front info log to say when running in dry-run (add for
all tools that have --dry-run mode)
2. Modify ~15 places in pg_createsubscriber.c to use the code pattern
consistent with all the other tools.
if (dry_run)
pg_log_info("would do XXX")
else
pg_log_info("doing XXX");
Thoughts?
======
[1] https://www.postgresql.org/message-id/OSCPR01MB149668DD062C1457FFAA44A28F5E6A%40OSCPR01MB14966.jpnprd01.prod.outlook.com
Kind Regards,
Peter Smith.
Fujitsu Australia
On 2025-Oct-03, Peter Smith wrote: > Summary > > The idea to change the pg_log_info macro globally seems risky. There > are 400+ usages of this in the PG code, way beyond the scope of these > few tools that have a dry-run. Ok. > So, that makes pg_createsubscriber the odd man out. Instead of > introducing a new logging macro, perhaps it's better (for code > consistency) just to change pg_createsubscriber to use that same > logging pattern. Sure, let's go that way. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/