Обсуждение: pg_upgradecluster and synchronous replication
I ran into a problem today during a training session when I demonstrated pg_upgradecluster -v 18 --method=upgrade --link 17 main The problem was that I had configured synchronous streaming replication first, so after pg_upgrade, the synchronous standby was lost, and when pg_upgradecluster ran vacuumdb ... --analyze-in-stages --missing-stats-only processing hung, because the synchronous standby was not available and the transaction could not commit. Yes, that was my mistake, and yes, the problem was easily fixed by interrupting pg_upgradecluster, disabling synchronous_standby_names and running the ANALYZE and v17 cluster removal manually. Yet I wonder if pg_upgradecluster could improve things by disabling synchronous_standby_names when the cluster ist started for the "finish" stage. I have attached a POC patch how this could be done. I didn't test it, and my Perl skills are marginal, but you get the idea. Perhaps this is too much black magic, not sure. But I wanted to share my experience. Yours, Laurenz Albe
Вложения
Re: Laurenz Albe
> vacuumdb ... --analyze-in-stages --missing-stats-only
>
> processing hung, because the synchronous standby was not available and the transaction
> could not commit.
Interesting find, thanks for sharing.
> Yet I wonder if pg_upgradecluster could improve things by disabling synchronous_standby_names
> when the cluster ist started for the "finish" stage. I have attached a POC patch how this
> could be done. I didn't test it, and my Perl skills are marginal, but you get the idea.
That's also the command that does the final cluster start for
production, I wouldn't want such a change to persist.
Maybe we should instead change the analyze hook script to do that
internally? Setting PGOPTIONS should be enough:
PGOPTIONS="-csynchronous_commit=local"
> Perhaps this is too much black magic, not sure. But I wanted to share my experience.
There is already some code in pg_upgradecluster that works around
black magic problems:
# ensure we can upgrade DBs with default read-only transactions
$ENV{PGOPTIONS} .= " -cdefault_transaction_read_only=off";
This would just add one more wart of that kind.
Christoph
On Wed, 2026-01-14 at 20:39 +0100, Christoph Berg wrote:
> > vacuumdb ... --analyze-in-stages --missing-stats-only
> >
> > processing hung, because the synchronous standby was not available and the transaction
> > could not commit.
> >
> > I wonder if pg_upgradecluster could improve things by disabling synchronous_standby_names
> > when the cluster ist started for the "finish" stage. I have attached a POC patch how this
> > could be done. I didn't test it, and my Perl skills are marginal, but you get the idea.
>
> Maybe we should instead change the analyze hook script to do that
> internally? Setting PGOPTIONS should be enough:
>
> PGOPTIONS="-csynchronous_commit=local"
>
> > Perhaps this is too much black magic, not sure. But I wanted to share my experience.
>
> There is already some code in pg_upgradecluster that works around
> black magic problems:
>
> # ensure we can upgrade DBs with default read-only transactions
> $ENV{PGOPTIONS} .= " -cdefault_transaction_read_only=off";
>
> This would just add one more wart of that kind.
That looks like a better way to do it, I agree.
> That's also the command that does the final cluster start for
> production, I wouldn't want such a change to persist.
That makes me wonder. I don't think that it is a great idea to start the cluster for
production with "synchronous_commit" set to a non-standard value. That would mask the
problem initially, but whenever the cluster is restarted, the parameter is back to its
original value, and suddenly things will stop working.
I can imagine two ways to do better:
1. Set "synchronous_commit=local" during the ANALYZE, then restart the cluster without
overriding any parameters. Then the upgrade itself would work, and the problem with
the missing synchronous standby would manifest right when the user starts using the
upgraded cluster. Better than having things start failing after the next restart,
perhaps months later, when it is harder to trace the problem to its original cause!
2. Add a pre-upgrade check that just aborts with a useful error message if
"synchronous_standby_names" is set.
Yours,
Laurenz Albe
On Wed, 2026-01-14 at 20:54 +0100, I wrote:
> On Wed, 2026-01-14 at 20:39 +0100, Christoph Berg wrote:
> >
> > Maybe we should instead change the analyze hook script to do that
> > internally? Setting PGOPTIONS should be enough:
> >
> > PGOPTIONS="-csynchronous_commit=local"
> >
> >
> > There is already some code in pg_upgradecluster that works around
> > black magic problems:
> >
> > # ensure we can upgrade DBs with default read-only transactions
> > $ENV{PGOPTIONS} .= " -cdefault_transaction_read_only=off";
> >
> > This would just add one more wart of that kind.
>
> That looks like a better way to do it, I agree.
>
>
> at makes me wonder. I don't think that it is a great idea to start the cluster for
> production with "synchronous_commit" set to a non-standard value. That would mask the
> problem initially, but whenever the cluster is restarted, the parameter is back to its
> original value, and suddenly things will stop working.
Forget that: I missed that with PGOPTIONS, the setting is only changed for
the database session that performs the ANALYZE, not for the entire server
runtime. Yes, I think that is the proper solution!
Yours,
Laurenz Albe
On Thu, 2026-01-15 at 05:40 +0100, I wrote: > > On Wed, 2026-01-14 at 20:39 +0100, Christoph Berg wrote: > > > > > > Maybe we should instead change the analyze hook script to do that > > > internally? Setting PGOPTIONS should be enough: > > > > > > PGOPTIONS="-csynchronous_commit=local" > > Yes, I think that is the proper solution! How about the attached patch? Yours, Laurenz Albe
Вложения
Re: Laurenz Albe
> How about the attached patch?
Thanks, pushed with a minor update - with set -e, we need to use
${PGOPTIONS:-} instead of $PGOPTIONS.
Christoph
On Mon, 2026-01-19 at 15:38 +0100, Christoph Berg wrote:
> Re: Laurenz Albe
> > How about the attached patch?
>
> Thanks, pushed with a minor update - with set -e, we need to use
> ${PGOPTIONS:-} instead of $PGOPTIONS.
Thank you!
Yours,
Laurenz Albe