Обсуждение: Re: pgsql: libpq: Grease the protocol by default
[ moving discussion to -hackers ]
Jacob Champion <jacob.champion@enterprisedb.com> writes:
> crake is angry:
>> pg_dumpall: error: connection to server on socket "/home/andrew/bf/root/tmp/buildfarm-onigY7/.s.PGSQL.5670" failed:
FATAL:unsupported frontend protocol 3.9999: server supports 1.0 to 3.0
>> This indicates a bug in either the server being contacted
>> or a proxy handling the connection. Please consider
>> ...
> So while this is kind of a nice result, in that it's correctly
> diagnosing the problem, I need to add max_protocol_version=3.0 to the
> connection strings for these upgrades during beta.
Either that or we decide that it's time to throw 9.2 support
overboard (looks like 9.3 and up are fine).
If it's not hard to add the connection option to this test, let's do
that --- but if it is a problem, I wouldn't shed too many tears over
moving the oldest-old-version goalpost. I think we might not be able
to fix the test without changing buildfarm client script and/or
configuration.
regards, tom lane
On Mon, Feb 23, 2026 at 2:18 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Either that or we decide that it's time to throw 9.2 support > overboard (looks like 9.3 and up are fine). Well, while I was hacking on a patch I realized that 9.3 (all the way up to 10) is only okay if you're running a sufficiently patched version. PG11 is the first to support negotiation for the whole release line. > If it's not hard to add the connection option to this test, let's do > that --- but if it is a problem, I wouldn't shed too many tears over > moving the oldest-old-version goalpost. I think we might not be able > to fix the test without changing buildfarm client script and/or > configuration. I think Jelte said it well back on the -committers thread: > Why only force max_protocol_version=3.0 for beta? It sounds like > this would also be an issue once we eventually bump the default > version. So a fix belongs in pg_upgrade, IMO, instead of the test. I have a draft passing locally that I should be able to share soon. Thanks, --Jacob
Jacob Champion <jacob.champion@enterprisedb.com> writes:
> On Mon, Feb 23, 2026 at 2:18 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Either that or we decide that it's time to throw 9.2 support
>> overboard (looks like 9.3 and up are fine).
> Well, while I was hacking on a patch I realized that 9.3 (all the way
> up to 10) is only okay if you're running a sufficiently patched
> version. PG11 is the first to support negotiation for the whole
> release line.
Hmm ... and of course the whole point of this exercise is to be sure
we can pg_upgrade from those out-of-support versions.
> I think Jelte said it well back on the -committers thread:
>> Why only force max_protocol_version=3.0 for beta? It sounds like
>> this would also be an issue once we eventually bump the default
>> version.
> So a fix belongs in pg_upgrade, IMO, instead of the test. I have a
> draft passing locally that I should be able to share soon.
Fair enough.
regards, tom lane
On Mon, Feb 23, 2026 at 3:34 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > So a fix belongs in pg_upgrade, IMO, instead of the test. I have a > > draft passing locally that I should be able to share soon. > > Fair enough. Something like the attached (tested only against 9.2 so far)? I would plan to backpatch after feature freeze is lifted. --Jacob
Вложения
Jacob Champion <jacob.champion@enterprisedb.com> writes:
> Something like the attached (tested only against 9.2 so far)? I would
> plan to backpatch after feature freeze is lifted.
This does not fix it for me on a local buildfarm instance. I still get
the same failure during 9.2-to-HEAD cross-version upgrade:
==~_~===-=-===~_~== /home/buildfarm/bf-data/fs-upgrade.tester/HEAD/REL9_2_STABLE-dump1.log ==~_~===-=-===~_~==
pg_dumpall: error: connection to server on socket "/home/buildfarm/bf-data/tmp/buildfarm-0fIcMg/.s.PGSQL.5700" failed:
FATAL: unsupported frontend protocol 3.9999: server supports 1.0 to 3.0
This indicates a bug in either the server being contacted
or a proxy handling the connection. Please consider
reporting this to the maintainers of that software.
For more information, including instructions on how to
work around this issue for now, visit
https://www.postgresql.org/docs/devel/libpq-connect.html#LIBPQ-CONNECT-MAX-PROTOCOL-VERSION
It's not obvious to me where the problem lies. I can replicate
the failure by trying to use HEAD's psql to connect to a 9.2
server, but adding -d "max_protocol_version=3.0" makes psql happy,
so why not pg_dumpall?
Also: I was initially baffled why you thought this needs
back-patching, but I guess you have one eye on packagers like
Debian who think they can make older versions use newer libpq.so.
It'd be good to spell out that reasoning in the commit message.
regards, tom lane
On Mon, Feb 23, 2026 at 4:45 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Jacob Champion <jacob.champion@enterprisedb.com> writes: > > Something like the attached (tested only against 9.2 so far)? I would > > plan to backpatch after feature freeze is lifted. (er, should have been "release freeze", though we're clearly in no hurry) > Also: I was initially baffled why you thought this needs > back-patching, but I guess you have one eye on packagers like > Debian who think they can make older versions use newer libpq.so. Right. > It'd be good to spell out that reasoning in the commit message. Okay, will do. > It's not obvious to me where the problem lies. I can replicate > the failure by trying to use HEAD's psql to connect to a 9.2 > server, but adding -d "max_protocol_version=3.0" makes psql happy, > so why not pg_dumpall? Hmmm, looks like the -dump1.log output is actually from *before* pg_upgrade actually runs: https://github.com/PGBuildFarm/client-code/blob/28d7e945cc2a27fecdf4cc685782821ca504db5d/PGBuild/Modules/TestUpgradeXversion.pm#L514 So that will still need to be modified in the buildfarm client, independently of my patch. (I reproduced the error with a bare pg_upgrade invocation and didn't think to look closer; sorry for the confusion.) --Jacob
I wrote:
> It's not obvious to me where the problem lies.
Ah: the step that is failing is where TestUpgradeXVersion.pm
is trying to make a comparison dump from the old server:
# use the NEW pg_dumpall so we're comparing apples with apples.
setinstenv($self, "$installdir", $save_env);
system( qq{"$installdir/bin/pg_dumpall" $dump_opts -p $sport -f }
. qq{"$upgrade_loc/origin-$oversion.sql" }
. qq{> "$upgrade_loc/$oversion-dump1.log" 2>&1});
return if $?;
So I was right to suspect that we can't fix this without modifying
the buildfarm client.
regards, tom lane
Jacob Champion <jacob.champion@enterprisedb.com> writes:
> On Mon, Feb 23, 2026 at 4:45 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Also: I was initially baffled why you thought this needs
>> back-patching, but I guess you have one eye on packagers like
>> Debian who think they can make older versions use newer libpq.so.
> Right.
Actually, that is going to be harder than you thought, because libpq
before v18 will spit up on connection option "max_protocol_version".
This patch will not work as-is for back-patching unless we care to
also back-patch the addition of that option, which I'd be inclined
to resist.
Fortunately, we long ago had the foresight to invent PQlibVersion,
so you could make addition of the extra option conditional on
PQlibVersion(conn) >= 180000 in branches before 18.
> Hmmm, looks like the -dump1.log output is actually from *before*
> pg_upgrade actually runs:
Yeah, I came to the same conclusion. I got a clean BF run using
your patch together with the attached patch for the BF client.
(In this patch, I did not worry about scenarios involving old
minor releases. If Andrew is excited about that case he can
extend the version-comparison logic.)
regards, tom lane
--- PGBuild/Modules/TestUpgradeXversion.pm~ 2025-11-25 07:47:25.000000000 -0500
+++ PGBuild/Modules/TestUpgradeXversion.pm 2026-02-23 20:57:31.640149574 -0500
@@ -483,9 +483,17 @@ sub test_upgrade ## no critic (Subrou
$dump_opts .= ' --extra-float-digits=0';
}
+ # with very old servers we must restrict the protocol version.
+ my $maxpversion = "";
+ if ( ($this_branch eq 'HEAD' || $this_branch gt 'REL_18_STABLE')
+ && ($oversion ne 'HEAD' && $oversion le 'REL_9_2_STABLE'))
+ {
+ $maxpversion = '-d max_protocol_version=3.0';
+ }
+
# use the NEW pg_dumpall so we're comparing apples with apples.
setinstenv($self, "$installdir", $save_env);
- system( qq{"$installdir/bin/pg_dumpall" $dump_opts -p $sport -f }
+ system( qq{"$installdir/bin/pg_dumpall" $dump_opts $maxpversion -p $sport -f }
. qq{"$upgrade_loc/origin-$oversion.sql" }
. qq{> "$upgrade_loc/$oversion-dump1.log" 2>&1});
return if $?;
On 2026-02-23 Mo 9:08 PM, Tom Lane wrote:
Jacob Champion <jacob.champion@enterprisedb.com> writes:On Mon, Feb 23, 2026 at 4:45 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:Also: I was initially baffled why you thought this needs back-patching, but I guess you have one eye on packagers like Debian who think they can make older versions use newer libpq.so.Right.Actually, that is going to be harder than you thought, because libpq before v18 will spit up on connection option "max_protocol_version". This patch will not work as-is for back-patching unless we care to also back-patch the addition of that option, which I'd be inclined to resist. Fortunately, we long ago had the foresight to invent PQlibVersion, so you could make addition of the extra option conditional on PQlibVersion(conn) >= 180000 in branches before 18.Hmmm, looks like the -dump1.log output is actually from *before* pg_upgrade actually runs:Yeah, I came to the same conclusion. I got a clean BF run using your patch together with the attached patch for the BF client. (In this patch, I did not worry about scenarios involving old minor releases. If Andrew is excited about that case he can extend the version-comparison logic.)
I am not worried about old minor releases. I am currently testing a patch with similar intent to yours.
cheers
andrew
-- Andrew Dunstan EDB: https://www.enterprisedb.com
On Mon, Feb 23, 2026 at 06:34:42PM -0500, Tom Lane wrote: > Jacob Champion <jacob.champion@enterprisedb.com> writes: >> On Mon, Feb 23, 2026 at 2:18 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Either that or we decide that it's time to throw 9.2 support >>> overboard (looks like 9.3 and up are fine). > >> Well, while I was hacking on a patch I realized that 9.3 (all the way >> up to 10) is only okay if you're running a sufficiently patched >> version. PG11 is the first to support negotiation for the whole >> release line. > > Hmm ... and of course the whole point of this exercise is to be sure > we can pg_upgrade from those out-of-support versions. I discussed this a bit on the hacking Discord last year, but IMHO we're reaching a good point to bump up pg_upgrade's oldest supported major version. I'll probably push to bump it to v10 for the v20 release so that we can remove many of the version-specific hacks we've built up over the years. Not to mention that the cross-version tests (the "export oldinstall" ones described in pg_upgrade's TESTING file, not the buildfarm ones) don't seem to work past v10 or so because they use various options that didn't exist or have since been renamed. Granted, this probably doesn't help the present issue... -- nathan
On Mon, Feb 23, 2026 at 6:08 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Jacob Champion <jacob.champion@enterprisedb.com> writes: > > On Mon, Feb 23, 2026 at 4:45 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> Also: I was initially baffled why you thought this needs > >> back-patching, but I guess you have one eye on packagers like > >> Debian who think they can make older versions use newer libpq.so. > > > Right. > > Actually, that is going to be harder than you thought, because libpq > before v18 will spit up on connection option "max_protocol_version". Ha, right. Luckily the failure is very loud when testing :) > Fortunately, we long ago had the foresight to invent PQlibVersion, > so you could make addition of the extra option conditional on > PQlibVersion(conn) >= 180000 in branches before 18. Attached is a sample backport for REL_14_STABLE, using that strategy. Tested with pg_upgrade 9.2-to-14, when linked against both 14.22 and HEAD versions of libpq. I still need to run a sanity check with the other 9.x lines to make sure I've selected the right cutoffs. > Yeah, I came to the same conclusion. I got a clean BF run using > your patch together with the attached patch for the BF client. Nice, thanks! --Jacob
Вложения
On Tue, Feb 24, 2026 at 8:17 AM Nathan Bossart <nathandbossart@gmail.com> wrote: > Granted, this probably doesn't help the present issue... Yeah, we'd need to go all the way to PG11 to avoid it entirely. But that's okay -- finding and fixing this now means that we don't have to relitigate it when bumping the default version later (or when releasing production-grade grease). If PG10 ages out before we finally decide to do either one, fine. --Jacob
On Tue, Feb 24, 2026 at 9:18 AM Jacob Champion <jacob.champion@enterprisedb.com> wrote: > I still need to run a sanity check with the > other 9.x lines to make sure I've selected the right cutoffs. The cutoffs don't behave the way I thought they would. Yesterday, I was about to complain that cluster.major_version was poorly named -- why call it that if you have to pass it through GET_MAJOR_VERSION() to get at what you want? -- but it does in fact contain _only_ the major version information, because that's all that PG_VERSION tells us. And unfortunately we don't save the result of the version check for the old postgres binary anywhere. So pg_upgrade will use max_protocol_version=3.0 with all servers v10 and below, in practice. There's nothing wrong with that behavior, but I think I should switch to a simple `< 1100` check in the code to avoid misleading people, unless anyone has a better way that won't significantly increase the cost of the backport. (I could potentially follow up with an improvement on HEAD, if the cost-benefit makes sense, but I'm not sure it does.) --Jacob
Jacob Champion <jacob.champion@enterprisedb.com> writes:
> So pg_upgrade will use max_protocol_version=3.0 with all servers v10
> and below, in practice. There's nothing wrong with that behavior, but
> I think I should switch to a simple `< 1100` check in the code to
> avoid misleading people, unless anyone has a better way that won't
> significantly increase the cost of the backport.
Simple is good here. I don't think we'd buy much by distinguishing
minor versions of already-EOL servers. Also, I suspect that
pg_upgrade can't get the minor version except by starting the server
and asking it --- we don't record anything but major version on-disk.
regards, tom lane
On Tue, Feb 24, 2026 at 10:50 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Simple is good here. I don't think we'd buy much by distinguishing > minor versions of already-EOL servers. Also, I suspect that > pg_upgrade can't get the minor version except by starting the server > and asking it --- we don't record anything but major version on-disk. We do store (the major version part of) `pg_ctl --version` at some point, which we could improve upon, but I agree that adding additional complexity here doesn't actually give us any benefit. --Jacob
On Tue, Feb 24, 2026 at 10:58 AM Jacob Champion <jacob.champion@enterprisedb.com> wrote: > We do store (the major version part of) `pg_ctl --version` at some > point, which we could improve upon, but I agree that adding additional > complexity here doesn't actually give us any benefit. Okay, here are the patches I propose. The ones for 18-HEAD check for PG10 and below, and 14-17 additionally check that libpq >= 18, before adding max_protocol_version=3.0. I smoke-tested upgrades from 9.2, 10.1, and 11.0 for all six branches, and then re-ran that test matrix with pg_upgrade linked against a greased libpq. --Jacob
Вложения
- v2-0001-pg_upgrade-Use-max_protocol_version-3.0-for-older.patch
- v2-0001-pg_upgrade-Use-max_protocol_version-3.0-for-older.18.patch
- v2-0001-pg_upgrade-Use-max_protocol_version-3.0-for-older.17.patch
- v2-0001-pg_upgrade-Use-max_protocol_version-3.0-for-older.16.patch
- v2-0001-pg_upgrade-Use-max_protocol_version-3.0-for-older.15.patch
- v2-0001-pg_upgrade-Use-max_protocol_version-3.0-for-older.14.patch
Jacob Champion <jacob.champion@enterprisedb.com> writes:
> Okay, here are the patches I propose. The ones for 18-HEAD check for
> PG10 and below, and 14-17 additionally check that libpq >= 18, before
> adding max_protocol_version=3.0.
These look sane to me. I might suggest a small wording change in
the <= 17 patches:
- * Back-branch-specific complication: for libpq versions prior to PG18,
+ * Back-branch-specific complication: in libpq versions prior to PG18,
but it probably isn't worth the trouble to edit four separate patches.
regards, tom lane
On Tue, Feb 24, 2026 at 12:40 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > These look sane to me. I might suggest a small wording change in > the <= 17 patches: > > - * Back-branch-specific complication: for libpq versions prior to PG18, > + * Back-branch-specific complication: in libpq versions prior to PG18, > > but it probably isn't worth the trouble to edit four separate patches. By the power of sed, it is done. And I see that the new release tags are here, so pushed. Thanks for all the review and testing! --Jacob
On 2026-02-24 Tu 10:55 AM, Andrew Dunstan wrote: > > > On 2026-02-23 Mo 9:08 PM, Tom Lane wrote: >> Jacob Champion<jacob.champion@enterprisedb.com> writes: >>> On Mon, Feb 23, 2026 at 4:45 PM Tom Lane<tgl@sss.pgh.pa.us> wrote: >>>> Also: I was initially baffled why you thought this needs >>>> back-patching, but I guess you have one eye on packagers like >>>> Debian who think they can make older versions use newer libpq.so. >>> Right. >> Actually, that is going to be harder than you thought, because libpq >> before v18 will spit up on connection option "max_protocol_version". >> This patch will not work as-is for back-patching unless we care to >> also back-patch the addition of that option, which I'd be inclined >> to resist. >> >> Fortunately, we long ago had the foresight to invent PQlibVersion, >> so you could make addition of the extra option conditional on >> PQlibVersion(conn) >= 180000 in branches before 18. >> >>> Hmmm, looks like the -dump1.log output is actually from *before* >>> pg_upgrade actually runs: >> Yeah, I came to the same conclusion. I got a clean BF run using >> your patch together with the attached patch for the BF client. >> (In this patch, I did not worry about scenarios involving old >> minor releases. If Andrew is excited about that case he can >> extend the version-comparison logic.) >> >> > > > I am not worried about old minor releases. I am currently testing a > patch with similar intent to yours. > > Here's what worked for me, even before Jacob's patch of 15 minutes or so ago. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Вложения
On Tue, Feb 24, 2026 at 2:49 PM Andrew Dunstan <andrew@dunslane.net> wrote: > Here's what worked for me, even before Jacob's patch of 15 minutes or so > ago. Oh, the envvar is clever. You'll probably want to do that only for the pg_dumpall invocation now that pg_upgrade is patched, though, so we don't cover up regressions. --Jacob
Jacob Champion <jacob.champion@enterprisedb.com> writes:
> On Tue, Feb 24, 2026 at 2:49 PM Andrew Dunstan <andrew@dunslane.net> wrote:
>> Here's what worked for me, even before Jacob's patch of 15 minutes or so
>> ago.
> Oh, the envvar is clever. You'll probably want to do that only for the
> pg_dumpall invocation now that pg_upgrade is patched, though, so we
> don't cover up regressions.
I agree with Jacob on both points.
regards, tom lane
Jacob Champion <jacob.champion@enterprisedb.com> writes:
> Oh, the envvar is clever. You'll probably want to do that only for the
> pg_dumpall invocation now that pg_upgrade is patched, though, so we
> don't cover up regressions.
I can confirm clean x-version tests on all branches with git tip
and this:
--- TestUpgradeXversion.pm.orig 2025-11-25 07:47:25.000000000 -0500
+++ TestUpgradeXversion.pm 2026-02-24 18:50:29.487530840 -0500
@@ -485,10 +485,14 @@ sub test_upgrade ## no critic (Subrou
# use the NEW pg_dumpall so we're comparing apples with apples.
setinstenv($self, "$installdir", $save_env);
+ local $ENV{PGMAXPROTOCOLVERSION} =
+ ($oversion le 'REL9_2_STABLE') ? "3.0" : "latest";
+
system( qq{"$installdir/bin/pg_dumpall" $dump_opts -p $sport -f }
. qq{"$upgrade_loc/origin-$oversion.sql" }
. qq{> "$upgrade_loc/$oversion-dump1.log" 2>&1});
return if $?;
+ delete $ENV{PGMAXPROTOCOLVERSION};
setinstenv($self, "$other_branch/inst", $save_env);
system( qq{"$other_branch/inst/bin/pg_ctl" -D }
which is Andrew's patch but with the envvar dropped as soon
as possible.
regards, tom lane
On 2026-02-24 Tu 8:30 PM, Tom Lane wrote:
> Jacob Champion <jacob.champion@enterprisedb.com> writes:
>> Oh, the envvar is clever. You'll probably want to do that only for the
>> pg_dumpall invocation now that pg_upgrade is patched, though, so we
>> don't cover up regressions.
> I can confirm clean x-version tests on all branches with git tip
> and this:
>
> --- TestUpgradeXversion.pm.orig 2025-11-25 07:47:25.000000000 -0500
> +++ TestUpgradeXversion.pm 2026-02-24 18:50:29.487530840 -0500
> @@ -485,10 +485,14 @@ sub test_upgrade ## no critic (Subrou
>
> # use the NEW pg_dumpall so we're comparing apples with apples.
> setinstenv($self, "$installdir", $save_env);
> + local $ENV{PGMAXPROTOCOLVERSION} =
> + ($oversion le 'REL9_2_STABLE') ? "3.0" : "latest";
> +
> system( qq{"$installdir/bin/pg_dumpall" $dump_opts -p $sport -f }
> . qq{"$upgrade_loc/origin-$oversion.sql" }
> . qq{> "$upgrade_loc/$oversion-dump1.log" 2>&1});
> return if $?;
> + delete $ENV{PGMAXPROTOCOLVERSION};
> setinstenv($self, "$other_branch/inst", $save_env);
>
> system( qq{"$other_branch/inst/bin/pg_ctl" -D }
>
> which is Andrew's patch but with the envvar dropped as soon
> as possible.
>
>
Yep. working for me too.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
On 2026-Feb-24, Jacob Champion wrote: > On Tue, Feb 24, 2026 at 12:40 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > These look sane to me. I might suggest a small wording change in > > the <= 17 patches: > > > > - * Back-branch-specific complication: for libpq versions prior to PG18, > > + * Back-branch-specific complication: in libpq versions prior to PG18, > > > > but it probably isn't worth the trouble to edit four separate patches. > > By the power of sed, it is done. What I normally do for this kind of thing is commit the typo fix in a separate commit, then cherry-pick that to other branches, then use rebase -i to squash it as a fix-up of the previous commit on all branches. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ Y una voz del caos me habló y me dijo "Sonríe y sé feliz, podría ser peor". Y sonreí. Y fui feliz. Y fue peor.