Discussion: Index corruption issue after migration from RHEL 7 to RHEL 9 (PostgreSQL 11 streaming replication)
Hi Team,
We are facing an issue related to index corruption after migrating our PostgreSQL 11 setup from RHEL 7 to RHEL 9 using streaming replication (base backup method).
After bringing up the standby on RHEL 9, we observed that certain tables are not returning results when queries use indexed scans. Upon investigation, the following check confirms index corruption:
The same indexes work fine on the RHEL 7 (primary) side. However, on the RHEL 9 replica, queries that rely on this index return zero records. Rebuilding the indexes fixed the issue temporarily, but we have many indexes and our DB size is more than 10 TB.
Environment details:
PostgreSQL Version: 11.15
OS on primary: RHEL 7.9
OS on standby: RHEL 9.6
Replication Type: Streaming replication (initialized using pg_basebackup)
Data Directory: initialized from the RHEL 7 base backup
Issue Summary:
Indexes are present on the standby and match the production sizes after the base backup restore.
We are able to read the data from the tables without index scans on the RHEL 9 standby.
No filesystem or WAL errors observed in logs.
Could this be related to OS-level binary or page layout differences between RHEL 7 and RHEL 9 for PostgreSQL 11 binaries?
Any insights or recommended actions would be greatly appreciated.
Thanks & Regards,
krishna.
[RHEL] "Version 8 uses the new locale data. Therefore, caution will be necessary when upgrading."
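For reference, suspected btree corruption of this kind is usually confirmed with the contrib/amcheck module rather than by comparing query results alone. A minimal sketch, assuming amcheck is available for the installed packages; the index name is a placeholder, and the extension has to be created on the primary (the standby is read-only) so the DDL replicates before the checks are run where the wrong results appear:

    -- On the primary:
    CREATE EXTENSION IF NOT EXISTS amcheck;

    -- On the node returning wrong results (placeholder index name):
    SELECT bt_index_check('public.my_suspect_index'::regclass);        -- errors out if key ordering is inconsistent
    SELECT bt_index_check('public.my_suspect_index'::regclass, true);  -- also verifies every heap tuple is in the index

An index built under the RHEL 7 glibc can fail these checks under the RHEL 9 glibc because the sort order of the underlying locale data changed between those releases.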
On 10/22/25 02:03, Bala M wrote:
> Hi Team,
>
> We are facing an issue related to index corruption after migrating our
> PostgreSQL 11 setup from RHEL 7 to RHEL 9 using streaming replication
> (base backup method).

Postgres 11 is two years past EOL per:
https://www.postgresql.org/support/versioning/

> Thanks & Regards,
> krishna.

--
Adrian Klaver
adrian.klaver@aklaver.com
Re: Index corruption issue after migration from RHEL 7 to RHEL 9 (PostgreSQL 11 streaming replication)
On Wed, Oct 22, 2025 at 6:53 AM Bala M <krishna.pgdba@gmail.com> wrote:
> PostgreSQL Version: 11.15

Ouch! Not only is the major version end of life, but that's not even the latest revision of 11.
On 10/22/25 11:18, Ron Johnson wrote:
> On Wed, Oct 22, 2025 at 1:35 PM Greg Sabino Mullane <htamfids@gmail.com> wrote:
>> On Wed, Oct 22, 2025 at 6:53 AM Bala M <krishna.pgdba@gmail.com> wrote:
>>> PostgreSQL Version: 11.15
>>
>> Ouch! Not only is the major version end of life, but that's not even
>> the latest revision of 11.
>
> Do pre-built PG11 binaries even exist for RHEL9?

I believe you can get them here:
https://yum.postgresql.org/repopackages/#archivedrepos

I don't use RH so I can't verify.

> --
> Death to <Redacted>, and butter sauce.
> Don't boil me, I'm still alive.
> <Redacted> lobster!

--
Adrian Klaver
adrian.klaver@aklaver.com
Re: Index corruption issue after migration from RHEL 7 to RHEL 9 (PostgreSQL 11 streaming replication)
Acceptable downtime: ~1 day
Logical replication: Not feasible due to the number of schemas, tables, and overall data volume
* For a 15 TB database with roughly 1 day downtime, what would be the most reliable approach to migrate from RHEL 7 → RHEL 9 while avoiding collation/index corruption issues?
* Would using pg_upgrade (with --check and --clone options) be safe when moving between OS versions with different glibc libraries?
* If we temporarily remain on PostgreSQL 11, is it mandatory to rebuild all indexes after restoring the base backup on RHEL 9 to ensure data consistency? Would running REINDEX DATABASE across all databases be sufficient?
* Are there any community-tested procedures or best practices for migrating large (15 TB+) environments between RHEL 7 and RHEL 9 with minimal downtime?
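On the pg_upgrade question: --check only validates cluster-level compatibility (binary versions, encodings, database locale names, problematic data types, and so on); it compares locale names, not the glibc behind them, so it cannot detect that RHEL 9's collation data differs from RHEL 7's, and indexes on collatable columns would still need to be rebuilt or verified afterwards. A sketch of a dry run, with placeholder paths and a hypothetical PostgreSQL 17 target:

    /usr/pgsql-17/bin/pg_upgrade \
        --old-bindir=/usr/pgsql-11/bin \
        --new-bindir=/usr/pgsql-17/bin \
        --old-datadir=/var/lib/pgsql/11/data \
        --new-datadir=/var/lib/pgsql/17/data \
        --check                 # report incompatibilities only, change nothing

Note that --clone (like --link) requires the old and new data directories to be on the same file system, which usually rules it out when moving to a different server rather than upgrading the OS in place.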
On 10/23/25 07:51, Bala M wrote:
> Hi All,
>
> Thank you for the valuable responses regarding the index corruption
> issue we observed during our test migration of a PostgreSQL 11
> environment from RHEL 7 to RHEL 9 using streaming replication.
>
> Based on the replies, I understand that:
> Suggested upgrading to a newer PostgreSQL version (e.g., v16 or v18)
> and using logical replication or pg_dump/restore for a clean rebuild.

I am not sure version 18 would be a good choice at this time, it has just
been released and has no bug fix releases against it yet. Given the other
complications in your upgrade, OS upgrade and multi-version jump, I don't
think you want to add a brand new version to the mix. Postgres 17 on the
other hand has been out a year and has had 6 bug/security fixes. It is
probably a more stable target version.

--
Adrian Klaver
adrian.klaver@aklaver.com
> Acceptable downtime: ~1 day
>
> Logical replication: Not feasible due to the number of schemas, tables, and overall data volume

I'm not sure why this is not feasible. Can you expand on this?

> For a 15 TB database with roughly 1 day downtime, what would be the most reliable approach to migrate from RHEL 7 → RHEL 9 while avoiding collation/index corruption issues?

pg_dump is the most reliable, and the slowest. Keep in mind that only the actual data needs to move over (not the indexes, which get rebuilt after the data is loaded). You could also mix-n-match pg_logical and pg_dump if you have a few tables that are super large. Whether either approach fits in your 24 hour window is hard to say without you running some tests.
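For scale, the dump/restore route is normally run in directory format with parallel jobs, so the restore, where the indexes get rebuilt, can use many CPUs. A rough sketch with placeholder host names, paths, database name and job counts:

    # Dump from the RHEL 7 primary:
    pg_dumpall -h rhel7-primary --globals-only > globals.sql       # roles and tablespaces
    pg_dump    -h rhel7-primary -Fd -j 8 -f /backup/mydb.dir mydb  # directory format, 8 parallel workers

    # Restore on the RHEL 9 server:
    psql -h rhel9-new -f globals.sql postgres
    createdb -h rhel9-new mydb
    pg_restore -h rhel9-new -d mydb -j 16 /backup/mydb.dir         # indexes are built here, under the new glibc

Because every index is recreated from scratch on the new server, this route sidesteps the glibc collation problem entirely.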
RemoteThreads: 16
LocalThreads: 24
SharedBuffs: 32 GB
MaintWorkMem: 3 GB
CheckPoint: 30 min
MaxWalSize: 36 GB
WalBuffs: 128 MB
This happens because of the glibc version difference between RHEL X and RHEL Y. At this point you either have to rebuild all indexes (sorry!) or redo the upgrade via logical replication (if it works for your app's behaviour).
Devrim
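If the rebuild-everything route is taken, the reindexdb client can drive it across the cluster; a sketch with placeholder names. Note that PostgreSQL 11 has no REINDEX CONCURRENTLY (that, and reindexdb --concurrently, arrived in version 12), so each rebuild takes locks that block writes to the table while it runs:

    # From the shell: rebuild every index in every database of the cluster
    reindexdb --all --echo

    # Or from psql, to schedule the largest objects separately:
    REINDEX TABLE big_schema.big_table;
    REINDEX DATABASE mydb;   -- must be run while connected to mydb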
On Fri, 24 Oct 2025 at 07:17, Devrim Gündüz <devrim@gunduz.org> wrote:
> This happens because of the glibc version difference between RHEL X and
> RHEL Y. At this point you either have to rebuild all indexes (sorry!) or
> redo the upgrade via logical replication (if it works for your app's
> behaviour)

There was a mention in [1] about a wiki page which includes a query to identify which indexes could be affected. While that *could* be all indexes, it does seem unlikely.

David

[1] https://postgr.es/m/CAOg7f80JJCMHXhNLUjnHMwBFV-f9Msv4L=PML0CdaZ4n47uEeA@mail.gmail.com
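The wiki query David refers to essentially lists the indexes whose sort order depends on an OS-provided collation; anything ordered under "C" or "POSIX" is unaffected by a glibc change. A rough approximation of it (not the wiki's exact query):

    SELECT DISTINCT i.indrelid::regclass   AS table_name,
                    i.indexrelid::regclass AS index_name,
                    co.collname
    FROM   pg_index i
    JOIN   pg_attribute a  ON a.attrelid = i.indexrelid AND a.attnum > 0
    JOIN   pg_collation co ON co.oid = a.attcollation   -- only collatable index columns
    WHERE  co.collname NOT IN ('C', 'POSIX')            -- C/POSIX ordering does not come from glibc
    ORDER  BY 1, 2;

Indexes that contain only non-collatable columns (integers, timestamps, bytea and so on) never show up here and do not need to be rebuilt.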
Just wanted to clarify, this failed because OP used streaming which may not work between OS versions? But logical would have been just fine, between OS and PG versions or even to Windows? I always assumed streaming would “just work” as long as it’s the same major PG version and Linux-to-Linux regardless of OS/glibc version. That’s an awesome piece of design work BTW… congratulations and thank you for that. It never occurred to me that there could be an OS influencing factor like the glibc version for streaming replication.
I almost took the same streaming path about two months ago to go from RH7 to RH8; instead opted to take some downtime overnight and do a dump/restore to go from RH7 to RH8 because logical replication was not an option and it seemed safer. Sounds like I chose the better path by chance.
Thank you for sharing your knowledge.
Scot Kreienkamp | Applications Infrastructure Architect | La-Z-Boy Corporate
(734) 384-6403 | 1-734-915-1444 | Scot.Kreienkamp@la-z-boy.com
One La-Z-Boy Drive | Monroe, Michigan 48162 | la-z-boy.com
facebook.com/lazboy | instagram.com/lazboy | youtube.com/lazboy

From: Devrim Gündüz <devrim@gunduz.org>
Sent: Thursday, October 23, 2025 2:17 PM
To: pgsql-general@lists.postgresql.org; Bala M <krishna.pgdba@gmail.com>
Subject: Re: Index corruption issue after migration from RHEL 7 to RHEL 9 (PostgreSQL 11 streaming replication)
Hi,
This happens because of the glibc version difference between RHEL X and RHEL Y. At this point you either have to rebuild all indexes (sorry!) or redo the upgrade via logical replication (if it works for your app's behaviour).
Devrim
Re: Index corruption issue after migration from RHEL 7 to RHEL 9 (PostgreSQL 11 streaming replication)
Any advice, recommendations, or shared experiences from others who have performed similar migrations would be greatly appreciated.
On Thu, Oct 23, 2025 at 10:51 AM Bala M <krishna.pgdba@gmail.com> wrote:
> Any advice, recommendations, or shared experiences from others who have
> performed similar migrations would be greatly appreciated.

Some related advice: put some system in place to make sure you are always running the latest revision of whatever major version you end up on. See:

In particular, note this part: "The community considers performing minor upgrades to be less risky than continuing to run an old minor version."
# "suip" appears to be a sender-local alias (most likely: run the command as the
# postgres user -- an assumption, it is not defined anywhere in the thread).
# pg_ctl -w waits for startup and -t9999 sets the wait timeout in seconds;
# psql -X skips .psqlrc and -c runs a single command.
&& sudo yum install -y Pg17.6_RHEL8/*rpm \
&& suip pg_ctl start -wt9999 \
&& psql -Xc "select version()"
On 10/24/25 05:53, Greg Sabino Mullane wrote:
> On Thu, Oct 23, 2025 at 11:49 AM Adrian Klaver
> <adrian.klaver@aklaver.com> wrote:
>> I am not sure version 18 would be a good choice at this time, it has
>> just been released and has no bug fix releases against it yet. Given
>> the other complications in your upgrade, OS upgrade and multi-version
>> jump, I don't think you want to add a brand new version to the mix.
>> Postgres 17 on the other hand has been out a year and has had 6
>> bug/security fixes. It is probably a more stable target version.
>
> I hear your concern, but I'm pretty confident in v18.

Which is pretty much the criteria for a GA release, confidence that it is ready to face the general public. That is not the same thing as it being guaranteed bug free. Only time facing the manipulations of said public proves how many bugs there are and how bad they are.

> Additionally, they are on version 11 (eleven!) so obviously major
> upgrades are a rare thing, so might as well buy themselves another
> year. :)

Major upgrade being the operative phrase here. The OP and company are going to face enough new hoops to jump through in the move from 11; adding a new release to that mix is pushing it a little too far in my estimation.

> Cheers,
> Greg
>
> --
> Crunchy Data - https://www.crunchydata.com
> Enterprise Postgres Software Products & Tech Support

--
Adrian Klaver
adrian.klaver@aklaver.com
On 10/24/25 08:00, Ron Johnson wrote:
> On Fri, Oct 24, 2025 at 10:54 AM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>> "Never trust a .0 release with important data" is just as true in 2025
>> as it was in 1985.
>
> That's a chicken and egg problem, though, isn't it?

There is nothing stopping you from setting up a test instance and kicking the tires on a new release to see if your setup will work correctly. Though the issue in this case is deciding what part of the 15 TB you would test against or finding the resources to replicate the entire setup.

--
Adrian Klaver
adrian.klaver@aklaver.com
Re: Index corruption issue after migration from RHEL 7 to RHEL 9 (PostgreSQL 11 streaming replication)
On Sat, 25 Oct 2025 at 04:51, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
> On 10/24/25 08:00, Ron Johnson wrote:
>> On Fri, Oct 24, 2025 at 10:54 AM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>>> "Never trust a .0 release with important data" is just as true in 2025
>>> as it was in 1985.
>>
>> That's a chicken and egg problem, though, isn't it?
>
> There is nothing stopping you from setting up a test instance and
> kicking the tires on a new release to see if your setup will work
> correctly.

I'd say it's exactly that attitude that causes people to think .0 should be avoided. Beta versions are meant for test instances. It'd be good if people encouraged their use more often rather than pushing people to defer til GA.

David

[1] https://postgr.es/m/CAOg7f80JJCMHXhNLUjnHMwBFV-f9Msv4L=PML0CdaZ4n47uEeA@mail.gmail.com
On 10/24/25 15:37, David Rowley wrote:
> On Sat, 25 Oct 2025 at 04:51, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>> On 10/24/25 08:00, Ron Johnson wrote:
>>> On Fri, Oct 24, 2025 at 10:54 AM Adrian Klaver
>>> <adrian.klaver@aklaver.com> wrote:
>>>> "Never trust a .0 release with important data" is just as true in 2025
>>>> as it was in 1985.
>>>
>>> That's a chicken and egg problem, though, isn't it?
>>
>> There is nothing stopping you from setting up a test instance and
>> kicking the tires on a new release to see if your setup will work
>> correctly.
>
> I'd say it's exactly that attitude that causes people to think .0
> should be avoided. Beta versions are meant for test instances. It'd be
> good if people encouraged their use more often rather than pushing
> people to defer til GA.

1) From previous posts to this list folks have mentioned their organizations prohibit touching anything less than a GA or maybe a late RC. That comes from on high and I doubt the folks issuing the orders are on this list.

2) The attitude comes from lessons learned in the School of Hard Knocks. Until someone or someones can guarantee a new GA release will not eat your data or spring security leaks, the prudent thing to do is wait to see what happens when it hits the world at large. I learned this lesson, the pitfalls of jumping into something new, across fields outside of software as well. In other words, 'new and improved' is not always the case; see the 737 MAX as a case in point.

3) Progress happens and you need to keep up. A little caution is a good thing though, especially if you are the one who is being held responsible for any adverse outcomes.

> David

--
Adrian Klaver
adrian.klaver@aklaver.com
On Sat, 25 Oct 2025 at 13:40, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
> On 10/24/25 15:37, David Rowley wrote:
>> I'd say it's exactly that attitude that causes people to think .0
>> should be avoided. Beta versions are meant for test instances. It'd be
>> good if people encouraged their use more often rather than pushing
>> people to defer til GA.
>
> 1) From previous posts to this list folks have mentioned their
> organizations prohibit touching anything less than a GA or maybe a late
> RC. That comes from on high and I doubt the folks issuing the orders are
> on this list.

That seems bizarre to me. If they want new releases of PostgreSQL to be as stable as possible as early as possible, then beta and RC are much better times to test than .0 is.

> 2) The attitude comes from lessons learned in the School of Hard Knocks.
> Until someone or someones can guarantee a new GA release will not eat
> your data or spring security leaks, the prudent thing to do is wait to
> see what happens when it hits the world at large. I learned this lesson,
> the pitfalls of jumping into something new, across fields outside of
> software as well. In other words, 'new and improved' is not always the
> case; see the 737 MAX as a case in point.

I don't see why this reason is applicable at all to my statement. I didn't state that everyone should go and run with .0. I said we shouldn't encourage people to test beta and RC versions, as if they don't do that then .0 won't be as stable as if they did test (and report issues). It seems like simple cause and effect to me.

> 3) Progress happens and you need to keep up. A little caution is a good
> thing though, especially if you are the one who is being held
> responsible for any adverse outcomes.

We're talking test servers here. I assume they can be recreated without too much trouble.

David
David Rowley <dgrowleyml@gmail.com> writes:
> On Sat, 25 Oct 2025 at 13:40, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>> 1) From previous posts to this list folks have mentioned their
>> organizations prohibit touching anything less then a GA or maybe a late
>> RC. That comes from on high and I doubt the folks issuing the orders are
>> on this list.
> That seems bizarre to me. If they want new releases of PostgreSQL to
> be as stable as possible as early as possible, then beta and RC are
> much better times to test than .0 is.
I think the folks issuing that sort of order believe that testing is
Somebody Else's Problem. The folly of that approach is pretty evident
to those of us toiling in the software trenches, but maybe not so much
from the C-suite.
regards, tom lane
On Sat, 25 Oct 2025 at 14:53, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> David Rowley <dgrowleyml@gmail.com> writes:
>> On Sat, 25 Oct 2025 at 13:40, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>>> 1) From previous posts to this list folks have mentioned their
>>> organizations prohibit touching anything less than a GA or maybe a late
>>> RC. That comes from on high and I doubt the folks issuing the orders are
>>> on this list.
>
>> That seems bizarre to me. If they want new releases of PostgreSQL to
>> be as stable as possible as early as possible, then beta and RC are
>> much better times to test than .0 is.
>
> I think the folks issuing that sort of order believe that testing is
> Somebody Else's Problem. The folly of that approach is pretty evident
> to those of us toiling in the software trenches, but maybe not so much
> from the C-suite.

Yeah, I expect that'll be true all too often. However, it doesn't mean that people around here can't attempt to propagate an education to them on why they're wrong. There certainly is much less hope if they only receive confirmation.

David
On 10/24/25 18:06, David Rowley wrote:
> On Sat, 25 Oct 2025 at 13:40, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>> 2) The attitude comes from lessons learned in the School of Hard Knocks.
>> Until someone or someones can guarantee a new GA release will not eat
>> your data or spring security leaks, the prudent thing to do is wait to
>> see what happens when it hits the world at large. I learned this lesson,
>> the pitfalls of jumping into something new, across fields outside of
>> software as well. In other words, 'new and improved' is not always the
>> case; see the 737 MAX as a case in point.
>
> I don't see why this reason is applicable at all to my statement. I
> didn't state that everyone should go and run with .0. I said we
> shouldn't encourage people to test beta and RC versions, as if they
> don't do that then .0 won't be as stable as if they did test (and
> report issues). It seems like simple cause and effect to me.

I am not following, from your previous post:

"Beta versions are meant for test instances. It'd be good if people
encouraged their use more often rather than pushing people to defer til GA"

That seems to be the opposite of what you say above.

>> 3) Progress happens and you need to keep up. A little caution is a good
>> thing though, especially if you are the one who is being held
>> responsible for any adverse outcomes.
>
> We're talking test servers here. I assume they can be recreated
> without too much trouble.

Yes, but the OP was talking about upgrading a production database directly to 18. That was what my reply was referring to and what I was counseling against.

> David

--
Adrian Klaver
adrian.klaver@aklaver.com
On Sat, 25 Oct 2025 at 17:36, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
> I am not following, from your previous post:
>
> "Beta versions are meant for test instances. It'd be
> good if people encouraged their use more often rather than pushing
> people to defer til GA"
>
> That seems to be the opposite of what you say above.

I think you think that because you misunderstood what I said in [1]. I'll rephrase it for you:

Because people promote the .0 as not yet production-ready, it means that fewer people bother testing with beta and RC versions. Lack of beta testing is what causes .0 to contain more bugs than it otherwise might, so my suggestion is that we should be encouraging people to run beta and RC in their test environments to try to increase the stability of .0 versions.

I struggle to imagine anyone with any respect for the PostgreSQL project disagreeing with that, so I suspect you and Johnson must have misunderstood. Also, I was highlighting that I agreed with the paradox pointed out by Ron. Clear?

>>> 3) Progress happens and you need to keep up. A little caution is a good
>>> thing though, especially if you are the one who is being held
>>> responsible for any adverse outcomes.
>>
>> We're talking test servers here. I assume they can be recreated
>> without too much trouble.
>
> Yes, but the OP was talking about upgrading a production database
> directly to 18. That was what my reply was referring to and what I was
> counseling against.

You should pay more attention to the quotes above my reply. I don't see any of my replies quoting anything about upgrading a production database to 18.0.

David

[1] https://postgr.es/m/CAApHDvo5pLSwD4JVxW8t94Vc88djKrrSyrL-GbQJ=hb2NVT5zg@mail.gmail.com
On 10/24/25 18:53, Tom Lane wrote:
> David Rowley <dgrowleyml@gmail.com> writes:
>> On Sat, 25 Oct 2025 at 13:40, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>>> 1) From previous posts to this list folks have mentioned their
>>> organizations prohibit touching anything less than a GA or maybe a late
>>> RC. That comes from on high and I doubt the folks issuing the orders are
>>> on this list.
>
>> That seems bizarre to me. If they want new releases of PostgreSQL to
>> be as stable as possible as early as possible, then beta and RC are
>> much better times to test than .0 is.
>
> I think the folks issuing that sort of order believe that testing is
> Somebody Else's Problem. The folly of that approach is pretty evident
> to those of us toiling in the software trenches, but maybe not so much
> from the C-suite.

Yes and no. Yes, someone needs to test new software; no, that is not for everyone. The presence and uptake of LTS software indicates that being on the edge of development is something folks would like to avoid. I know, as I use LTS versions of Ubuntu and Django for that very reason: I want to get work done without worrying about having to pick up 'shiny bits and pieces'. That being said, it allows me the time to kick the tires on new software, of current interest Polars and DuckDB.

It comes down to time available and what you want to spend that time on. That could very well mean making testing 'Somebody Else's Problem', in the way that Postgresql Core has made packaging someone else's problem:

https://www.postgresql.org/message-id/427155.1755538776@sss.pgh.pa.us

"The core project wouldn't document that, because we just ship source code not RPMs. There might be a README in the core postgresql RPM package telling you what's what."

> regards, tom lane

--
Adrian Klaver
adrian.klaver@aklaver.com
Re: Index corruption issue after migration from RHEL 7 to RHEL 9 (PostgreSQL 11 streaming replication)
On 2025-10-25 17:50:59 +1300, David Rowley wrote:
> On Sat, 25 Oct 2025 at 17:36, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
> > I am not following, from your previous post:
> >
> > "Beta versions are meant for test instances. It'd be
> > good if people encouraged their use more often rather than pushing
> > people to defer til GA"
> >
> > That seems to be the opposite of what you say above.
>
> I think you think that because you misunderstood what I said in [1].
> I'll rephrase it for you:
>
> Because people promote the .0 as not yet production-ready, it means
> that fewer people bother testing with beta and RC versions. Lack of
> beta testing is what causes .0 to contain more bugs than it otherwise
> might, so my suggestion is that we should be encouraging people to run
> beta and RC in their test environments to try to increase the
> stability of .0 versions.
Yes, but in your previous message you wrote:
| I said we shouldn't encourage people to test beta and RC versions,
Note: "shouldn't" instead of "should". Which is exactly the opposite.
I (apparently correctly) assumed that this was a typo, but that wasn't
entirely clear and Adrian obviously took it at face value.
> > > We're talking test servers here.
> >
> > Yes, but the OP was talking about upgrading a production database
> > directly to 18. That was what my reply was referring to and what I was
> > counseling against.
>
> You should pay more attention to the quotes above my reply. I don't
> see any of my replies quoting anything about upgrading a production
> database to 18.0.
Well, you came into this thread replying to Adrian, so maybe you should
have paid more attention to what he was (originally) replying to? Or at
least make it clearer that you are changing the subject? Something like
"I agree in this specific case, but in general ..." is often helpful to
alert people that you aren't commenting on the specific problem they are
currently thinking about.
hjp
--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | hjp@hjp.at | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"
On 10/24/25 21:50, David Rowley wrote:
> On Sat, 25 Oct 2025 at 17:36, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>> I am not following, from your previous post:
>>
>> "Beta versions are meant for test instances. It'd be
>> good if people encouraged their use more often rather than pushing
>> people to defer til GA"
>>
>> That seems to be the opposite of what you say above.
>
> I think you think that because you misunderstood what I said in [1].
> I'll rephrase it for you:
>
> Because people promote the .0 as not yet production-ready, it means
> that fewer people bother testing with beta and RC versions. Lack of
> beta testing is what causes .0 to contain more bugs than it otherwise
> might, so my suggestion is that we should be encouraging people to run
> beta and RC in their test environments to try to increase the
> stability of .0 versions.

Alright, that I understand, though not necessarily agree with. I would say lack of testing has more to do with time/money management. Organizations don't want to spend either until: 1) they see the dust settle on what is going to end up in the release; 2) they see whether there is anything interesting enough to invest both in moving to a new release. Maybe there is a compelling argument that can be made to get those organizations off the fence. I just don't know what it is, as you would have to convince them to spend time and money rather than just wait and let the community as a whole do the work.

> I struggle to imagine anyone with any respect for the PostgreSQL
> project disagreeing with that, so I suspect you and Johnson must have
> misunderstood.

No, we didn't misunderstand, we were responding to what the OP was proposing, which was jumping a production instance from 11 --> 18. That is a different case than promoting testing of betas and RCs.

> David

--
Adrian Klaver
adrian.klaver@aklaver.com
Re: Index corruption issue after migration from RHEL 7 to RHEL 9 (PostgreSQL 11 streaming replication)
On 2025-10-25 10:39:35 -0400, Ron Johnson wrote:
> On Sat, Oct 25, 2025 at 10:21 AM Adrian Klaver <adrian.klaver@aklaver.com>
> wrote:
>
> On 10/24/25 21:50, David Rowley wrote:
> > Because people promote the .0 as not yet production-ready, it means
> > that fewer people bother testing with beta and RC versions. Lack of
> > beta testing is what causes .0 to contain more bugs than it otherwise
> > might, so my suggestion is that we should be encouraging people to run
> > beta and RC in their test environments to try to increase the
> > stability of .0 versions.
>
> Alright that I understand, though not necessarily agree with. I would
> say lack of testing has more to do with time/money management.
> Organizations don't want to spend either until: 1) They see the dust
> settle on what is going to end up in the release. 2) Whether there is
> anything interesting enough to invest both in moving to a new release.
> Maybe there is a compelling argument that can be made to get those
> organizations off the fence. I just don't know what it is, as you would have
> to convince them to spend time and money rather than just wait and let
> the community as a whole do the work.
>
>
> Contractual requirements to not run EOL software are a strong motivator to
> migrate to newer versions of OS and RDBMS.
But not to test beta or RC versions. With PostgreSQL's 5 year maintenance
window there is plenty of time to test with final versions before doing
the upgrade. For example, if a company adopted a policy of waiting for
the x.2 release, then testing for 6 months before upgrading their production
servers, they would still get 4 years of production use out of that major
version before having to upgrade again.
Personally, I usually use the current version for new databases (so if
I started a new project right now I'd use 18.0), but I'm conservative
about upgrades (so I usually don't upgrade to a new major version until
EOL is near and if I still had a 13.x, I'd probably go for 17.6 instead of
18.0). I also admit that I test beta versions only very rarely (basically
only if there is a new feature I'm really excited about).
hjp
--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | hjp@hjp.at | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"
After reviewing the options, the logical replication approach seems to be the most feasible one with minimal downtime.
However, we currently have 7 streaming replication setups running from production, with a total database size of around 15 TB. Out of this, there are about 10 large tables ranging from 1 TB (max) to 50 GB (min) each, along with approximately 150+ sequences.
Could you please confirm if there are any successful case studies or benchmarks available for a similar setup?
Additionally, please share any recommended parameter tuning or best practices for handling logical replication at this scale.
Current server configuration:
CPU: 144 cores
RAM: 512 GB
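There is no one-size-fits-all tuning for a 15 TB initial sync, but the settings that usually matter for logical replication at this scale are the worker and slot limits on both sides. A sketch of the relevant postgresql.conf entries, with purely illustrative values:

    # Publisher (RHEL 7 primary):
    wal_level = logical
    max_wal_senders = 20                    # subscriptions plus the existing streaming standbys
    max_replication_slots = 20

    # Subscriber (RHEL 9):
    max_logical_replication_workers = 8     # apply workers plus table-sync workers
    max_sync_workers_per_subscription = 4   # parallel initial COPY per subscription
    max_worker_processes = 32               # must accommodate the workers above
    maintenance_work_mem = '2GB'            # faster index builds once the copy finishes

Two caveats worth planning for: logical replication does not carry sequence values, so the ~150 sequences have to be set on the target at cutover (for example with setval()), and it does not replicate DDL, so the schema must be created on the target first (typically from pg_dump --schema-only).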
On Thu, 23 Oct 2025 at 17:21, Greg Sabino Mullane <htamfids@gmail.com> wrote:
> pg_dump is the most reliable, and the slowest. Keep in mind that only the
> actual data needs to move over (not the indexes, which get rebuilt after
> the data is loaded). You could also mix-n-match pg_logical and pg_dump if
> you have a few tables that are super large. Whether either approach fits
> in your 24 hour window is hard to say without you running some tests.

Long time ago I had a similar problem and did a "running with scissors" restore. This means:

1.- Prepare the normal configuration, test, etc. for the new version.

2.- Prepare a restore configuration, with fsync=off, wal_level=minimal, whatever option gives you any speed advantage. As the target was empty, if the restore failed we could just clean up and restart.

3.- Dump, boot with the restore configuration, restore, clean shutdown, switch to the production configuration, boot again and follow on.

Time has passed and I lost my notes, but I remember the restore was much faster than doing it with the normal production configuration. Given current machine speeds, it may be doable.

Francisco Olarte.
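For reference, the "restore configuration" Francisco describes usually comes down to a handful of settings. A sketch with illustrative values, safe only because the target cluster is empty and can be thrown away and rebuilt if the restore fails, and meant to be reverted before production use:

    # Temporary settings for a bulk restore into a disposable cluster:
    fsync = off                     # unsafe for real data; acceptable only if a failed restore is simply redone
    full_page_writes = off
    synchronous_commit = off
    wal_level = minimal             # no WAL kept for replication/PITR during the load
    max_wal_senders = 0             # required when wal_level = minimal
    archive_mode = off
    max_wal_size = '64GB'           # fewer forced checkpoints during the load
    checkpoint_timeout = '30min'
    maintenance_work_mem = '2GB'    # faster index and constraint builds
    autovacuum = off                # run VACUUM ANALYZE manually afterwards

Combined with pg_restore --jobs, most of the remaining wall-clock time goes into the parallel index builds, which is exactly the work that has to happen under the new glibc anyway.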