Discussion: BUG #17143: when the CPU is different, the index on the primary is ok but the index on the standby is damaged

BUG #17143: when the CPU is different, the index on the primary is ok but the index on the standby is damaged

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      17143
Logged by:          lcj
Email address:      lcj122@163.com
PostgreSQL version: 11.5
Operating system:   Linux
Description:

Hello, I have encountered a problem in building a primary and standby
cluster using physical streaming replication on different CPU machines.
We created the primary on the x86 machine, and created the standby on the
arm machine. Now using amcheck to check the index, we find that the same
index is no problem on the primary, but the standby is indeed damaged.
The error reported on the standby machine is as follows:
ERROR:  item order invariant violated for index "xxxx"
DETAIL:  Lower index tid=(965,50) (points to heap tid=(13502,8)) higher
index tid=(965,51) (points to heap tid=(40017,19)) page lsn=392/59A2C8.

Does anyone know the reason? thanks.
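
For readers unfamiliar with the message: the two index tids in the error, (965,50) and (965,51), are adjacent items on the same index page, and the check fails because the first one sorts higher than the second under the comparison rules in effect on the machine running the check. A minimal sketch of that kind of adjacent-pair check follows. It is an illustration only, not amcheck's code; `first_order_violation` and `compare_ints` are hypothetical names, and plain integers stand in for index keys.

```python
from typing import Callable, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")


def first_order_violation(
    items: Sequence[T], compare: Callable[[T, T], int]
) -> Optional[Tuple[int, int]]:
    """Return the offsets of the first adjacent pair that is out of order.

    compare(a, b) follows the usual convention: negative if a < b,
    zero if equal, positive if a > b. Returns None if every adjacent
    pair is in non-decreasing order.
    """
    for offset in range(len(items) - 1):
        if compare(items[offset], items[offset + 1]) > 0:
            return (offset, offset + 1)
    return None


def compare_ints(a: int, b: int) -> int:
    """Three-way comparison for integers (hypothetical stand-in comparator)."""
    return (a > b) - (a < b)


# A made-up "page" whose third and fourth items are swapped.
page = [10, 20, 40, 30, 50]
print(first_order_violation(page, compare_ints))  # (2, 3)
```

The point of the sketch is that the result depends on `compare` as much as on the stored items: the same page can pass with one comparator and fail with another.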


Re: BUG #17143: when the CPU is different, the index on the primary is ok but the index on the standby is damaged

From
"David G. Johnston"
Date:
On Fri, Aug 13, 2021 at 7:56 AM PG Bug reporting form <noreply@postgresql.org> wrote:
> The following bug has been logged on the website:
>
> Bug reference:      17143
> Logged by:          lcj
> Email address:      lcj122@163.com
> PostgreSQL version: 11.5
> Operating system:   Linux
> Description:
>
> Hello, I have encountered a problem in building a primary and standby
> cluster using physical streaming replication on different CPU machines.
> We created the primary on the x86 machine, and created the standby on the
> arm machine. Now using amcheck to check the index, we find that the same
> index is no problem on the primary, but the standby is indeed damaged.
> The error reported on the standby machine is as follows:
> ERROR:  item order invariant violated for index "xxxx"
> DETAIL:  Lower index tid=(965,50) (points to heap tid=(13502,8)) higher
> index tid=(965,51) (points to heap tid=(40017,19)) page lsn=392/59A2C8.
>
> Does anyone know the reason? thanks.


I cannot explain the technical reason well, but physical replication is not supported, and not generally expected to work, when the primary and standby are not essentially the same.  A different CPU architecture is decidedly a material difference.  In particular, this property falls under "hardware architecture" (it is in fact the property around which "architecture" is defined), which is specifically called out in the docs:


"""
Hardware need not be exactly the same, but experience shows that maintaining two identical systems is easier than maintaining two dissimilar ones over the lifetime of the application and system. In any case the hardware architecture must be the same — shipping from, say, a 32-bit to a 64-bit system will not work.
"""

David J.
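
The requirement quoted above can be turned into a simple pre-flight comparison before building a standby. The following is a rough sketch, not a PostgreSQL facility: `describe_host` and `architecture_mismatches` are hypothetical helpers, the three properties compared are an assumed and incomplete proxy for "hardware architecture", and the `x86_64` / `aarch64` values are illustrative (the report only says "x86" and "arm").

```python
import platform
import struct
import sys
from typing import Dict, List, Union

HostInfo = Dict[str, Union[str, int]]


def describe_host() -> HostInfo:
    """Collect a few architecture properties of the machine running this code."""
    return {
        "machine": platform.machine(),              # CPU family as reported by the OS
        "pointer_bits": struct.calcsize("P") * 8,   # 32 or 64 on common platforms
        "byte_order": sys.byteorder,                # "little" or "big"
    }


def architecture_mismatches(primary: HostInfo, standby: HostInfo) -> List[str]:
    """Return the sorted names of properties that differ between two hosts."""
    return sorted(key for key in primary if primary[key] != standby.get(key))


# Illustrative descriptors; in practice each would come from running
# describe_host() on the respective machine.
primary_info = {"machine": "x86_64", "pointer_bits": 64, "byte_order": "little"}
standby_info = {"machine": "aarch64", "pointer_bits": 64, "byte_order": "little"}
print(architecture_mismatches(primary_info, standby_info))  # ['machine']
```

An empty result from such a check would not prove the pair is safe for physical replication, since operating system and locale differences matter as well; a non-empty result is a reason to stop.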
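

Re: BUG #17143: when the CPU is different, the index on the primary is ok but the index on the standby is damaged

From
Tom Lane
Date:
PG Bug reporting form <noreply@postgresql.org> writes: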
> Hello, I have encountered a problem in building a primary and standby
> cluster using physical streaming replication on different CPU machines.
> We created the primary on the x86 machine, and created the standby on the
> arm machine.

You should not expect that to work.  Physical replication requires
identical hardware architectures, not to mention mostly-the-same
operating system software.  (The most usual gotcha for cross-system
replication is different locale sort orders.)

            regards, tom lane
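
To illustrate the sort-order gotcha mentioned above: an index built under one ordering is searched by assuming that same ordering, so a machine that compares keys differently can fail to find entries that are physically present. The sketch below is a toy model under stated assumptions. A sorted Python list stands in for the index, and two hypothetical key functions stand in for two systems' collation rules (`byte_order_key` for a byte-wise ordering, `case_insensitive_key` for a linguistic one); neither is a real locale implementation, and nothing here claims to be the cause of this particular report.

```python
import bisect
from typing import Callable, List


def byte_order_key(value: str) -> bytes:
    """Stand-in for the primary's ordering: compare raw UTF-8 bytes."""
    return value.encode("utf-8")


def case_insensitive_key(value: str) -> str:
    """Stand-in for the standby's ordering: ignore letter case."""
    return value.casefold()


def build_index(values: List[str], sort_key: Callable) -> List[str]:
    """Build the toy index: the values sorted under the given ordering."""
    return sorted(values, key=sort_key)


def index_contains(index: List[str], value: str, sort_key: Callable) -> bool:
    """Binary-search the index, trusting that it is ordered by sort_key.

    Assumes no two distinct entries share the same key.
    """
    keys = [sort_key(entry) for entry in index]
    pos = bisect.bisect_left(keys, sort_key(value))
    return pos < len(index) and index[pos] == value


# Index is built with the primary's ordering and copied byte-for-byte.
index = build_index(["date", "apple", "Cherry", "Banana"], byte_order_key)
print(index)                                                 # ['Banana', 'Cherry', 'apple', 'date']
print(index_contains(index, "apple", byte_order_key))        # True
print(index_contains(index, "apple", case_insensitive_key))  # False
```

The entry "apple" never leaves the list; the second lookup misses it only because the binary search steers by an ordering the list was not built with.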