Logical replication stall on box type with REPLICA IDENTITY FULL due to lack of exact equality

Поиск
Список
Период
Сортировка
От Laurence Parry
Тема Logical replication stall on box type with REPLICA IDENTITY FULL due to lack of exact equality
Дата
Msg-id VI0PR02MB107029CB01035F7025F1021EDA598A@VI0PR02MB10702.eurprd02.prod.outlook.com
обсуждение исходный текст
Список pgsql-general
Long story short, I tried setting REPLICA IDENTITY FULL on a table with a box type and it made replication stall on the receiver because - as I found out later - the predefined box type seems to lack a way to definitively determine exact equality of coordinates.

I was tempted to post this as a bug because - while discouraged - FULL works with other types, and from a user perspective based on the string output the box type seems like it could at least have hash-based equality, if not a btree total ordering.

The wording of https://www.postgresql.org/docs/current/logical-replication-publication.html also implies to me that FULL should work in all cases with default types: "If there are no such suitable indexes, the search on the subscriber side can be very inefficient, therefore replica identity FULL should only be used as a fallback if no other solution is possible." - so inefficient, but still working?

I might be missing something obvious, so I figured I'd bring it up here for advice. If not... well, it could've led to a production outage if I hadn't caught it in time, and this kind of issue seems likely to increase as logical replication becomes more popular.

Related reading:

--

We have a cascading replica setup in production; the replicas are used to offload CPU-intensive recommendation queries and disaster recovery. In the PostgreSQL 16 upgrade cycle we switched from physical to logical replication. This entailed a few gotchas - we have a decade-and-a-half old, rapidly-developed schema, and had to set up some primary keys or use other indexes for replica identity. In cases where there was no obvious candidate, and the tables weren't busy, we set REPLICA IDENTITY FULL.

One that we missed until now, as it is rarely changed, is content servers areas - a set of boxes in ascending priority order defining the geospatial areas within which users should be directed to particular content servers. (It was great to find that PostgreSQL had support for these - like inet/cidr and uuid - out of the box.)

fb=# \d content_servers_areas
             Table "public.content_servers_areas"
      Column       |   Type   | Collation | Nullable | Default
-------------------+----------+-----------+----------+---------
 content_server_id | smallint |           |          |
 priority          | smallint |           |          |
 area              | box      |           |          |

db=# SELECT * FROM content_servers_areas WHERE content_server_id = 10;
 content_server_id | priority |          area
-------------------+----------+------------------------
                10 |       53 | (90,-96),(-90,-170)
                10 |       65 | (90,-27),(-90,-170)
                10 |       35 | (48,-81.5),(35,-105.5)
                10 |       35 | (90,-81.5),(48,-128)
                10 |        5 | (48,-84.8),(37,-99)
(5 rows)

The use of logical replication became an issue when we wanted to alter an existing area:

db=# UPDATE content_servers_areas SET area = '(48,-89),(37,-99)' WHERE area = '(48,-84.8),(37,-99)';
ERROR:  cannot update table "content_servers_areas" because it does not have a replica identity and publishes updates
HINT:  To enable updating the table, set REPLICA IDENTITY using ALTER TABLE.
db=# ALTER TABLE content_servers_areas REPLICA IDENTITY FULL;
ALTER TABLE

This seemed to work, right up to the point our Munin monitoring infrastructure trigged a critical warning for high WAL levels on the sender a few hours later. Looking at the replica logs, the issue became obvious.

2023-12-26 01:55:49.849 UTC,,,2752638,,658a32a5.2a007e,1,,2023-12-26 01:55:49 UTC,8/310000,0,LOG,00000,"logical replication apply worker for subscription ""replica"" has started",,,,,,,,"InitializeApplyWorker, worker.c:4514","","logical replication worker",,0
2023-12-26 01:55:49.930 UTC,,,2752638,,658a32a5.2a007e,2,,2023-12-26 01:55:49 UTC,8/0,0,ERROR,42883,"could not identify an equality operator for type box",,,,,"processing remote data for replication origin ""pg_18583"" during message type ""UPDATE"" for replication target relation ""public.content_servers_areas"" in transaction 256215511, finished at 244/5DE85B48",,,"tuples_equal, execReplication.c:309","","logical replication worker",,0
2023-12-26 01:55:49.932 UTC,,,1314286,,65819479.140dee,13166,,2023-12-19 13:02:49 UTC,,0,LOG,00000,"background worker ""logical replication worker"" (PID 2752638) exited with exit code 1",,,,,,,,"LogChildExit, postmaster.c:3660","","postmaster",,0

ALTER SUBSCRIPTION replica SKIP (lsn = '244/5DE85B48'); worked to clear this - I had to do it a few times as we'd made further changes.

Unfortunately I can't do the obvious:
db=# CREATE UNIQUE INDEX content_server_id_area ON content_servers_areas (content_server_id, area);
ERROR:  data type box has no default operator class for access method "btree"
HINT:  You must specify an operator class for the index or define a default operator class for the data type

Indeed, this appears to be the case: 
db=# SELECT am.amname AS index_method,
       opc.opcname AS opclass_name,
       opc.opcintype::regtype AS indexed_type,
       opc.opcdefault AS is_default
    FROM pg_am am, pg_opclass opc
    WHERE opc.opcmethod = am.oid and opc.opcintype::regtype::text = 'box'
    ORDER BY index_method, opclass_name;
 index_method |   opclass_name    | indexed_type | is_default
--------------+-------------------+--------------+------------
 brin         | box_inclusion_ops | box          | t
 gist         | box_ops           | box          | t
 spgist       | box_ops           | box          | t
(3 rows)

And per https://www.postgresql.org/docs/devel/sql-createindex.html#id-1.9.3.69.7 "Only b-tree currently supports unique indexes." (Even in devel? I thought hash was meant to be an option... I guess https://www.postgresql.org/message-id/flat/CAHut%2BPuzhADSevvypRikx4mtT_1B6_14Z1iCcfhEYVM5EsgGGQ%40mail.gmail.com#91a2c03066f1df409d6101f3de84cfbf is more about using hash in replica identity rather than unique indexes.)
...so you get `ERROR:  access method "gist" does not support unique indexes` if you try USING gist().

I get the impression that PostGIS might be able to handle this:
...but I'm hoping to avoid adding a whole geospatial solution just for this.

This seems like it should work but doesn't:
db=# CREATE UNIQUE INDEX content_server_id_area ON content_servers_areas (content_server_id, CAST(area AS text));
CREATE INDEX
db=# ALTER TABLE content_servers_areas REPLICA IDENTITY USING INDEX content_server_id_area;
ERROR:  cannot use expression index "content_server_id_area" as replica identity

The dubious 'content_servers_areas (content_server_id, AREA(area));' fails likewise; and while ~= might be helpful, I can't see how to use it for this purpose - maybe a trigger-updated column? Although at this point I'm minded to just remove it from replication.

It was said in one of the above threads that "there's no plausible linear ordering of boxes", but coordinates are already reordered into (top right, bottom left) when set, so perhaps treating the individual elements of each coordinate pair as numbers might work? Failing that, is a hash operator possible? Maybe negative zero is an issue? ( https://www.virtualdub.org/blog2/entry_259.html )

I was also a little disappointed that I didn't get a warning about using REPLICA IDENTITY FULL to start with, as this confirmed my assumption that it should work, just inefficiently. I appreciate what matters is the indexes on the replica, which may differ from the source, but in an ideal world it might've warned "if you try to apply a full row with the indexes defined on this end, it won't work".

Best wishes for the holiday,
-- 
Laurence 'GreenReaper' Parry

В списке pgsql-general по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Query crash with 15.5 on debian bookworm/armv8
Следующее
От: Kaushik Iska
Дата:
Сообщение: Intermittent Issue with WAL Segment Removal in Logical Replication