Обсуждение: Improve logical replication usability when tables lack primary keys

Поиск
Список
Период
Сортировка

Improve logical replication usability when tables lack primary keys

От
Chao Li
Дата:
* BACKGROUND

This requirement comes from several users operating large deployments, particularly in HIS (Hospital Information Systems). The situation can be summarized as follows:

- A central DB operations team maintains the main database and configures logical replication for all tables.
- Multiple third-party application vendors are allowed to create new tables in that database.
- Some of these newly created tables lack a primary key. Since logical replication with `REPLICATION IDENTITY DEFAULT` requires a primary key, such tables silently fail to replicate.
- The DB operations team must then spend significant effort identifying the affected tables and correcting them manually.

In practice, these environments would benefit from a safe fallback: if a table has no primary key, logical replication should automatically switch from `REPLICATION IDENTITY DEFAULT` to `FULL`, ensuring replication continues rather than breaking.

I don't intend to debate whether this operational model is ideal; it is simply the reality in many deployments. These database operations teams have developed and refined their practices over many years, and as a database vendor we have limited influence over how they manage their environments.

* PROPOSED SOLUTION

I evaluated a few approaches and am proposing the following:

- Introduce a new GUC: `logical_replication_fallback_to_full_identity`.
- When enabled, if a table being logically replicated has no primary key, the system automatically uses `REPLICATION IDENTITY FULL` for that table.
- This setting can be applied at the database level, so large systems do not need to enable it cluster-wide unless desired.
- When the WAL sender transmits relation metadata, if fallback has occurred, it explicitly reports `FULL` as the replication identity to the subscriber, so there is limited impact on the subscriber.

* NEXT STEPS

The attached patch is an initial implementation. It does not yet include tests or documentation updates. I would appreciate feedback on the design approach first. If the direction seems reasonable, I will proceed with refining the patch and adding documentation and tests.

Thanks in advance for your review.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/




Вложения

Re: Improve logical replication usability when tables lack primary keys

От
Amit Kapila
Дата:
On Mon, Nov 10, 2025 at 1:36 PM Chao Li <li.evan.chao@gmail.com> wrote:
>
> * BACKGROUND
>
> This requirement comes from several users operating large deployments, particularly in HIS (Hospital Information
Systems).The situation can be summarized as follows: 
>
> - A central DB operations team maintains the main database and configures logical replication for all tables.
> - Multiple third-party application vendors are allowed to create new tables in that database.
> - Some of these newly created tables lack a primary key. Since logical replication with `REPLICATION IDENTITY
DEFAULT`requires a primary key, such tables silently fail to replicate. 
> - The DB operations team must then spend significant effort identifying the affected tables and correcting them
manually.
>

Can you share an example of how we silently fail to replicate? Won't
in such cases UPDATE/DELETE will anyway raise an ERROR?

--
With Regards,
Amit Kapila.



Re: Improve logical replication usability when tables lack primary keys

От
"Euler Taveira"
Дата:
On Mon, Nov 10, 2025, at 5:06 AM, Chao Li wrote:
> I evaluated a few approaches and am proposing the following:
>
> - Introduce a new GUC: `logical_replication_fallback_to_full_identity`.
> - When enabled, if a table being logically replicated has no primary 
> key, the system automatically uses `REPLICATION IDENTITY FULL` for that 
> table.
> - This setting can be applied at the database level, so large systems 
> do not need to enable it cluster-wide unless desired.
> - When the WAL sender transmits relation metadata, if fallback has 
> occurred, it explicitly reports `FULL` as the replication identity to 
> the subscriber, so there is limited impact on the subscriber.
>

If I understand your proposal correctly, you want to add a new fallback to
replica identity. We already have a fallback for DEFAULT that means no primary
key is the same as NOTHING. I didn't like your proposal. It is too restrictive.

However, I see some usefulness in introducing a GUC default_replica_identity.
The proposal is similar to access method (default_table_access_method). The
DEFAULT option selects the replica identity sets as default_replica_identity
parameter. You need to add a new option (PRIMARY KEY); that should be the
default value. (If we don't want to break the backward compatibility, this new
option should fallback to NOTHING if there is no primary key. Another
alternative is to have a strict and non-strict option. I prefer the former.) Of
course, the USING INDEX option cannot be used. For pg_dump, you need to use SET
command to inform the default_replica_identity value so tables with the same
option as default_replica_identity doesn't emit an ALTER TABLE command.


-- 
Euler Taveira
EDB   https://www.enterprisedb.com/



Re: Improve logical replication usability when tables lack primary keys

От
Chao Li
Дата:
Hi Amit,

Thanks for asking.

> On Nov 11, 2025, at 19:18, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Nov 10, 2025 at 1:36 PM Chao Li <li.evan.chao@gmail.com> wrote:
>>
>> * BACKGROUND
>>
>> This requirement comes from several users operating large deployments, particularly in HIS (Hospital Information
Systems).The situation can be summarized as follows: 
>>
>> - A central DB operations team maintains the main database and configures logical replication for all tables.
>> - Multiple third-party application vendors are allowed to create new tables in that database.
>> - Some of these newly created tables lack a primary key. Since logical replication with `REPLICATION IDENTITY
DEFAULT`requires a primary key, such tables silently fail to replicate. 
>> - The DB operations team must then spend significant effort identifying the affected tables and correcting them
manually.
>>
>
> Can you share an example of how we silently fail to replicate? Won't
> in such cases UPDATE/DELETE will anyway raise an ERROR?
>

Yes, UPDATE/DELETE will fail. That’s the easy case to expose the error. Actually my patch will allow the update/delete.

However, some tables, like dictionary tables, they are important, but don’t have much update/delete, they may silently
failto replicate. 

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/







Re: Improve logical replication usability when tables lack primary keys

От
Chao Li
Дата:

> On Nov 11, 2025, at 20:09, Euler Taveira <euler@eulerto.com> wrote:
>
> On Mon, Nov 10, 2025, at 5:06 AM, Chao Li wrote:
>> I evaluated a few approaches and am proposing the following:
>>
>> - Introduce a new GUC: `logical_replication_fallback_to_full_identity`.
>> - When enabled, if a table being logically replicated has no primary
>> key, the system automatically uses `REPLICATION IDENTITY FULL` for that
>> table.
>> - This setting can be applied at the database level, so large systems
>> do not need to enable it cluster-wide unless desired.
>> - When the WAL sender transmits relation metadata, if fallback has
>> occurred, it explicitly reports `FULL` as the replication identity to
>> the subscriber, so there is limited impact on the subscriber.
>>
>
> If I understand your proposal correctly, you want to add a new fallback to
> replica identity. We already have a fallback for DEFAULT that means no primary
> key is the same as NOTHING. I didn't like your proposal. It is too restrictive.
>
> However, I see some usefulness in introducing a GUC default_replica_identity.
> The proposal is similar to access method (default_table_access_method). The
> DEFAULT option selects the replica identity sets as default_replica_identity
> parameter. You need to add a new option (PRIMARY KEY); that should be the
> default value. (If we don't want to break the backward compatibility, this new
> option should fallback to NOTHING if there is no primary key. Another
> alternative is to have a strict and non-strict option. I prefer the former.) Of
> course, the USING INDEX option cannot be used. For pg_dump, you need to use SET
> command to inform the default_replica_identity value so tables with the same
> option as default_replica_identity doesn't emit an ALTER TABLE command.
>

Hi Euler,

Thank you very much for the valuable feedback. These are a lot of useful information. As I mentioned in my first email,
myproposal was just an initial implementation, I am open for discussion from the design perspective. 

Actually I explored the solution of adding a GUC for default_replication_identify. Let me briefly list solutions I
explored:

1. The first solution I explored was adding a GUC for replication_identify_fallback_method, possible options are
“nothing”and “full”. I gave up that because the solution is also an equivalent to the one I proposed of a bool option
(false->nothing,true->full) and a bool option is easier to use. 

2. Then I considered to add a GUC for default replication identity which is the same as you suggested. I gave up that
becausethis solution would require to update all existing tables’ replication identities. 

3. I also considered to add a new replication identity, I hadn't named it, but meaning was using primary key and
fallbackto full. I gave up that because it’s too much complicated than other solutions, and that would also required to
updateall existing tables’ replication identities. 

4. Finally I decided the one I proposed. The main reason I chose it is because 1) production deployments wouldn't need
toupdate existing table’s replication identity; 2) the change only needs to be applied in the wal-sender side; 3)
withoutturning on the GUC option, no any impact. 

Given there is a similar GUC option default_table_access_method (I wasn’t aware of that), I think 2 as you suggested
mightbe the direction to go along with. 

Let’s wait a few more days to see if other folks may comment as well.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/