Обсуждение: making tid and HOTness of UPDATE available to logical decoding plugins

Поиск
Список
Период
Сортировка

making tid and HOTness of UPDATE available to logical decoding plugins

От
Hannu Krosing
Дата:
Please find attached a patch that makes tuple ids and info about
weather it was plain or HOT update available to logical decoding
callbacks.

Also modified test_decoding to show both tids -
- old tid has format -(pageno, slot)
- new tid has format +(pageno, slot)
if it is a HOT update, it is decoded prefixed with 'HOT '

Sample usage:

hannu=# SELECT pg_create_logical_replication_slot('test_slot', 'test_decoding');
pg_create_logical_replication_slot
------------------------------------
(test_slot,0/1BF1B38)
(1 row)
hannu=# CREATE TABLE nokey(data text);
CREATE TABLE
hannu=# insert into nokey (data) values('a');
INSERT 0 1
hannu=# update nokey set data = 'b';
UPDATE 1
hannu=# delete from nokey ;
DELETE 1
hannu=# SELECT lsn, xid, data FROM
pg_logical_slot_get_changes('test_slot', NULL, NULL);
lsn | xid | data
-----------+-----+------------------------------------------------------------
0/1C20538 | 767 | BEGIN 767
0/1C2B1E8 | 767 | COMMIT 767
0/1C2B220 | 768 | BEGIN 768
0/1C2B220 | 768 | table public.nokey: INSERT:+(0,1) data[text]:'a'
0/1C2B290 | 768 | COMMIT 768
0/1C2B300 | 769 | BEGIN 769
0/1C2B300 | 769 | table public.nokey: HOT UPDATE:-(0,1)+(0,2) data[text]:'b'
0/1C2B378 | 769 | COMMIT 769
0/1C2B3B0 | 770 | BEGIN 770
0/1C2B3B0 | 770 | table public.nokey: DELETE:-(0,2) (no-tuple-data)
0/1C2B418 | 770 | COMMIT 770
(11 rows)

My planned use case is for reliable logical replication of tables
without primary key or other declared IDENTITY (as long as there are
no updates on target, or at leas no non-hot updates)

Sending thgis part as an independent patch as there may be other
interesting use cases as well.

--
Hannu

Вложения

Re: making tid and HOTness of UPDATE available to logical decoding plugins

От
Mihail Nikalayeu
Дата:
Hello!

Thanks for the patch.

Few comments:

1) tests are failing, expected output files need to be updated

2)
> * Treat HOT update as normal updates. There is no useful
> * information in the fact that we could make it a HOT update
> * locally and the WAL layout is compatible.

I think it feels a little bit irrelevant now. Also, I'll prefer to
give XLOG_HEAP_HOT_UPDATE a dedicated case switch.

3) _format_tid - not sure _ prefix is a good idea here, but not sure.

4) new double newlines before and after _format_tid

5)
 > if (change->data.tp.newctid.ip_posid)
Should we change it to
if (ItemPointerIsValid(&change->data.tp.newctid))

Best regards,
Mikhail.



Re: making tid and HOTness of UPDATE available to logical decoding plugins

От
Hannu Krosing
Дата:
On Thu, Dec 4, 2025 at 11:35 PM Mihail Nikalayeu
<mihailnikalayeu@gmail.com> wrote:
>
> Hello!
>
> Thanks for the patch.

Thanks for reviewing :)

> Few comments:
>
> 1) tests are failing, expected output files need to be updated

Yes, I didn't want to start changing tests before other parts stabilized a bit.

> 2)
> > * Treat HOT update as normal updates. There is no useful
> > * information in the fact that we could make it a HOT update
> > * locally and the WAL layout is compatible.
>
> I think it feels a little bit irrelevant now. Also, I'll prefer to
> give XLOG_HEAP_HOT_UPDATE a dedicated case switch.

I added it after the comments about the need to track HOT in case we
would use it for collecting index entries.

As it is implemented now it should not affect any users who are not
interested in HOT.

> 3) _format_tid - not sure _ prefix is a good idea here, but not sure.

yeah, need to look around a little for established use in these parts
of the code.

> 4) new double newlines before and after _format_tid

ack


> 5)
>  > if (change->data.tp.newctid.ip_posid)
> Should we change it to
> if (ItemPointerIsValid(&change->data.tp.newctid))

I'll have to check. It looks likely that the whole check is redundant
and the tids are always available

> Best regards,
> Mikhail.



Re: making tid and HOTness of UPDATE available to logical decoding plugins

От
"Euler Taveira"
Дата:
On Thu, Dec 4, 2025, at 5:58 PM, Hannu Krosing wrote:
> Please find attached a patch that makes tuple ids and info about
> weather it was plain or HOT update available to logical decoding
> callbacks.
>

My first impression was: why do you want to expose an internal information that
is mostly useless for a broader audience? The logical decoding infrastructure
is a general purpose solution for streaming modifications made to Postgres.
Could you elaborate how other consumers (DBMS, data store, ...) would use it?

> My planned use case is for reliable logical replication of tables
> without primary key or other declared IDENTITY (as long as there are
> no updates on target, or at leas no non-hot updates)
>

Wait, we already have a mechanism to handle it: replica identity.  What is the
advantage of this proposal in comparison with replica identity?

It seems a Postgres-centric solution that you didn't provide strong arguments
in favor of it. How would logical replication take advantage of such change? If
that's the case, share the pgoutput and logical replication changes.


-- 
Euler Taveira
EDB   https://www.enterprisedb.com/



Re: making tid and HOTness of UPDATE available to logical decoding plugins

От
Hannu Krosing
Дата:
On Fri, Dec 5, 2025 at 3:58 PM Euler Taveira <euler@eulerto.com> wrote:
>
> On Thu, Dec 4, 2025, at 5:58 PM, Hannu Krosing wrote:
> > Please find attached a patch that makes tuple ids and info about
> > weather it was plain or HOT update available to logical decoding
> > callbacks.
> >
>
> My first impression was: why do you want to expose an internal information that
> is mostly useless for a broader audience? The logical decoding infrastructure
> is a general purpose solution for streaming modifications made to Postgres.
> Could you elaborate how other consumers (DBMS, data store, ...) would use it?

One "other consumer" that came up was possibility to use logical
decoding for collecting changes for CREATE INDEX CONCURRENTLY so there
would be no need for 2nd pass of CIC to scan the whole table again.

I understand that there already is an ongoing work to do this with a
specialized collector, but that involved some other ugliness like
having to use a specialized logging index acces methods.

And tracking changes for other CONCURRENTLY operations, like table
repack, could also benefit from having ctid and hotness info.

> > My planned use case is for reliable logical replication of tables
> > without primary key or other declared IDENTITY (as long as there are
> > no updates on target, or at leas no non-hot updates)
> >
>
> Wait, we already have a mechanism to handle it: replica identity.  What is the
> advantage of this proposal in comparison with replica identity?

Replica identity full can become a quite heavyweight operation if you
just want to set up logical replication but your table has no primary
key but still has occasional updates

If all you want to do is to be able to replicate UPDATEs and DELETEs
then having to save full tuple data in WAL seems excessive.

> It seems a Postgres-centric solution that you didn't provide strong arguments
> in favor of it. How would logical replication take advantage of such change? If
> that's the case, share the pgoutput and logical replication changes.

Having though about the issue for quite some time I suddenly
discovered, that while ctid can not be used as a permanent enough
unique id for foreign keys or anything external, it is unique at any
moment in time making it very much sufficient for logical replication.

The high-level idea is to store the source (publisher) ctid value in
an extra column for sorce_ctid in the target (subscriber) table, that
column will also have a unique index and is of course NOT NULL (as
there can be by definition no row without a ctid) so it will form kind
of "replication primary key".

During CDC replay phase each change is sent with ctid (or two in case
of UPDATE) and the replay works as it currently does with the addition
of sorce ctid being stored in sorce_ctid column on the target.

And because UPDATEalso updates the source_ctid colum on target the
"replication primary key" stays nicely in sync.

Of course a manual update in the target database could break
replication , but this is no different than IDENTITY FULL. or for that
matter any other IDENTITY.

So the PoC I am working on will

- add a "materialised sorce ctid" column to target table, defined as
"source_ctid tid NOT NULL UNIQUE"
- initial copy will copy over `SELECT *, ctid as source_ctid FROM ...`
- replication decoding plugin will include actual ctid(s) in change records

For the above PoC the replay part needs no changes beyond knowing that
source_ctid is the identity column


PoC phase 2 will be more complex and will introduce the "index-only
source_ctid column" to avoid bloating the table by storing source
ctids there if the sole purpose of the replication is migrating the
database, But more on this once I have the basic PoC working :)

--
Cheers
Hannu

P.S: I am also mulling over an idea of adding semi-virtual GENERATED
ALWAYS AS ROW IDENTITY where the ROW identity starts as bigint cast of
actual ctid and gets materialized only on (non-HOT) update. This does
not need this logical decoding patch, but as it is closely related I
mention it here as well.
ROW IDENTITY has two big advantages over other identity types for
mostly static tables -
a) identity column takes up no extra space and
b) it allows super fast direct lookups without needing an index at all
for fully write-only tables or a quick index lookup in a tiny index to
check that the ROWID is not there and then direct lookup by ctid.

--
Hannu