Обсуждение: More guidance on ctid
Hi, I'm wondering if the explanation of ctid in System Columns should have more guidance - especially as it's used in examples in the UPDATE and DELETE docs. So far I've been caught out by partitions and concurrent updates. This was after I changed primary keys to ctid, in self-joins that get ORDER BY or LIMIT, for UPDATE and DELETE. I'm too unsure about this idea and my knowledge of ctid to propose a patch, but FWIW, here's my attempt: Ctid is useful for removing duplicate rows and efficient self-joins. Be aware that when ctid is used to find a row, only the first concurrent write will be applied. Also note that ctids are not unique across table partitions. I'm learning my way through ctid, and the mailing lists, and am still in the foothills of both. Sorry about the topic-adjacent duplicates in other lists from my ramblings. Thanks, Bernice
On Sun, Nov 23, 2025 at 10:24:42AM +0000, Bernice Southey wrote:
> Hi,
> I'm wondering if the explanation of ctid in System Columns should have
> more guidance - especially as it's used in examples in the UPDATE and
> DELETE docs. So far I've been caught out by partitions and concurrent
> updates. This was after I changed primary keys to ctid, in self-joins
> that get ORDER BY or LIMIT, for UPDATE and DELETE.
>
> I'm too unsure about this idea and my knowledge of ctid to propose a
> patch, but FWIW, here's my attempt:
> Ctid is useful for removing duplicate rows and efficient self-joins.
> Be aware that when ctid is used to find a row, only the first
> concurrent write will be applied. Also note that ctids are not unique
> across table partitions.
>
> I'm learning my way through ctid, and the mailing lists, and am still
> in the foothills of both. Sorry about the topic-adjacent duplicates in
> other lists from my ramblings.
I didn't think we had ctid referenced in example queries, but I now see
it was added in PG 17:
commit 2daeba6a4e4
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: Sun Apr 7 16:26:47 2024 -0400
Doc: show how to get the equivalent of LIMIT for UPDATE/DELETE.
Add examples showing use of a CTE and a self-join to perform
partial UPDATEs and DELETEs.
Corey Huinker, reviewed by Laurenz Albe
Discussion: https://postgr.es/m/CADkLM=caNEQsUwPWnfi2jR4ix99E0EJM_3jtcE-YjnEQC7Rssw@mail.gmail.com
The examples use ctid to show how you can do incremental UPDATEs and
DELETEs, to avoid excessive cleanup/vacuum requirements. While that
certainly is a good idea, it only works because the common table
expression examples, by definition, operate in the same snapshot.
I think the reporter above extrapolated this example to use ctid in
other, non-appropriate cases. Should we add a warning to the docs to
explain that the general use of ctid is discouraged?
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
Do not let urgent matters crowd out time for investment in the future.
Bruce Momjian <bruce@momjian.us> wrote: > I think the reporter above extrapolated this example to use ctid in > other, non-appropriate cases. Should we add a warning to the docs to > explain that the general use of ctid is discouraged? I've been explaining this badly. Here's an example that I hope will make clear why I'm so concerned. (Read Committed default isolation level) drop table if exists b1, b2; create table b1 as select 1 id, 100 balance; create table b2 as select 1 id, 100 balance; With an immutable column self-join, it works as expected and serializes the balance changes. --session1 begin; with x as (select id from b1 order by id for update) update b1 set balance = balance + 1 from x where b1.id = x.id; --session2 with x as (select id from b1 order by id for update) update b1 set balance = balance - 100 from x where b1.id = x.id; select * from b1; --session1 commit; --session2 UPDATE 1 id | balance ----+--------- 1 | 1 (1 row) But with ctid, the second update is lost. --session1 begin; with x as (select ctid from b2 order by id for update) update b2 set balance = balance + 1 from x where b2.ctid = x.ctid; --session2 with x as (select ctid from b2 order by id for update) update b2 set balance = balance - 100 from x where b2.ctid = x.ctid; select * from b2; --session1 commit; --session2 UPDATE 0 id | balance ----+--------- 1 | 101 (1 row) The session2 CTE blocks until it gets the updated version of the row from session 1 commit. It gets b2.ctid = (0,2) giving x.ctid = (0,2). But the UPDATE gets b2.ctid = (0,1) at the start of the transaction. This doesn't change even after session 1 commits. The join fails. This is correct, but not obvious. This is why I think the docs should say something different to what they currently do. The UPDATE example is fine only because it's run repeatedly until there are no more rows to find. I learned the primary key self-join pattern for order by/limit in delete/update soon after I began with postgres. I'm pretty sure ctid would have confused me. So I think the doc examples with an id column will be very helpful, and safer. If users discover ctid on internet forums or LLMs and then check the docs, I think they should get some notice of the concurrency risk. Here's a patch with an attempt to do the above. Forum examples: https://stackoverflow.com/questions/10245560/deadlocks-in-postgresql-when-running-update/71163671#71163671 https://www.reddit.com/r/PostgreSQL/comments/1cn0q1c/comment/l36ppve/ Thanks, Bernice