Discussion: Segfault in RI UPDATE CASCADE on partitioned tables with LIKE+ATTACH child (attnum drift)

Hello,

We're seeing a backend segfault when an ON UPDATE CASCADE fires across partitions, if the destination partition was created via CREATE TABLE … LIKE + ATTACH (so its physical tuple descriptor differs from the parent's due to dropped-column tombstones / attnum drift). Column names and types match by inspection, but the crash occurs during tuple materialization in the RI trigger execution.
Minimal Reproducer (self-contained)

DROP SCHEMA IF EXISTS t CASCADE;
CREATE SCHEMA t;

-- Pipelines (partitioned)
CREATE TABLE t.pipelines (
  partition_id int NOT NULL,
  id           bigint NOT NULL,
  PRIMARY KEY (partition_id, id)
) PARTITION BY LIST (partition_id);
CREATE TABLE t.pipelines_102 PARTITION OF t.pipelines FOR VALUES IN (102);
CREATE TABLE t.pipelines_50  PARTITION OF t.pipelines FOR VALUES IN (50);

-- Stages (partitioned) with ON UPDATE CASCADE to pipelines.
-- Create a mid column and drop it to leave a tombstone gap in attnums.
CREATE TABLE t.stages (
  partition_id int NOT NULL,
  id           bigint NOT NULL,
  tmp_mid      int,               -- dropped below, leaves parent attnum gap
  pipeline_id  bigint NOT NULL,
  name         text,
  status       int,
  PRIMARY KEY (partition_id, id),
  FOREIGN KEY (partition_id, pipeline_id)
    REFERENCES t.pipelines(partition_id, id)
    ON UPDATE CASCADE ON DELETE CASCADE
) PARTITION BY LIST (partition_id);
CREATE TABLE t.stages_102 PARTITION OF t.stages FOR VALUES IN (102);
ALTER TABLE t.stages DROP COLUMN tmp_mid;

-- Miscreate destination stage partition via LIKE + ATTACH (no tombstone, different attnums).
CREATE TABLE t.stages_50_like (LIKE t.stages INCLUDING DEFAULTS);
ALTER TABLE t.stages ATTACH PARTITION t.stages_50_like FOR VALUES IN (50);

-- Builds (partitioned), cascades to both stages and pipelines.
CREATE TABLE t.builds (
  partition_id int NOT NULL,
  id           bigint NOT NULL,
  stage_id     bigint NOT NULL,
  commit_id    bigint NOT NULL,
  PRIMARY KEY (partition_id, id),
  FOREIGN KEY (partition_id, stage_id)
    REFERENCES t.stages(partition_id, id)
    ON UPDATE CASCADE ON DELETE CASCADE,
  FOREIGN KEY (partition_id, commit_id)
    REFERENCES t.pipelines(partition_id, id)
    ON UPDATE CASCADE ON DELETE CASCADE
) PARTITION BY LIST (partition_id);
CREATE TABLE t.builds_102 PARTITION OF t.builds FOR VALUES IN (102);
CREATE TABLE t.builds_50  PARTITION OF t.builds FOR VALUES IN (50);

-- Seed rows in source partition 102.
INSERT INTO t.pipelines_102(partition_id, id) VALUES (102, 1);
INSERT INTO t.stages_102   (partition_id, id, pipeline_id, name, status)
VALUES (102, 10, 1, 's', 0);
INSERT INTO t.builds_102   (partition_id, id, stage_id, commit_id)
VALUES (102, 100, 10, 1);

-- Crash repro: cascaded UPDATE across partitions
UPDATE t.pipelines
SET partition_id = 50
WHERE partition_id = 102 AND id = 1;


server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.

Postgres logs:
2025-10-16 09:23:08.535 UTC [18630] LOG:  client backend (PID 18673) was terminated by signal 11: Segmentation fault
2025-10-16 09:23:08.535 UTC [18630] DETAIL:  Failed process was running: UPDATE t.pipelines
        SET partition_id = 50
        WHERE partition_id = 102 AND id = 1;

Environment
  • PostgreSQL:
postgres=# SHOW server_version;
 server_version
----------------
 18.0
(1 row)

postgres=# SHOW server_version_num;
 server_version_num
--------------------
 180000
(1 row)

postgres=# SELECT version();
                                                 version
----------------------------------------------------------------------------------------------------------
 PostgreSQL 18.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.5.0 20240719 (Red Hat 11.5.0-5), 64-bit
(1 row)

postgres=# SHOW shared_preload_libraries;
 shared_preload_libraries
--------------------------

(1 row)

postgres=# SELECT extname, extversion FROM pg_extension ORDER BY 1;
 extname | extversion
---------+------------
 plpgsql | 1.0
(1 row)


  • OS/Kernel/Libc:
[root@postgres-source ~]# uname -a
Linux postgres-source 5.14.0-427.22.1.el9_4.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Jun 19 04:14:38 PDT 2024 x86_64 x86_64 x86_64 GNU/Linux
[root@postgres-source ~]# cat /etc/os-release
NAME="Oracle Linux Server"
VERSION="9.3"
ID="ol"
ID_LIKE="fedora"
VARIANT="Server"
VARIANT_ID="server"
VERSION_ID="9.3"
PLATFORM_ID="platform:el9"
PRETTY_NAME="Oracle Linux Server 9.3"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:oracle:linux:9:3:server"
HOME_URL="https://linux.oracle.com/"
BUG_REPORT_URL="https://github.com/oracle/oracle-linux"

ORACLE_BUGZILLA_PRODUCT="Oracle Linux 9"
ORACLE_BUGZILLA_PRODUCT_VERSION=9.3
ORACLE_SUPPORT_PRODUCT="Oracle Linux"
ORACLE_SUPPORT_PRODUCT_VERSION=9.3
[root@postgres-source ~]# ldd --version
ldd (GNU libc) 2.34
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Roland McGrath and Ulrich Drepper.

Backtrace is in the attachment.

The issue is reproducible at least on Postgres 16, 17, and 18.

Please let me know if I should provide any other information.

BR Dmitry


Attachments
On Fri, 17 Oct 2025 at 10:53, Dmitry Fomin <fomin.list@gmail.com> wrote:
> -- Crash repro: cascaded UPDATE across partitions
> UPDATE t.pipelines
> SET partition_id = 50
> WHERE partition_id = 102 AND id = 1;
>
>
> server closed the connection unexpectedly

Thanks for the detailed report. It seems to have been caused by
ba9a7e3921. For some reason the ResultRelInfo.ri_RootResultRelInfo
isn't set for this partition which causes ExecGetChildToRootMap() to
think no translation is required.

More to come...

David



On Fri, 17 Oct 2025 at 14:21, David Rowley <dgrowleyml@gmail.com> wrote:
> Thanks for the detailed report. It seems to have been caused by
> ba9a7e3921. For some reason the ResultRelInfo.ri_RootResultRelInfo
> isn't set for this partition which causes ExecGetChildToRootMap() to
> think no translation is required.

The problem is with the ResultRelInfo caching that's in
ExecGetTriggerResultRel(). The code there tries looking into the
estate's es_opened_result_relations, es_tuple_routing_result_relations
and es_trig_target_relations Lists to see if the ResultRelInfo was
created before. In the problem case the ResultRelInfo is found in
es_trig_target_relations and unfortunately it's been set up by some
code in afterTriggerInvokeEvents() which passes a NULL rootRelInfo.
This means when ExecGetTriggerResultRel() is called again this time
passing the correct rootRelInfo, the cached one that has the NULL
ri_RootResultRelInfo is found and returned.

This results in the ExecGetChildToRootMap() code doing the wrong thing
because it sees a NULL ri_RootResultRelInfo and therefore does not
translate the slot into the tuple format of the partitioned table.

I've attached a patch which fixes the problem. I'm just not sure if
it's the right fix for the problem. I suspect the real problem is down
to the fact that ExecGetTriggerResultRel() passes a NULL rootRelInfo
in the first place. I just don't see a good way to figure out what the
parent table should be so we know to create a parent ResultRelInfo as
the trigger that is firing is for the partition, not the partitioned
table. I don't see any way to figure out that the trigger is being
fired because it's cascading an update of its parent partitioned
table... At a guess, it feels like there might be some fields missing
in AfterTriggerShared to figure this out.

Any thoughts on this Amit?

David

Attachments
On Fri, Oct 17, 2025 at 6:08 PM David Rowley <dgrowleyml@gmail.com> wrote:
> On Fri, 17 Oct 2025 at 14:21, David Rowley <dgrowleyml@gmail.com> wrote:
> > Thanks for the detailed report. It seems to have been caused by
> > ba9a7e3921. For some reason the ResultRelInfo.ri_RootResultRelInfo
> > isn't set for this partition which causes ExecGetChildToRootMap() to
> > think no translation is required.
>
> The problem is with the ResultRelInfo caching that's in
> ExecGetTriggerResultRel(). The code there tries looking into the
> estate's es_opened_result_relations, es_tuple_routing_result_relations
> and es_trig_target_relations Lists to see if the ResultRelInfo was
> created before. In the problem case the ResultRelInfo is found in
> es_trig_target_relations and unfortunately it's been set up by some
> code in afterTriggerInvokeEvents() which passes a NULL rootRelInfo.
> This means when ExecGetTriggerResultRel() is called again this time
> passing the correct rootRelInfo, the cached one that has the NULL
> ri_RootResultRelInfo is found and returned.
>
> This results in the ExecGetChildToRootMap() code doing the wrong thing
> because it sees a NULL ri_RootResultRelInfo and therefore does not
> translate the slot into the tuple format of the partitioned table.
>
> I've attached a patch which fixes the problem. I'm just not sure if
> it's the right fix for the problem. I suspect the real problem is down
> to the fact that ExecGetTriggerResultRel() passes a NULL rootRelInfo
> in the first place. I just don't see a good way to figure out what the
> parent table should be so we know to create a parent ResultRelInfo as
> the trigger that is firing is for the partition, not the partitioned
> table. I don't see any way to figure out that the trigger is being
> fired because it's cascading an update of its parent partitioned
> table... At a guess, it feels like there might be some fields missing
> in AfterTriggerShared to figure this out.
>
> Any thoughts on this Amit?

Thanks for the off-list heads-up, David.

I'll need to play around with Dmitry's test case a bit before I can be
sure, but I agree with your suspicion -- something's not right if a
result rel without a root rel set ends up in a path where tuple
conversion matters.

--
Thanks, Amit Langote