Обсуждение: inconsistent tableoid handling in COPY WHERE clause

Поиск
Список
Период
Сортировка

inconsistent tableoid handling in COPY WHERE clause

От
Peter Eisentraut
Дата:
I think the COPY FROM WHERE clause is handling the tableoid column in a 
way that is inconsistent with its usual definition.

Consider a partitioning hierarchy like

DROP TABLE IF EXISTS xp;

CREATE TABLE xp (a int, b int) PARTITION BY LIST (a);
CREATE TABLE xp1 PARTITION OF xp FOR VALUES IN (1);
CREATE TABLE xp2 PARTITION OF xp FOR VALUES IN (2);
CREATE TABLE xp3 PARTITION OF xp FOR VALUES IN (3);

Then you can use tableoid in a trigger to reveal the actual partition 
that an inserted row ended up in:

CREATE OR REPLACE FUNCTION tf() RETURNS TRIGGER LANGUAGE plpgsql
AS
$$
BEGIN
   RAISE NOTICE 'new.tableoid = %', NEW.tableoid;
   RETURN NEW;
END
$$;

CREATE TRIGGER tg1 AFTER INSERT ON xp FOR EACH ROW EXECUTE FUNCTION tf();

INSERT INTO xp VALUES (1, 11), (2, 22);

You can also write a check constraint that references tableoid to check 
what partition a row ends up in:

ALTER TABLE xp ADD CONSTRAINT xpc
   CHECK (tableoid IN ('xp1'::regclass, 'xp2'::regclass));

INSERT INTO xp VALUES (3, 33);  -- error: violates check constraint

So far so good.

You can also refer to tableoid in the WHERE clause of a COPY command, 
but that doesn't work correctly:

COPY xp FROM STDIN WHERE tableoid = 'xp2'::regclass;
1    111
2    222
\.

I would have expected that to copy only rows that are targeted for the 
xp2 partition, but in fact it does not copy anything.

This works:

COPY xp FROM STDIN WHERE tableoid = 'xp'::regclass;
1    111
2    222
\.

because tableoid in fact refers to the table named in the command, not 
the actual target partition.

That seems incorrect to me.

I don't see this documented one way or another, but there is a code 
comment in copyfrom.c at least mentioning tableoid in the context of the 
COPY WHERE clause:

     /*
      * Constraints and where clause might reference the tableoid column,
      * so (re-)initialize tts_tableOid before evaluating them.
      */
     myslot->tts_tableOid = 
RelationGetRelid(target_resultRelInfo->ri_RelationDesc);

This comment appeared in a not-quite-related refactoring (commit 
86b85044e82), it only said "Constraints might ..." beforehand.

Even the "Constraints might ..." variant of this is dubious, since as 
shown above, check constraints do have partition awareness, and there 
are places in nodeModifyTable.c that initialize tts_tableOid for that 
purpose.  Are there other constraints where tableoid can be used (and 
where this way of initializing it doesn't result in wrong behavior)?

I suggest that we should prohibit using tableoid in COPY WHERE clauses 
for the time being.  I don't know if there would be a way to make them 
work correctly at all, but most likely not in a backpatchable way anyway.

I also suggest that the above piece of code assigning tts_tableOid 
should be changed somehow.  Maybe just delete it, or set it to 
InvalidOid, because as it is it's misleading and probably wrong.




Re: inconsistent tableoid handling in COPY WHERE clause

От
Kirill Reshke
Дата:
On Wed, 5 Nov 2025 at 16:06, Peter Eisentraut <peter@eisentraut.org> wrote:
>
> I think the COPY FROM WHERE clause is handling the tableoid column in a
> way that is inconsistent with its usual definition.
>
> Consider a partitioning hierarchy like
>
> DROP TABLE IF EXISTS xp;
>
> CREATE TABLE xp (a int, b int) PARTITION BY LIST (a);
> CREATE TABLE xp1 PARTITION OF xp FOR VALUES IN (1);
> CREATE TABLE xp2 PARTITION OF xp FOR VALUES IN (2);
> CREATE TABLE xp3 PARTITION OF xp FOR VALUES IN (3);
>
> Then you can use tableoid in a trigger to reveal the actual partition
> that an inserted row ended up in:
>
> CREATE OR REPLACE FUNCTION tf() RETURNS TRIGGER LANGUAGE plpgsql
> AS
> $$
> BEGIN
>    RAISE NOTICE 'new.tableoid = %', NEW.tableoid;
>    RETURN NEW;
> END
> $$;
>
> CREATE TRIGGER tg1 AFTER INSERT ON xp FOR EACH ROW EXECUTE FUNCTION tf();
>
> INSERT INTO xp VALUES (1, 11), (2, 22);
>
> You can also write a check constraint that references tableoid to check
> what partition a row ends up in:
>
> ALTER TABLE xp ADD CONSTRAINT xpc
>    CHECK (tableoid IN ('xp1'::regclass, 'xp2'::regclass));
>
> INSERT INTO xp VALUES (3, 33);  -- error: violates check constraint
>
> So far so good.
>
> You can also refer to tableoid in the WHERE clause of a COPY command,
> but that doesn't work correctly:
>
> COPY xp FROM STDIN WHERE tableoid = 'xp2'::regclass;
> 1       111
> 2       222
> \.
>
> I would have expected that to copy only rows that are targeted for the
> xp2 partition, but in fact it does not copy anything.
>
> This works:
>
> COPY xp FROM STDIN WHERE tableoid = 'xp'::regclass;
> 1       111
> 2       222
> \.
>
> because tableoid in fact refers to the table named in the command, not
> the actual target partition.
>
> That seems incorrect to me.
>
> I don't see this documented one way or another, but there is a code
> comment in copyfrom.c at least mentioning tableoid in the context of the
> COPY WHERE clause:
>
>      /*
>       * Constraints and where clause might reference the tableoid column,
>       * so (re-)initialize tts_tableOid before evaluating them.
>       */
>      myslot->tts_tableOid =
> RelationGetRelid(target_resultRelInfo->ri_RelationDesc);
>
> This comment appeared in a not-quite-related refactoring (commit
> 86b85044e82), it only said "Constraints might ..." beforehand.
>
> Even the "Constraints might ..." variant of this is dubious, since as
> shown above, check constraints do have partition awareness, and there
> are places in nodeModifyTable.c that initialize tts_tableOid for that
> purpose.  Are there other constraints where tableoid can be used (and
> where this way of initializing it doesn't result in wrong behavior)?
>
> I suggest that we should prohibit using tableoid in COPY WHERE clauses
> for the time being.  I don't know if there would be a way to make them
> work correctly at all, but most likely not in a backpatchable way anyway.
>
> I also suggest that the above piece of code assigning tts_tableOid
> should be changed somehow.  Maybe just delete it, or set it to
> InvalidOid, because as it is it's misleading and probably wrong.
>
>
>

Hi!
Looks like this issue is currently discussed at [0]. Should we
continue here or there?



[0] https://www.postgresql.org/message-id/CACJufxHGGc25a2m%2B3Dezfctuykn51n5k3FJK6w3KSqfSFc5gvQ%40mail.gmail.com

-- 
Best regards,
Kirill Reshke



Re: inconsistent tableoid handling in COPY WHERE clause

От
Peter Eisentraut
Дата:
On 05.11.25 12:13, Kirill Reshke wrote:
> Looks like this issue is currently discussed at [0]. Should we
> continue here or there?
> 
> [0]https://www.postgresql.org/message-id/ 
> CACJufxHGGc25a2m%2B3Dezfctuykn51n5k3FJK6w3KSqfSFc5gvQ%40mail.gmail.com

I'm aware of that thread.  I started this new thread to get more focus 
on this particular sub-issue.




Re: inconsistent tableoid handling in COPY WHERE clause

От
Masahiko Sawada
Дата:
On Wed, Nov 5, 2025 at 3:06 AM Peter Eisentraut <peter@eisentraut.org> wrote:
>
> I think the COPY FROM WHERE clause is handling the tableoid column in a
> way that is inconsistent with its usual definition.
>
> Consider a partitioning hierarchy like
>
> DROP TABLE IF EXISTS xp;
>
> CREATE TABLE xp (a int, b int) PARTITION BY LIST (a);
> CREATE TABLE xp1 PARTITION OF xp FOR VALUES IN (1);
> CREATE TABLE xp2 PARTITION OF xp FOR VALUES IN (2);
> CREATE TABLE xp3 PARTITION OF xp FOR VALUES IN (3);
>
> Then you can use tableoid in a trigger to reveal the actual partition
> that an inserted row ended up in:
>
> CREATE OR REPLACE FUNCTION tf() RETURNS TRIGGER LANGUAGE plpgsql
> AS
> $$
> BEGIN
>    RAISE NOTICE 'new.tableoid = %', NEW.tableoid;
>    RETURN NEW;
> END
> $$;
>
> CREATE TRIGGER tg1 AFTER INSERT ON xp FOR EACH ROW EXECUTE FUNCTION tf();
>
> INSERT INTO xp VALUES (1, 11), (2, 22);
>
> You can also write a check constraint that references tableoid to check
> what partition a row ends up in:
>
> ALTER TABLE xp ADD CONSTRAINT xpc
>    CHECK (tableoid IN ('xp1'::regclass, 'xp2'::regclass));
>
> INSERT INTO xp VALUES (3, 33);  -- error: violates check constraint
>
> So far so good.
>
> You can also refer to tableoid in the WHERE clause of a COPY command,
> but that doesn't work correctly:
>
> COPY xp FROM STDIN WHERE tableoid = 'xp2'::regclass;
> 1       111
> 2       222
> \.
>
> I would have expected that to copy only rows that are targeted for the
> xp2 partition, but in fact it does not copy anything.
>
> This works:
>
> COPY xp FROM STDIN WHERE tableoid = 'xp'::regclass;
> 1       111
> 2       222
> \.
>
> because tableoid in fact refers to the table named in the command, not
> the actual target partition.
>
> That seems incorrect to me.
>
> I don't see this documented one way or another, but there is a code
> comment in copyfrom.c at least mentioning tableoid in the context of the
> COPY WHERE clause:
>
>      /*
>       * Constraints and where clause might reference the tableoid column,
>       * so (re-)initialize tts_tableOid before evaluating them.
>       */
>      myslot->tts_tableOid =
> RelationGetRelid(target_resultRelInfo->ri_RelationDesc);
>
> This comment appeared in a not-quite-related refactoring (commit
> 86b85044e82), it only said "Constraints might ..." beforehand.
>
> Even the "Constraints might ..." variant of this is dubious, since as
> shown above, check constraints do have partition awareness, and there
> are places in nodeModifyTable.c that initialize tts_tableOid for that
> purpose.  Are there other constraints where tableoid can be used (and
> where this way of initializing it doesn't result in wrong behavior)?

FYI we can specify system columns in the WITH CHECK clause in CREATE
POLICY, and it seems to work fine (COPY FROM is not supported with RLS
so I tested using with).

>
> I suggest that we should prohibit using tableoid in COPY WHERE clauses
> for the time being.  I don't know if there would be a way to make them
> work correctly at all, but most likely not in a backpatchable way anyway.
>
> I also suggest that the above piece of code assigning tts_tableOid
> should be changed somehow.  Maybe just delete it, or set it to
> InvalidOid, because as it is it's misleading and probably wrong.

Probably should we filter rows by WHERE clause after determining the
partition to insert the tuple into? Currently, after getting a row
from NextCopyFrom(), we check the WHERE clause and then determine the
partition to insert the tuple into if the partitioned table is
specified in the COPY FROM. Then, we set the slot's tableoid to the
right leaf partition's oid:

               /* ensure that triggers etc see the right relation  */
               myslot->tts_tableOid =
RelationGetRelid(resultRelInfo->ri_RelationDesc);

Therefore, triggers and constraints can refer to the right tableoid.
For example, I tested it with the following check constraint:

ALTER TABLE xp ADD CONSTRAINT xpc CHECK (tableoid IN ('xp1'::regclass,
'xp2'::regclass, 'xp'::regclass));

This filters out the row as the tableoid in the where clause refers to
the xp table's oid:

copy xp from stdin where tableoid = 'xp3'::regclass;
3    333
\.

But this raises the check constraint violation error as the tuple
slot's tableoid now refers to xp3 table's oid:

copy xp from stdin;
3    333
\.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: inconsistent tableoid handling in COPY WHERE clause

От
jian he
Дата:
On Thu, Nov 6, 2025 at 3:44 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> >
> > I suggest that we should prohibit using tableoid in COPY WHERE clauses
> > for the time being.  I don't know if there would be a way to make them
> > work correctly at all, but most likely not in a backpatchable way anyway.
> >
> > I also suggest that the above piece of code assigning tts_tableOid
> > should be changed somehow.  Maybe just delete it, or set it to
> > InvalidOid, because as it is it's misleading and probably wrong.
>
> Probably should we filter rows by WHERE clause after determining the
> partition to insert the tuple into? Currently, after getting a row
> from NextCopyFrom(), we check the WHERE clause and then determine the
> partition to insert the tuple into if the partitioned table is
> specified in the COPY FROM. Then, we set the slot's tableoid to the
> right leaf partition's oid:

I think it is doable.

BEFORE ROW TRIGGER no need to correct slot->tts_tableOid information.
see ExecInsert:
at the beginning of ExecInsert, slot->tts_tableOid == 0
ExecMaterializeSlot won't touch the Tableoid field.
ExecBRInsertTriggers happen before explicitly setting slot->tts_tableOid.

ExecBRInsertTriggers->ExecFetchSlotHeapTuple->tts_virtual_copy_heap_tuple->heap_form_tuple
will set (TriggerData->tg_trigtuple) t_tableOid as InvalidOid
trigger execution function plpgsql_exec_trigger only use
TriggerData->tg_trigtuple, not using TriggerData->tg_trigslot.
That means during ExecBRInsertTriggers execution, the t_tableOid is InvalidOid.

exec_assign_value  have "cannot assign to system column" ERROR check
forbidden us change system column.

CopyMultiInsertInfoFlush->CopyMultiInsertBufferFlush->table_multi_insert
will set the proper tableoid for the slot.
That mean don't need to worry about AFTER ROW TRIGGER,

We can place filtering rows by WHERE clause logic right below ExecFindPartition.
Please check the attached patch.


--
jian
https://www.enterprisedb.com

Вложения