Обсуждение: COPY WHERE clause generated/system column reference
hi. CREATE TABLE gtest0 (a int, b int GENERATED ALWAYS AS (a + 1) VIRTUAL); copy gtest0 from stdin where (b <> 1); 0 \. ERROR: unexpected virtual generated column reference CONTEXT: COPY gtest0, line 1: "0" We need to apply expand_generated_columns_in_expr to the whereClause in DoCopy. However, handling STORED generated columns appears to be less straightforward. currently: ExecQual(cstate->qualexpr, econtext)) happen before ExecComputeStoredGenerated. when calling ExecQual, the stored generated column values have not been populated yet. so we may need ExecComputeStoredGenerated beforehand. Since ExecComputeStoredGenerated is likely expensive, I added logic to detect whether the WHERE clause actually have stored generated columns reference before calling it. generated column allow tableoid system column reference, COPY WHERE clause also allow tableoid column reference, should be fine. please check the attached file: v1-0001 fix COPY WHERE with system column reference v1-0002 fix COPY WHERE with generated column reference
Вложения
On Mon, 27 Oct 2025 at 13:21, jian he <jian.universality@gmail.com> wrote: > > hi. > > CREATE TABLE gtest0 (a int, b int GENERATED ALWAYS AS (a + 1) VIRTUAL); > copy gtest0 from stdin where (b <> 1); > 0 > \. > > ERROR: unexpected virtual generated column reference > CONTEXT: COPY gtest0, line 1: "0" > > We need to apply expand_generated_columns_in_expr to the whereClause in DoCopy. > However, handling STORED generated columns appears to be less straightforward. > > currently: > ExecQual(cstate->qualexpr, econtext)) > happen before > ExecComputeStoredGenerated. > > when calling ExecQual, the stored generated column values have not been > populated yet. so we may need ExecComputeStoredGenerated beforehand. > Since ExecComputeStoredGenerated is likely expensive, I added logic to detect > whether the WHERE clause actually have stored generated columns reference before > calling it. > > generated column allow tableoid system column reference, COPY WHERE clause also > allow tableoid column reference, should be fine. > > please check the attached file: > v1-0001 fix COPY WHERE with system column reference > v1-0002 fix COPY WHERE with generated column reference Hi! Indeed, copying from with generated column in where clause is broken on HEAD. I applied your patches, they indeed fix the issue. Small comment: in 0002: > + if (has_stored_generated) > + ExecComputeStoredGenerated(resultRelInfo, estate, myslot, > + CMD_INSERT); Should we use CMD_UTILITY here? Comment in nodes.h suggests so. Also, ExecComputeStoredGenerated only check for equality with CMD_UPDATE, so this is just a cosmetic change. -- Best regards, Kirill Reshke
On Tue, Oct 28, 2025 at 2:02 AM Kirill Reshke <reshkekirill@gmail.com> wrote:
>
> Small comment: in 0002:
>
> > + if (has_stored_generated)
> > + ExecComputeStoredGenerated(resultRelInfo, estate, myslot,
> > +   CMD_INSERT);
>
> Should we use CMD_UTILITY here? Comment in nodes.h suggests so. Also,
> ExecComputeStoredGenerated only check for equality with CMD_UPDATE, so
> this is just a cosmetic change.
>
hi.
use CMD_UTILITY will also work as expected.
ExecComputeStoredGenerated expects the command type (cmdtype) to be either
UPDATE or INSERT.
in ExecComputeStoredGenerated, we have:
    if (cmdtype == CMD_UPDATE)
    else
    {
        if (resultRelInfo->ri_GeneratedExprsI == NULL)
            ExecInitGenerated(resultRelInfo, estate, cmdtype);
        /* Early exit is impossible given the prior Assert */
        Assert(resultRelInfo->ri_NumGeneratedNeededI > 0);
        ri_GeneratedExprs = resultRelInfo->ri_GeneratedExprsI;
    }
in struct ResultRelInfo also has comments like:
    /*
     * Arrays of stored generated columns ExprStates for INSERT/UPDATE/MERGE.
     */
    ExprState **ri_GeneratedExprsI;
    ExprState **ri_GeneratedExprsU;
I think using CMD_INSERT should be fine. Also, note that below
ExecComputeStoredGenerated uses CMD_INSERT too.
			
		On Mon, Oct 27, 2025 at 1:21 AM jian he <jian.universality@gmail.com> wrote:
>
> hi.
>
> CREATE TABLE gtest0 (a int, b int GENERATED ALWAYS AS (a + 1) VIRTUAL);
> copy gtest0 from stdin where (b <> 1);
> 0
> \.
>
> ERROR:  unexpected virtual generated column reference
> CONTEXT:  COPY gtest0, line 1: "0"
>
> We need to apply expand_generated_columns_in_expr to the whereClause in DoCopy.
> However, handling STORED generated columns appears to be less straightforward.
>
> currently:
> ExecQual(cstate->qualexpr, econtext))
> happen before
> ExecComputeStoredGenerated.
>
> when calling ExecQual, the stored generated column values have not been
> populated yet. so we may need ExecComputeStoredGenerated beforehand.
> Since ExecComputeStoredGenerated is likely expensive, I added logic to detect
> whether the WHERE clause actually have stored generated columns reference before
> calling it.
While I agree we can improve the error message in that case as the
message "unexpected virtual generated column reference" isn't helpful
much, I'm not sure that generated column values should be considered
when filtering rows by the WHERE clause. The documentation[1] says:
A stored generated column is computed when it is written (inserted or
updated) and occupies storage as if it were a normal column. A virtual
generated column occupies no storage and is computed when it is read.
The proposed patch (the 0002 patch) allows COPY FROM ... WHERE to
filter rows by checking tuples including generated column values but
it's somewhat odd as it seems not to be the time of reading tuples
from a table.
Also, the patch calls ExecComputeStoredGenerated() before ExecQual(),
which is also before we trigger the BEFORE INSERT trigger. It clearly
violates what the documentation describes[1]:
Generated columns are, conceptually, updated after BEFORE triggers
have run. Therefore, changes made to base columns in a BEFORE trigger
will be reflected in generated columns. But conversely, it is not
allowed to access generated columns in BEFORE triggers.
For example, the tuples passed to a BEFORE INSERT trigger varies
depending on the WHERE clause as follows:
-- preparation
create table t (a int, s int generated always as (a + 10) stored);
create table tt (a int, s int);
create function trig_fn() returns trigger as
$$
begin
    insert into tt select NEW.*;
    return NEW;
end;
$$ language plpgsql;
create trigger trig before insert on t for each row execute function trig_fn();
-- copy a row without the WHERE clause.
copy t from program 'echo 1';
table tt;
 a | s
---+---
 1 |
(1 row)
-- copy a row with the where clause
copy t from program 'echo 1' where s > 0;
table tt;
 a | s
---+----
 1 |
 1 | 11
> generated column allow tableoid system column reference, COPY WHERE clause also
> allow tableoid column reference, should be fine.
>
> please check the attached file:
> v1-0001 fix COPY WHERE with system column reference
It seems to make sense to disallow users to specify system columns in
the WHERE clause of COPY FROM. But why do we need to have an exception
for tableoid? In the context of COPY FROM, specifying tableoid doesn't
not make sense to me as tuples don't come from any relations. If we
accept tableoid, I think it's better to explain why here.
Regards,
[1] https://www.postgresql.org/docs/devel/ddl-generated-columns.html#DDL-GENERATED-COLUMNS
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
			
		On Tue, Nov 4, 2025 at 8:27 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> The proposed patch (the 0002 patch) allows COPY FROM ... WHERE to
> filter rows by checking tuples including generated column values but
> it's somewhat odd as it seems not to be the time of reading tuples
> from a table.
>
> Also, the patch calls ExecComputeStoredGenerated() before ExecQual(),
> which is also before we trigger the BEFORE INSERT trigger. It clearly
> violates what the documentation describes[1]:
>
> For example, the tuples passed to a BEFORE INSERT trigger varies
> depending on the WHERE clause as follows:
>
> -- preparation
> create table t (a int, s int generated always as (a + 10) stored);
> create table tt (a int, s int);
> create function trig_fn() returns trigger as
> $$
> begin
>     insert into tt select NEW.*;
>     return NEW;
> end;
> $$ language plpgsql;
> create trigger trig before insert on t for each row execute function trig_fn();
>
> -- copy a row without the WHERE clause.
> copy t from program 'echo 1';
> table tt;
>  a | s
> ---+---
>  1 |
> (1 row)
>
> -- copy a row with the where clause
> copy t from program 'echo 1' where s > 0;
> table tt;
>  a | s
> ---+----
>  1 |
>  1 | 11
>
> > generated column allow tableoid system column reference, COPY WHERE clause also
> > allow tableoid column reference, should be fine.
> >
for virtual generated column, adding
``whereClause = expand_generated_columns_in_expr(whereClause, rel, 1);``
should be able to solve the problem.
For stored generated columns, we can either
A. document that the stored generated column is not yet computed, it
will be NULL
B. error out if the WHERE clause has a stored generated column.
C. add a temp slot and the computed stored generated column value
stored in the temp slot.
attached v2-0003 using option C to address this problem.
> > please check the attached file:
> > v1-0001 fix COPY WHERE with system column reference
>
> It seems to make sense to disallow users to specify system columns in
> the WHERE clause of COPY FROM. But why do we need to have an exception
> for tableoid? In the context of COPY FROM, specifying tableoid doesn't
> not make sense to me as tuples don't come from any relations. If we
> accept tableoid, I think it's better to explain why here.
>
In function CopyFrom, we have below comment, which indicates
At that time, tableoid was considered in the WHERE clause.
        /*
         * Constraints and where clause might reference the tableoid column,
         * so (re-)initialize tts_tableOid before evaluating them.
         */
        myslot->tts_tableOid =
RelationGetRelid(target_resultRelInfo->ri_RelationDesc);
Another possible reason:
tableoid can be referenced in virtual generated column expression.
COPY WHERE clause can be supported for virtual general columns.
CREATE TABLE gtest4 (a int, b oid GENERATED ALWAYS AS ((tableoid)));
COPY gtest4 from stdin where b <> 26420;
COPY gtest4 from stdin where tableoid <> 26420;
we should expect the above two COPY statements behave the same.
please check the attached file:
v2-0001: fix COPY WHERE with system column reference
v2-0002: fix COPY WHERE with virtual generated column reference
v2-0003: fix COPY WHERE with stored generated column reference (experimental)