Thread: Problem while updating a foreign table pointing to a partitioned table on foreign server

Problem while updating a foreign table pointing to a partitioned table on foreign server

From
Ashutosh Bapat
Date:
Hi,
Consider this scenario

postgres=# CREATE TABLE plt (a int, b int) PARTITION BY LIST(a);
postgres=# CREATE TABLE plt_p1 PARTITION OF plt FOR VALUES IN (1);
postgres=# CREATE TABLE plt_p2 PARTITION OF plt FOR VALUES IN (2);
postgres=# INSERT INTO plt VALUES (1, 1), (2, 2);
postgres=# CREATE FOREIGN TABLE fplt (a int, b int) SERVER loopback
OPTIONS (table_name 'plt');
postgres=# SELECT tableoid::regclass, ctid, * FROM fplt;
 tableoid | ctid  | a | b
----------+-------+---+---
 fplt     | (0,1) | 1 | 1
 fplt     | (0,1) | 2 | 2
(2 rows)

-- Need to use random() so that the following update doesn't turn into a direct UPDATE.
postgres=# EXPLAIN (VERBOSE, COSTS OFF)
postgres-# UPDATE fplt SET b = (CASE WHEN random() <= 1 THEN 10 ELSE 20 END) WHERE a = 1;
                                         QUERY PLAN
--------------------------------------------------------------------------------------------
 Update on public.fplt
   Remote SQL: UPDATE public.plt SET b = $2 WHERE ctid = $1
   ->  Foreign Scan on public.fplt
         Output: a, CASE WHEN (random() <= '1'::double precision) THEN 10 ELSE 20 END, ctid
         Remote SQL: SELECT a, ctid FROM public.plt WHERE ((a = 1)) FOR UPDATE
(5 rows)

postgres=# UPDATE fplt SET b = (CASE WHEN random() <= 1 THEN 10 ELSE 20 END) WHERE a = 1;
postgres=# SELECT tableoid::regclass, ctid, * FROM fplt;
 tableoid | ctid  | a | b
----------+-------+---+----
 fplt     | (0,2) | 1 | 10
 fplt     | (0,2) | 2 | 10
(2 rows)

We expect only the one row with a = 1 to be updated, but both rows get
updated. This happens because both rows have ctid = (0, 1) and
that's the only qualification used for UPDATE and DELETE. Thus when a
non-direct UPDATE is run on a foreign table which points to a
partitioned table or inheritance hierarchy on the foreign server, it
will update rows from all child tables which have the same ctids as the
qualifying rows. The same is the case with DELETE.
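
As an illustration only (not code from the thread or from postgres_fdw),
the collision can be modeled in a few lines of C. The struct, the function
names and the OID values 16385/16388 are all hypothetical; the point is
just that a ctid is unique only within one child table, so matching on it
across the whole hierarchy can hit more than one row.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified remote row: tableoid identifies the partition that stores it. */
typedef struct {
    unsigned tableoid;   /* hypothetical OID of plt_p1 or plt_p2 */
    unsigned block;      /* ctid block number */
    unsigned offset;     /* ctid offset within the block */
    int      a;
    int      b;
} RemoteRow;

/*
 * Emulates "UPDATE public.plt SET b = $2 WHERE ctid = $1" issued against the
 * partitioned parent: all partitions are searched, so every row whose ctid
 * matches is modified. Returns the number of rows changed.
 */
int update_by_ctid(RemoteRow *rows, size_t nrows,
                   unsigned block, unsigned offset, int new_b)
{
    int changed = 0;

    for (size_t i = 0; i < nrows; i++) {
        if (rows[i].block == block && rows[i].offset == offset) {
            rows[i].b = new_b;
            changed++;
        }
    }
    return changed;
}

/* Replays the scenario above; returns the number of rows changed. */
int demo_ctid_collision(void)
{
    RemoteRow rows[] = {
        {16385, 0, 1, 1, 1},   /* row stored in plt_p1 */
        {16388, 0, 1, 2, 2},   /* row stored in plt_p2, same ctid (0,1) */
    };
    int changed = update_by_ctid(rows, 2, 0, 1, 10);

    assert(rows[0].b == 10 && rows[1].b == 10);   /* both rows were hit */
    return changed;
}
```

Calling demo_ctid_collision() reports two changed rows even though only
the row with a = 1 was selected, which mirrors the psql output above.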

There are two ways to fix this
1. Use WHERE CURRENT OF with cursors to update rows. This means that
we fetch only one row at a time and update it. This can slow down the
execution drastically.
2. Along with ctid use tableoid as a qualifier i.e. WHERE clause of
UPDATE/DELETE statement has ctid = $1 AND tableoid = $2 as conditions.

Please find attached a patch along the lines of the second approach,
together with test cases. The idea is to inject a tableoid attribute to
be fetched from the foreign server, similar to ctid, and then add it to
the DML statement being constructed.
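
A minimal sketch of why the extra qualifier helps, again as a toy model
rather than the attached patch. The struct, the function names and the
OID values are hypothetical; the filter stands in for a remote statement
with "WHERE ctid = $1 AND tableoid = $2".

```c
#include <assert.h>
#include <stddef.h>

/* Simplified remote row: tableoid identifies the partition that stores it. */
typedef struct {
    unsigned tableoid;   /* hypothetical OID of the child table */
    unsigned block;      /* ctid block number */
    unsigned offset;     /* ctid offset within the block */
    int      b;
} RemoteRow;

/*
 * Emulates an UPDATE qualified by both ctid and tableoid: a row is touched
 * only if it lives in the named child table AND has the given ctid.
 * Returns the number of rows changed.
 */
int update_by_tableoid_and_ctid(RemoteRow *rows, size_t nrows,
                                unsigned tableoid,
                                unsigned block, unsigned offset, int new_b)
{
    int changed = 0;

    for (size_t i = 0; i < nrows; i++) {
        if (rows[i].tableoid == tableoid &&
            rows[i].block == block && rows[i].offset == offset) {
            rows[i].b = new_b;
            changed++;
        }
    }
    return changed;
}

/* Two rows share ctid (0,1) but live in different child tables. */
int demo_tableoid_qualifier(void)
{
    RemoteRow rows[] = {
        {16385, 0, 1, 1},   /* row stored in the first partition */
        {16388, 0, 1, 2},   /* row stored in the second partition */
    };
    int changed = update_by_tableoid_and_ctid(rows, 2, 16385, 0, 1, 10);

    assert(rows[0].b == 10);   /* the intended row changed */
    assert(rows[1].b == 2);    /* the look-alike row is untouched */
    return changed;
}
```

The pair (tableoid, ctid) is unique across the hierarchy in this model,
so exactly one row changes.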

It does fix the problem. But the patch as is interferes with the way
we handle tableoid currently, as can be seen from the regression
diffs that the patch causes. Right now, every tableoid reference gets
converted into the tableoid of the local foreign table (and not the
tableoid of the table on the foreign server). Somehow we need to
differentiate between the tableoid injected for DML and tableoid
references added by the user in the original query, and then use the
tableoid on the foreign server for the first and the local foreign
table's OID for the second. Right now, I don't see a simple way to do
that.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Attachments

Re: Problem while updating a foreign table pointing to a partitioned table on foreign server

From
Kyotaro HORIGUCHI
Date:
Hello.

At Mon, 16 Apr 2018 17:05:28 +0530, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote in
<CAFjFpRfcgwsHRmpvoOK-GUQi-n8MgAS+OxcQo=aBDn1COywmcg@mail.gmail.com>
> Hi,
> Consider this scenario
> 
> postgres=# CREATE TABLE plt (a int, b int) PARTITION BY LIST(a);
> postgres=# CREATE TABLE plt_p1 PARTITION OF plt FOR VALUES IN (1);
> postgres=# CREATE TABLE plt_p2 PARTITION OF plt FOR VALUES IN (2);
> postgres=# INSERT INTO plt VALUES (1, 1), (2, 2);
> postgres=# CREATE FOREIGN TABLE fplt (a int, b int) SERVER loopback
> OPTIONS (table_name 'plt');
> postgres=# SELECT tableoid::regclass, ctid, * FROM fplt;
>  tableoid | ctid  | a | b
> ----------+-------+---+---
>  fplt     | (0,1) | 1 | 1
>  fplt     | (0,1) | 2 | 2
> (2 rows)
> 
> -- Need to use random() so that following update doesn't turn into a
> direct UPDATE.
> postgres=# EXPLAIN (VERBOSE, COSTS OFF)
> postgres-# UPDATE fplt SET b = (CASE WHEN random() <= 1 THEN 10 ELSE
> 20 END) WHERE a = 1;
>                                          QUERY PLAN
> --------------------------------------------------------------------------------------------
>  Update on public.fplt
>    Remote SQL: UPDATE public.plt SET b = $2 WHERE ctid = $1
>    ->  Foreign Scan on public.fplt
>          Output: a, CASE WHEN (random() <= '1'::double precision) THEN
> 10 ELSE 20 END, ctid
>          Remote SQL: SELECT a, ctid FROM public.plt WHERE ((a = 1)) FOR UPDATE
> (5 rows)
> 
> postgres=# UPDATE fplt SET b = (CASE WHEN random() <= 1 THEN 10 ELSE
> 20 END) WHERE a = 1;
> postgres=# SELECT tableoid::regclass, ctid, * FROM fplt;
>  tableoid | ctid  | a | b
> ----------+-------+---+----
>  fplt     | (0,2) | 1 | 10
>  fplt     | (0,2) | 2 | 10
> (2 rows)
> 
> We expect only 1 row with a = 1 to be updated, but both the rows get
> updated. This happens because both the rows has ctid = (0, 1) and
> that's the only qualification used for UPDATE and DELETE. Thus when a
> non-direct UPDATE is run on a foreign table which points to a
> partitioned table or inheritance hierarchy on the foreign server, it
> will update rows from all child table which have ctids same as the
> qualifying rows. Same is the case with DELETE.

Anyway, I think we should warn or error out if one non-direct
update touches two or more tuples in the first place.

=# UPDATE fplt SET b = (CASE WHEN random() <= 1 THEN 10 ELSE 20 END) WHERE a = 1;
ERROR:  updated 2 rows for a tuple identity on the remote end


> There are two ways to fix this
> 1. Use WHERE CURRENT OF with cursors to update rows. This means that
> we fetch only one row at a time and update it. This can slow down the
> execution drastically.
> 2. Along with ctid use tableoid as a qualifier i.e. WHERE clause of
> UPDATE/DELETE statement has ctid = $1 AND tableoid = $2 as conditions.
> 
> PFA patch along the lines of 2nd approach and along with the
> testcases. The idea is to inject tableoid attribute to be fetched from
> the foreign server similar to ctid and then add it to the DML
> statement being constructed.
> 
> It does fix the problem. But the patch as is interferes with the way
> we handle tableoid currently. That can be seen from the regression
> diffs that the patch causes.  RIght now, every tableoid reference gets
> converted into the tableoid of the foreign table (and not the tableoid
> of the foreign table). Somehow we need to differentiate between the
> tableoid injected for DML and tableoid references added by the user in
> the original query and then use tableoid on the foreign server for the
> first and local foreign table's oid for the second. Right now, I don't
> see a simple way to do that.

We cannot add non-system (junk) columns that are not defined among
the foreign table's columns. We could pass tableoid via a side channel,
but we would get a wrong value if the scan does not consist of only one
foreign relation. I don't think adding remote_tableoid to HeapTupleData
is acceptable. Explicitly defining a remote_tableoid column in the
foreign relation might work, but it makes things cumbersome.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index e1c2639fde..7cd31cb6ab 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -1895,6 +1895,13 @@ postgresExecForeignUpdate(EState *estate,
 
     MemoryContextReset(fmstate->temp_cxt);
 
+    /* ERROR if more than one row was updated on the remote end */
+    if (n_rows > 1)
+        ereport(ERROR,
+                (errcode (ERRCODE_FDW_ERROR), /* XXX */
+                 errmsg ("updated %d rows for a tuple identity on the remote end",
+                         n_rows)));
+
     /* Return NULL if nothing was updated on the remote end */
     return (n_rows > 0) ? slot : NULL;
 }

Re: Problem while updating a foreign table pointing to a partitioned table on foreign server

From
Ashutosh Bapat
Date:
On Wed, Apr 18, 2018 at 9:43 AM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
>
> Anyway I think we should warn or error out if one nondirect
> update touches two nor more tuples in the first place.
>
> =# UPDATE fplt SET b = (CASE WHEN random() <= 1 THEN 10 ELSE 20 END) WHERE a = 1;
> ERROR:  updated 2 rows for a tuple identity on the remote end

I liked that idea. But I think your patch wasn't quite right,
especially when the RETURNING list has an SRF in it. Right now n_rows
tracks the number of rows returned if there is a returning list, or
otherwise the number of rows updated/deleted on the foreign server. If
there is an SRF, n_rows can count multiple returned rows for a single
updated or deleted row. So, I changed your code to track the number of
rows updated/deleted and the number of rows returned separately. BTW,
your patch didn't handle the DELETE case.
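
To make the distinction concrete, here is a small sketch under stated
assumptions; it is not the code in 0002. The type and function names are
hypothetical. In libpq terms the two counts would plausibly come from
PQcmdTuples() and PQntuples() respectively, but that mapping is an
assumption since the patch itself is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* The two counts kept apart, as described above. */
typedef struct {
    long n_rows_updated;    /* rows actually modified on the remote side */
    long n_rows_returned;   /* rows sent back for the RETURNING list */
} RemoteModifyCounts;

typedef enum {
    REMOTE_MODIFY_NONE,      /* nothing was modified */
    REMOTE_MODIFY_ONE,       /* exactly one row modified: the expected case */
    REMOTE_MODIFY_MULTIPLE   /* several rows matched one tuple identity */
} RemoteModifyStatus;

/*
 * Decide the outcome from the modified-row count alone. The returned-row
 * count is deliberately not consulted, so a RETURNING list that yields
 * several rows for one modified row is not mistaken for a multi-row update.
 */
RemoteModifyStatus classify_remote_modify(const RemoteModifyCounts *c)
{
    assert(c != NULL);

    if (c->n_rows_updated > 1)
        return REMOTE_MODIFY_MULTIPLE;
    return (c->n_rows_updated == 1) ? REMOTE_MODIFY_ONE : REMOTE_MODIFY_NONE;
}
```

A caller would raise the "updated N rows for a tuple identity" error only
for REMOTE_MODIFY_MULTIPLE, for both UPDATE and DELETE.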

I have attached a set of patches:
0001 adds a test case showing the issue.
0002 is a modified patch based on your idea of throwing an error.
0003 is a WIP patch with a partial fix for the issue as discussed upthread.

The expected output in 0001 is set to what it would be once the problem
is fixed. The expected output in 0002 is what it would be if we
commit only 0002 without a complete fix.
>
>
>> There are two ways to fix this
>> 1. Use WHERE CURRENT OF with cursors to update rows. This means that
>> we fetch only one row at a time and update it. This can slow down the
>> execution drastically.
>> 2. Along with ctid use tableoid as a qualifier i.e. WHERE clause of
>> UPDATE/DELETE statement has ctid = $1 AND tableoid = $2 as conditions.
>>
>> PFA patch along the lines of 2nd approach and along with the
>> testcases. The idea is to inject tableoid attribute to be fetched from
>> the foreign server similar to ctid and then add it to the DML
>> statement being constructed.
>>
>> It does fix the problem. But the patch as is interferes with the way
>> we handle tableoid currently. That can be seen from the regression
>> diffs that the patch causes.  RIght now, every tableoid reference gets
>> converted into the tableoid of the foreign table (and not the tableoid
>> of the foreign table). Somehow we need to differentiate between the
>> tableoid injected for DML and tableoid references added by the user in
>> the original query and then use tableoid on the foreign server for the
>> first and local foreign table's oid for the second. Right now, I don't
>> see a simple way to do that.
>
> We cannot add no non-system (junk) columns not defined in foreign
> table columns.

Why? That's a probable way of fixing this problem.

> We could pass tableoid via a side channel but we
> get wrong value if the scan is not consists of only one foreign
> relation. I don't think adding remote_tableoid in HeapTupleData
> is acceptable.

I am thinking of adding remote_tableoid in HeapTupleData since not all
FDWs will have the concept of tableoid. But we need to somehow
distinguish the tableoid resjunk added for DMLs and tableoid requested
by the user.

> Explicity defining remote_tableoid column in
> foreign relation might work but it makes things combersome..
>

Not just cumbersome, it's not always going to be right if things
change on the foreign server, e.g. the OID of the table changes because
it got dropped and recreated on the foreign server, or the OID remained
the same but the table got inherited, and so on.

I think we should try getting 0001 and 0002 at least committed
independent of 0003.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Attachments

Re: Problem while updating a foreign table pointing to a partitioned table on foreign server

From
Kyotaro HORIGUCHI
Date:
At Wed, 18 Apr 2018 13:23:06 +0530, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote in
<CAFjFpReWZnJ_raxAroaxb3_uRVpxnrnh8w3BjKs0kgy0Ya2+kA@mail.gmail.com>
> On Wed, Apr 18, 2018 at 9:43 AM, Kyotaro HORIGUCHI
> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> >
> > Anyway I think we should warn or error out if one nondirect
> > update touches two nor more tuples in the first place.
> >
> > =# UPDATE fplt SET b = (CASE WHEN random() <= 1 THEN 10 ELSE 20 END) WHERE a = 1;
> > ERROR:  updated 2 rows for a tuple identity on the remote end
> 
> I liked that idea. But I think your patch wasn't quite right, esp.
> when the returning had an SRF in it. Right now n_rows tracks the
> number of rows returned if there is a returning list or the number of
> rows updated/deleted on the foreign server. If there is an SRF, n_rows
> can return multiple rows for a single updated or deleted row. So, I
> changed your code to track number of rows updated/deleted and number
> of rows returned separately. BTW, your patch didn't handle DELETE
> case.

Yeah, sorry. It was just to show what the error looks like.
The attached 0002 works and looks fine except for the following.

> /* No rows should be returned if no rows were updated. */
> Assert(n_rows_returned == 0 || n_rows_updated > 0);

The assertion is correct, but I think that we shouldn't crash the
server on any kind of protocol error. I think ERROR is suitable.

> I have attached a set of patches
> 0001 adds a test case showing the issue.
> 0002 modified patch based on your idea of throwing an error
> 0003 WIP patch with a partial fix for the issue as discussed upthread
> 
> The expected output in 0001 is set to what it would when the problem
> gets fixed. The expected output in 0002 is what it would be when we
> commit only 0002 without a complete fix.
> >
> >
> >> There are two ways to fix this
> >> 1. Use WHERE CURRENT OF with cursors to update rows. This means that
> >> we fetch only one row at a time and update it. This can slow down the
> >> execution drastically.
> >> 2. Along with ctid use tableoid as a qualifier i.e. WHERE clause of
> >> UPDATE/DELETE statement has ctid = $1 AND tableoid = $2 as conditions.
> >>
> >> PFA patch along the lines of 2nd approach and along with the
> >> testcases. The idea is to inject tableoid attribute to be fetched from
> >> the foreign server similar to ctid and then add it to the DML
> >> statement being constructed.
> >>
> >> It does fix the problem. But the patch as is interferes with the way
> >> we handle tableoid currently. That can be seen from the regression
> >> diffs that the patch causes.  RIght now, every tableoid reference gets
> >> converted into the tableoid of the foreign table (and not the tableoid
> >> of the foreign table). Somehow we need to differentiate between the
> >> tableoid injected for DML and tableoid references added by the user in
> >> the original query and then use tableoid on the foreign server for the
> >> first and local foreign table's oid for the second. Right now, I don't
> >> see a simple way to do that.
> >
> > We cannot add no non-system (junk) columns not defined in foreign
> > table columns.
> 
> Why? That's a probable way of fixing this problem.

In other words, tuples returned from ForeignNext
(postgresIterateForeignScan) on a foreign (base) relation cannot
contain a non-system column which is not a part of the relation,
since its tuple descriptor doesn't know of it and errors out.
The current 0003 stores the remote tableoid in the tuple's existing
tableOid field (not as column data), which is not proper since the
remote tableoid is bogus for the local server. I might be missing
something here, though. If we could somehow attach a blob at the
end of t_data and have it finally passed to
ExecForeignUpdate/Delete, the problem would be resolved.

> > We could pass tableoid via a side channel but we
> > get wrong value if the scan is not consists of only one foreign
> > relation. I don't think adding remote_tableoid in HeapTupleData
> > is acceptable.
> 
> I am thinking of adding remote_tableoid in HeapTupleData since not all
> FDWs will have the concept of tableoid. But we need to somehow
> distinguish the tableoid resjunk added for DMLs and tableoid requested
> by the user.

I don't think it is acceptable, but it would (hopefully) almost solve
this problem if we allowed it. The user always sees the conventional
tableOid, and all ExecForeignUpdate/Delete has to do is use
remote_tableoid as part of the remote tuple identifier. We would
still need to consider how to propagate remote_tableoid through joins
or other intermediate executor nodes, though. That is partly similar
to how join pushdown is decided.

Another point is that, even though HeapTupleData is the only
expected conveyor of the tuple identification, assuming tableoid +
ctid is not adequate since the FDW interface is not exclusive to
postgres_fdw. The existing ctid field was not added for this purpose
and just happens to (seem to) work as a tuple identifier for
postgres_fdw, but I think tableoid does not.

> > Explicity defining remote_tableoid column in
> > foreign relation might work but it makes things combersome..
> >
> 
> Not just cumbersome, it's not going to be always right, if the things
> change on the foreign server e.g. OID of the table changes because it
> got dropped and recreated on the foreign server or OID remained same
> but the table got inherited and so on.

The same can be said of ctid. Maybe my description was
unclear. Specifically, I intended to say something like:

- If we want to update/delete remote partitioned/inheritance tables
  without direct modify, the foreign relation must have a column
  defined as "tableoid as remote_tableoid" or something. (We
  could change the column name by an FDW option.)

- ForeignScan for TableModify adds "remote_tableoid" instead of
  tableoid to receive the remote tableoid and returns it as a part of
  an ordinary return tuple.

- ForeignUpdate/Delete looks at remote_tableoid instead of the
  tuple's tableOid field.

Yes, it is a dreadfully bad interface, especially since it is not
guaranteed to be passed to the modify side if users don't write a
query to do so. So yes, it is far worse than cumbersome.


> I think we should try getting 0001 and 0002 at least committed
> independent of 0003.

Agreed on 0002. 0001 should be committed with 0003?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Problem while updating a foreign table pointing to a partitioned table on foreign server

From
Ashutosh Bapat
Date:
On Thu, Apr 19, 2018 at 11:38 AM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
>
>> /* No rows should be returned if no rows were updated. */
>> Assert(n_rows_returned == 0 || n_rows_updated > 0);
>
> The assertion is correct but I think that we shouldn't crash
> server by any kind of protocol error. I think ERROR is suitable.
>

That's a good idea. Done.

>> I have attached a set of patches
>> 0001 adds a test case showing the issue.
>> 0002 modified patch based on your idea of throwing an error
>> 0003 WIP patch with a partial fix for the issue as discussed upthread
>>
>> The expected output in 0001 is set to what it would when the problem
>> gets fixed. The expected output in 0002 is what it would be when we
>> commit only 0002 without a complete fix.
>> >
>> >
>> >> There are two ways to fix this
>> >> 1. Use WHERE CURRENT OF with cursors to update rows. This means that
>> >> we fetch only one row at a time and update it. This can slow down the
>> >> execution drastically.
>> >> 2. Along with ctid use tableoid as a qualifier i.e. WHERE clause of
>> >> UPDATE/DELETE statement has ctid = $1 AND tableoid = $2 as conditions.
>> >>
>> >> PFA patch along the lines of 2nd approach and along with the
>> >> testcases. The idea is to inject tableoid attribute to be fetched from
>> >> the foreign server similar to ctid and then add it to the DML
>> >> statement being constructed.
>> >>
>> >> It does fix the problem. But the patch as is interferes with the way
>> >> we handle tableoid currently. That can be seen from the regression
>> >> diffs that the patch causes.  RIght now, every tableoid reference gets
>> >> converted into the tableoid of the foreign table (and not the tableoid
>> >> of the foreign table). Somehow we need to differentiate between the
>> >> tableoid injected for DML and tableoid references added by the user in
>> >> the original query and then use tableoid on the foreign server for the
>> >> first and local foreign table's oid for the second. Right now, I don't
>> >> see a simple way to do that.
>> >
>> > We cannot add no non-system (junk) columns not defined in foreign
>> > table columns.
>>
>> Why? That's a probable way of fixing this problem.
>
> In other words, tuples returned from ForeignNext
> (postgresIterateForeignScan) on a foreign (base) relation cannot
> contain a non-system column which is not a part of the relation,
> since its tuple descriptor doesn't know of and does error out it.
> The current 0003 stores remote tableoid in tuples' existing
> tableOid field (not a column data), which is not proper since
> remote tableoid is bogus for the local server. I might missing
> something here, though. If we can somehow attach an blob at the
> end of t_data and it is finally passed to
> ExecForeignUpdate/Delete, the problem would be resolved.

The attached 0003 uses HeapTupleData::t_tableoid to store either the
remote tableoid or the local tableoid. The remote tableoid is stored
there for a scan underlying DELETE/UPDATE. The local tableoid is stored
otherwise. We use a flag fetch_foreign_tableoid, both standalone and in
deparse_expr_cxt, to differentiate between these two usages.

>
> I don't think it is acceptable but (hopefully) almost solves this
> problem if we allow that. User always sees the conventional
> tableOid and all ExecForeignUpdate/Delete have to do is to use
> remote_tableoid as a part of remote tuple identifier. Required to
> consider how to propagate the remote_tableoid through joins or
> other intermediate executor nodes, though. It is partly similar
> to the way deciding join push down.

0003 does that. Fortunately we already have tests for UPDATE/DELETE with joins.

>
> Another point is that, even though HeapTupleData is the only
> expected coveyer of the tuple identification, assuming tableoid +
> ctid is not adequite since FDW interface is not exlusive for
> postgres_fdw. The existig ctid field is not added for the purpose
> and just happened to (seem to) work as tuple identifier for
> postgres_fdw but I think tableoid is not.

I am not able to understand this point. postgresAddForeignUpdateTargets
does that specifically for postgres_fdw. I am using the same function to
add a junk column for tableoid, similar to ctid.

>
> The same can be said on ctid. Maybe my description was
> unclear. Specifically, I intended to say something like:
>
> - If we want to update/delete remote partitioned/inhtance tables
>   without direct modify, the foreign relation must have a columns
>   defined as "tableoid as remote_tableoid" or something. (We
>   could change the column name by a fdw option.)

OK. I think I misunderstood your proposal. IIUC, this way, SELECT *
FROM foreign_table is going to report remote_tableoid, which won't be
welcomed by users.

Let me know what you think of the attached patches.

>
>
>> I think we should try getting 0001 and 0002 at least committed
>> independent of 0003.
>
> Agreed on 0002. 0001 should be committed with 0003?

0001 adds testcases which show the problem, so we have to commit it
with 0003 or 0002.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Attachments

On Mon, Apr 16, 2018 at 7:35 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> It does fix the problem. But the patch as is interferes with the way
> we handle tableoid currently. That can be seen from the regression
> diffs that the patch causes.  RIght now, every tableoid reference gets
> converted into the tableoid of the foreign table (and not the tableoid
> of the foreign table). Somehow we need to differentiate between the
> tableoid injected for DML and tableoid references added by the user in
> the original query and then use tableoid on the foreign server for the
> first and local foreign table's oid for the second. Right now, I don't
> see a simple way to do that.

I think that the place to start would be to change this code to use
something other than TableOidAttributeNumber:

+       var = makeVar(parsetree->resultRelation,
+                                 TableOidAttributeNumber,
+                                 OIDOID,
+                                 -1,
+                                 InvalidOid,
+                                 0);

Note that rewriteTargetListUD, which calls AddForeignUpdateTargets,
also contingently adds a "wholerow" attribute which ExecModifyTable()
is able to fish out later.  It seems like it should be possible to add
a "remotetableoid" column that works similarly, although I'm not
exactly sure what would be involved.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Problem while updating a foreign table pointing to a partitioned table on foreign server

From
Ashutosh Bapat
Date:
On Wed, May 16, 2018 at 11:31 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Apr 16, 2018 at 7:35 AM, Ashutosh Bapat
> <ashutosh.bapat@enterprisedb.com> wrote:
>> It does fix the problem. But the patch as is interferes with the way
>> we handle tableoid currently. That can be seen from the regression
>> diffs that the patch causes.  RIght now, every tableoid reference gets
>> converted into the tableoid of the foreign table (and not the tableoid
>> of the foreign table). Somehow we need to differentiate between the
>> tableoid injected for DML and tableoid references added by the user in
>> the original query and then use tableoid on the foreign server for the
>> first and local foreign table's oid for the second. Right now, I don't
>> see a simple way to do that.
>
> I think that the place to start would be to change this code to use
> something other than TableOidAttributeNumber:
>
> +       var = makeVar(parsetree->resultRelation,
> +                                 TableOidAttributeNumber,
> +                                 OIDOID,
> +                                 -1,
> +                                 InvalidOid,
> +                                 0);
>
> Note that rewriteTargetListUD, which calls AddForeignUpdateTargets,
> also contingently adds a "wholerow" attribute which ExecModifyTable()
> is able to fish out later.  It seems like it should be possible to add
> a "remotetableoid" column that works similarly, although I'm not
> exactly sure what would be involved.

As of today, all the attributes added by the AddForeignUpdateTargets
hook of postgres_fdw are recognised by PostgreSQL. But remotetableoid
is not a recognised attribute. In order to use it, we either have to
define a new system attribute "remotetableoid" or add a user-defined
attribute "remotetableoid" in every foreign table. The first one will
be very specific to postgres_fdw and other FDWs won't be able to use
it. The second would mean that SELECT * from foreign table reports
remotetableoid as well, which is awkward. Horiguchi-san and I
discussed those ideas in this mail thread.

Anyway, my comment to which you have replied is obsolete now. I found
a solution to that problem, which I have implemented in 0003 in the
latest patch-set I have shared.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


On Thu, May 17, 2018 at 2:10 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> The second would mean that SELECT * from foreign table reports
> remotetableoid as well, which is awkward.

No it wouldn't.  You'd just make the additional column resjunk, same
as we do for wholerow.

> Anyway, my comment to which you have replied is obsolete now. I found
> a solution to that problem, which I have implemented in 0003 in the
> latest patch-set I have shared.

Yeah, but I'm not sure I like that solution very much.  I don't think
abusing the tableoid to store a remote table OID is very nice.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Robert Haas <robertmhaas@gmail.com> writes:
> Yeah, but I'm not sure I like that solution very much.  I don't think
> abusing the tableoid to store a remote table OID is very nice.

I'd say it's totally unacceptable.  Tableoid *has to* be something
that you can look up in the local pg_class instance, or serious
confusion will ensue.

            regards, tom lane


Re: Problem while updating a foreign table pointing to a partitioned table on foreign server

From
Alvaro Herrera
Date:
On 2018-May-17, Tom Lane wrote:

> Robert Haas <robertmhaas@gmail.com> writes:
> > Yeah, but I'm not sure I like that solution very much.  I don't think
> > abusing the tableoid to store a remote table OID is very nice.
> 
> I'd say it's totally unacceptable.  Tableoid *has to* be something
> that you can look up in the local pg_class instance, or serious
> confusion will ensue.

Can we just add a new junk attr, with its own fixed system column
number?  I think that's what Robert was proposing.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Can we just add a new junk attr, with its own fixed system column
> number?  I think that's what Robert was proposing.

Junk attr yes, "fixed system column number" no.  That's not how
junk attrs work.  What it'd need is a convention for the name of
these resjunk attrs (corresponding to ctidN, wholerowN, etc).
We do already have tableoidN junk attrs, but by the same token
those should always be local OIDs, or we'll be in for deep
confusion.  Maybe "remotetableoidN" ?
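
The naming convention can be sketched as follows. This is an illustrative
helper, not PostgreSQL source: make_junk_attr_name is a hypothetical
function, and "remotetableoid" is only the name floated above. It simply
appends a numeric suffix to a prefix, in the style of ctidN and wholerowN.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Build a resjunk column name of the form "<prefix><id>", e.g. "ctid1" or
 * "remotetableoid3". Returns 0 on success, or -1 if the name does not fit
 * in the buffer (in which case the buffer holds a truncated name).
 */
int make_junk_attr_name(char *buf, size_t buflen,
                        const char *prefix, unsigned id)
{
    int n;

    assert(buf != NULL && prefix != NULL && buflen > 0);

    n = snprintf(buf, buflen, "%s%u", prefix, id);
    return (n >= 0 && (size_t) n < buflen) ? 0 : -1;
}
```

Because such a column is found by name rather than by a fixed attribute
number, a local "tableoidN" and a "remotetableoidN" could coexist in one
target list without being confused with each other.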

            regards, tom lane


Re: Problem while updating a foreign table pointing to a partitioned table on foreign server

From
Ashutosh Bapat
Date:
On Thu, May 17, 2018 at 11:56 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, May 17, 2018 at 2:10 AM, Ashutosh Bapat
> <ashutosh.bapat@enterprisedb.com> wrote:
>> The second would mean that SELECT * from foreign table reports
>> remotetableoid as well, which is awkward.
>
> No it wouldn't.  You'd just make the additional column resjunk, same
> as we do for wholerow.

You suggested
--
> I think that the place to start would be to change this code to use
> something other than TableOidAttributeNumber:
>
> +       var = makeVar(parsetree->resultRelation,
> +                                 TableOidAttributeNumber,
> +                                 OIDOID,
> +                                 -1,
> +                                 InvalidOid,
> +                                 0);
--

Wholerow has its own attribute number 0, ctid has its attribute number
-1. So we can easily create Vars for those and add resjunk entries in
the targetlist. But a "remotetableoid" doesn't have an attribute
number yet! Either it has to be a new system column, which I and
almost everybody here oppose, or it has to be a user-defined
attribute with an entry in the pg_attribute catalog. In the second
case, how would one make that column resjunk? I don't see any third
possibility.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: Problem while updating a foreign table pointing to a partitioned table on foreign server

From
Kyotaro HORIGUCHI
Date:
At Fri, 18 May 2018 10:19:30 +0530, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote in
<CAFjFpRe5KBBXzio-1iCzmH35kxYy90z6ewLU+VPtM0u=kH-ubw@mail.gmail.com>
> On Thu, May 17, 2018 at 11:56 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> > On Thu, May 17, 2018 at 2:10 AM, Ashutosh Bapat
> > <ashutosh.bapat@enterprisedb.com> wrote:
> >> The second would mean that SELECT * from foreign table reports
> >> remotetableoid as well, which is awkward.
> >
> > No it wouldn't.  You'd just make the additional column resjunk, same
> > as we do for wholerow.
> 
> You suggested
> --
> > I think that the place to start would be to change this code to use
> > something other than TableOidAttributeNumber:
> >
> > +       var = makeVar(parsetree->resultRelation,
> > +                                 TableOidAttributeNumber,
> > +                                 OIDOID,
> > +                                 -1,
> > +                                 InvalidOid,
> > +                                 0);
> --
> 
> Wholerow has its own attribute number, 0, and ctid has attribute number
> -1, so we can easily create Vars for those and add resjunk entries to
> the targetlist. But a "remotetableoid" doesn't have an attribute
> number yet! Either it has to be a new system column, which I and
> almost everybody here oppose, or it has to be a user-defined
> attribute with an entry in the pg_attribute catalog. In the second case,
> how would one make that column resjunk? I don't see any third
> possibility.

I have reached the same conclusion.

The point here is that it is a base relation, which is not
assumed to have any columns beyond those in its definition,
including non-system junk columns. I'm not sure, but it does not
seem simple to give base relations the ability to have junk
columns.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



On Fri, May 18, 2018 at 4:29 AM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> I have reached the same conclusion.
>
> The point here is that it is a base relation, which is not
> assumed to have any columns beyond those in its definition,
> including non-system junk columns. I'm not sure, but it does not
> seem simple to give base relations the ability to have junk
> columns.

Do you know where that assumption is embedded specifically?

If you're correct, then the FDW API is and always has been broken by
design for any remote data source that uses a row identifier other
than CTID, unless every foreign table definition always includes the
row identifier as an explicit column.  I might be wrong here, but I'm
pretty sure Tom wouldn't have committed this API in the first place
with such a glaring hole in the design.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Problem while updating a foreign table pointing to a partitioned table on foreign server

From
Kyotaro HORIGUCHI
Date:
At Fri, 18 May 2018 15:31:07 -0400, Robert Haas <robertmhaas@gmail.com> wrote in
<CA+TgmoaBuzhhcA21sAm7wH+A-GH2d6GkKhVapkqhnHOW85dDXg@mail.gmail.com>
> On Fri, May 18, 2018 at 4:29 AM, Kyotaro HORIGUCHI
> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> > I have reached the same conclusion.
> >
> > The point here is that it is a base relation, which is not
> > assumed to have any columns beyond those in its definition,
> > including non-system junk columns. I'm not sure, but it does not
> > seem simple to give base relations the ability to have junk
> > columns.
> 
> Do you know where that assumption is embedded specifically?

Taking the question literally, I see that add_vars_to_targetlist
accepts neither non-system junk columns (including whole-row Vars)
nor non-junk columns that are not defined in the base relation. The
first line of the following code enforces that.

> Assert(attno >= rel->min_attr && attno <= rel->max_attr);
> attno -= rel->min_attr;
> if (rel->attr_needed[attno] == NULL)

In the last line, attr_needed is an array of (max_attr -
min_attr + 1) elements, which is allocated in get_relation_info. I
didn't dig further, so it might be easier than I think, but
in any case it seems to me that a core-side modification is
required.
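The indexing scheme described above can be illustrated with a small stand-alone sketch. It is not the planner's code: DemoRel and the demo_* functions are hypothetical stand-ins for RelOptInfo and add_vars_to_targetlist, the attribute-number range used in the usage note is only illustrative, and the sketch returns false where the core code would trip an Assert.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the planner's RelOptInfo. */
typedef struct DemoRel
{
    int   min_attr;     /* lowest valid attno (negative: system columns) */
    int   max_attr;     /* highest user column number */
    bool *attr_needed;  /* one entry per attno in [min_attr, max_attr] */
} DemoRel;

/* Allocate the per-attribute array; returns false on out-of-memory. */
static bool
demo_rel_init(DemoRel *rel, int min_attr, int max_attr)
{
    assert(min_attr <= max_attr);
    rel->min_attr = min_attr;
    rel->max_attr = max_attr;
    /* Both endpoints are valid attnos, hence the "+ 1". */
    rel->attr_needed = calloc((size_t) (max_attr - min_attr + 1),
                              sizeof(bool));
    return rel->attr_needed != NULL;
}

/*
 * Mark attno as needed.  An attno outside the relation's definition has
 * no array slot, so it is refused here (the core code asserts instead).
 */
static bool
demo_mark_needed(DemoRel *rel, int attno)
{
    if (attno < rel->min_attr || attno > rel->max_attr)
        return false;
    rel->attr_needed[attno - rel->min_attr] = true;
    return true;
}
```

With a relation initialized as demo_rel_init(&rel, -7, 2), marking attno -1 or 0 succeeds, while attno 3 (a column that is not part of the definition) is refused because there is simply no slot for it.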

> If you're correct, then the FDW API is and always has been broken by
> design for any remote data source that uses a row identifier other
> than CTID, unless every foreign table definition always includes the
> row identifier as an explicit column. 

I actually see that. oracle_fdw needs to compose the row
identifier by specifying the "key" column option in the relation
definition, and the key columns are added as resjunk columns. This
is the third (or fourth?) option in my comment upthread, which I
described as "not only bothersome".

https://github.com/laurenz/oracle_fdw

| Column options (from PostgreSQL 9.2 on)
| key (optional, defaults to "false")
| 
| If set to yes/on/true, the corresponding column on the foreign
| Oracle table is considered a primary key column.  For UPDATE and
| DELETE to work, you must set this option on all columns that
| belong to the table's primary key.

>                                      I might be wrong here, but I'm
> pretty sure Tom wouldn't have committed this API in the first place
> with such a glaring hole in the design.

I think the API is still not broken in a sense: the ctid used by
postgres_fdw is necessarily that of the remote table. If we had a
reasonable mapping from the remote tableoid:ctid pair to a local ctid,
it would work as expected. But such a mapping seems rather
difficult to create, since I can't find a generic way that doesn't
need auxiliary information, and at least there's no guarantee
that ctid has enough space for rows from multiple tables.
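The space concern can be made concrete with a little arithmetic. The sketch below assumes a tuple identifier laid out like PostgreSQL's ItemPointerData (a 32-bit block number stored as two 16-bit halves plus a 16-bit offset) and a 32-bit OID; the Demo* names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout, modelled on ItemPointerData: 6 bytes in total. */
typedef struct DemoItemPointer
{
    uint16_t bi_hi;     /* high 16 bits of the block number */
    uint16_t bi_lo;     /* low 16 bits of the block number */
    uint16_t offset;    /* line pointer offset within the block */
} DemoItemPointer;

typedef uint32_t DemoOid;   /* OIDs are 32-bit unsigned values */

enum
{
    DEMO_TID_BITS = 8 * sizeof(DemoItemPointer),        /* local ctid */
    DEMO_OID_BITS = 8 * sizeof(DemoOid),                /* remote tableoid */
    DEMO_NEEDED_BITS = DEMO_TID_BITS + DEMO_OID_BITS    /* lossless pair */
};

/* Nonzero only if an arbitrary (tableoid, ctid) pair fits in one ctid. */
static int
demo_pair_fits_in_local_ctid(void)
{
    assert(DEMO_TID_BITS > 0);
    return DEMO_NEEDED_BITS <= DEMO_TID_BITS;
}
```

Under these assumptions a fully general pair needs 80 bits but a ctid offers only 48, so any mapping would have to rely on extra knowledge about which block numbers or tables can actually occur.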

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Problem while updating a foreign table pointing to a partitioned table on foreign server

From
Alvaro Herrera
Date:
Hello

I don't think this thread has reached a consensus on a design for a fix,
has it?  Does anybody have a clear idea on a path forward?  Is anybody
working on a patch?

Thanks

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Problem while updating a foreign table pointing to a partitioned table on foreign server

From
Kyotaro HORIGUCHI
Date:
Thanks.

> I don't think this thread has reached a consensus on a design for a fix

Right.

If my understanding of non-system junk columns in a base relation
and of identifiers for foreign tuples is correct, what is needed here
is to give base relations the ability to have such junk columns.

I'm willing to work on that if I'm not on the wrong track here.

-- 
Kyotaro Horiguchi


Kyotaro HORIGUCHI <kyota.horiguchi@gmail.com> writes:
> If my understanding of non-system junk columns in a base relation
> and of identifiers for foreign tuples is correct, what is needed here
> is to give base relations the ability to have such junk columns.

The core of the problem, I think, is the question of exactly what
postgresAddForeignUpdateTargets should put into the resjunk expressions
it adds to an update/delete query's targetlist.  Per discussion yesterday,
up to now it's always emitted Vars referencing the foreign relation,
which is problematic because with that approach the desired info has
to be exposed as either a regular or system column of that relation.
But there's nothing saying that the expression has to be a Var.

My thought about what we might do instead is that
postgresAddForeignUpdateTargets could reserve a PARAM_EXEC slot
and emit a Param node referencing that.  Then at runtime, while
reading a potential target row from the remote, we fill that
param slot along with the regular scan tuple slot.

What you want for the first part of that is basically like
generate_new_param() in subselect.c.  We don't expose that publicly
at the moment, but we could, or maybe better to invent another wrapper
around it like SS_make_initplan_output_param.

            regards, tom lane
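A rough model of the PARAM_EXEC idea, for readers who want to see the data flow: a slot is reserved at plan time, the scan fills it for each candidate row, and the modify step reads it back to identify the target. This is only a sketch under stated assumptions. DemoParamStore stands in for the executor's PARAM_EXEC array, DemoRemoteRow for a fetched remote row, and none of the demo_* functions exist in PostgreSQL.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MAX_PARAMS 8

/* Hypothetical stand-in for the executor's array of PARAM_EXEC values. */
typedef struct DemoParamStore
{
    uint64_t values[DEMO_MAX_PARAMS];
    int      nparams;
} DemoParamStore;

/* Hypothetical remote row: identity fields plus two user columns. */
typedef struct DemoRemoteRow
{
    uint32_t tableoid;  /* OID of the remote partition holding the row */
    uint64_t ctid;      /* remote ctid, flattened to an integer here */
    int      a;
    int      b;
} DemoRemoteRow;

/* Plan time: reserve a slot and return its id, or -1 if none is left. */
static int
demo_reserve_param(DemoParamStore *store)
{
    if (store->nparams >= DEMO_MAX_PARAMS)
        return -1;
    store->values[store->nparams] = 0;
    return store->nparams++;
}

/* Scan time: record the identity of the row just read from the remote. */
static void
demo_scan_fill(DemoParamStore *store, int oid_param, int tid_param,
               const DemoRemoteRow *row)
{
    assert(oid_param >= 0 && oid_param < store->nparams);
    assert(tid_param >= 0 && tid_param < store->nparams);
    store->values[oid_param] = row->tableoid;
    store->values[tid_param] = row->ctid;
}

/* Modify time: is this remote row the one recorded by the scan? */
static int
demo_is_target(const DemoParamStore *store, int oid_param, int tid_param,
               const DemoRemoteRow *candidate)
{
    return store->values[oid_param] == candidate->tableoid &&
           store->values[tid_param] == candidate->ctid;
}
```

The point of the model is that two remote rows with the same ctid in different partitions no longer collide, because the recorded tableoid distinguishes them without the foreign table exposing any extra column.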


Re: Problem while updating a foreign table pointing to a partitioned table on foreign server

From
Ashutosh Bapat
Date:
On Thu, May 31, 2018 at 7:36 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Kyotaro HORIGUCHI <kyota.horiguchi@gmail.com> writes:
>> If my understanding of non-system junk columns in a base relation
>> and of identifiers for foreign tuples is correct, what is needed here
>> is to give base relations the ability to have such junk columns.
>
> The core of the problem, I think, is the question of exactly what
> postgresAddForeignUpdateTargets should put into the resjunk expressions
> it adds to an update/delete query's targetlist.  Per discussion yesterday,
> up to now it's always emitted Vars referencing the foreign relation,
> which is problematic because with that approach the desired info has
> to be exposed as either a regular or system column of that relation.
> But there's nothing saying that the expression has to be a Var.
>
> My thought about what we might do instead is that
> postgresAddForeignUpdateTargets could reserve a PARAM_EXEC slot
> and emit a Param node referencing that.  Then at runtime, while
> reading a potential target row from the remote, we fill that
> param slot along with the regular scan tuple slot.
>
> What you want for the first part of that is basically like
> generate_new_param() in subselect.c.  We don't expose that publicly
> at the moment, but we could, or maybe better to invent another wrapper
> around it like SS_make_initplan_output_param.

This looks like a lot of change, which might take some time and may not
be back-portable. In the meantime, can we check whether the 0001 and 0002
patches are good and apply them? Those patches are intended to stop
multiple rows on the foreign server from being updated, by throwing an
error (and aborting the transaction on the foreign server) when that
happens. That would at least avoid the silent corruption that happens
today, and should be back-portable.

[1] https://www.postgresql.org/message-id/CAFjFpRfK69ptCTNChBBk+LYMXFzJ92SW6NmG4HLn_1y7xFk=kw@mail.gmail.com

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: Problem while updating a foreign table pointing to a partitioned table on foreign server

From
Kyotaro HORIGUCHI
Date:
On Thu, May 31, 2018 at 11:34 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> On Thu, May 31, 2018 at 7:36 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> What you want for the first part of that is basically like
>> generate_new_param() in subselect.c.  We don't expose that publicly
>> at the moment, but we could, or maybe better to invent another wrapper
>> around it like SS_make_initplan_output_param.
>
> This looks like a lot of change which might take some time and may not

I agree. At first glance, it needs at least an additional parameter
(PlannerInfo, in the straightforward approach) for
postgresAddForeignUpdateTargets, which is a change to the FDW API.

> be back-portable. In the mean time, can we see if 0001 and 0002
> patches are good and apply them. Those patches intend to stop the
> multiple rows on the foreign server being updated by throwing error
> (and aborting the transaction on the foreign server) when that
> happens. That will at least avoid silent corruption that happens today
> and should be back-portable.
>
> [1] https://www.postgresql.org/message-id/CAFjFpRfK69ptCTNChBBk+LYMXFzJ92SW6NmG4HLn_1y7xFk=kw@mail.gmail.com

Having said that, I think that storing OIDs of the remote table in the
local tableoid system column breaks the existing contract for that
field. (I hope this is comprehensible.)
However, I haven't found a way to "fix" this without breaking the API
in that way, so it seems inevitable to me to leave this problem as a
restriction. We can still avoid the problematic behavior by explicitly
declaring a remote tableoid column (like the "key" column option of
oracle_fdw).

CREATE FOREIGN TABLE ft1 (rtoid oid, a int, blah, blah) SERVER sv
OPTIONS (remote_tableoid 'rtoid', table_name 'lt1');

Of course, the proposed fix will work if we allow this somewhat
illegitimate use of the local tableoid. But that seems likely to
lead to a series of frequent changes to the same feature.

Thoughts?


Re: Problem while updating a foreign table pointing to a partitioned table on foreign server

From
Ashutosh Bapat
Date:
On Fri, Jun 1, 2018 at 7:43 AM, Kyotaro HORIGUCHI
<kyota.horiguchi@gmail.com> wrote:
> On Thu, May 31, 2018 at 11:34 AM, Ashutosh Bapat
> <ashutosh.bapat@enterprisedb.com> wrote:
>> On Thu, May 31, 2018 at 7:36 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> What you want for the first part of that is basically like
>>> generate_new_param() in subselect.c.  We don't expose that publicly
>>> at the moment, but we could, or maybe better to invent another wrapper
>>> around it like SS_make_initplan_output_param.
>>
>> This looks like a lot of change which might take some time and may not
>
> I agree. At first glance, it needs at least an additional parameter
> (PlannerInfo, in the straightforward approach) for
> postgresAddForeignUpdateTargets, which is a change to the FDW API.
>
>> be back-portable. In the mean time, can we see if 0001 and 0002
>> patches are good and apply them. Those patches intend to stop the
>> multiple rows on the foreign server being updated by throwing error
>> (and aborting the transaction on the foreign server) when that
>> happens. That will at least avoid silent corruption that happens today
>> and should be back-portable.
>>
>> [1] https://www.postgresql.org/message-id/CAFjFpRfK69ptCTNChBBk+LYMXFzJ92SW6NmG4HLn_1y7xFk=kw@mail.gmail.com
>
> Having said that, I think that storing OIDs of the remote table in the
> local tableoid system column breaks the existing contract for that
> field. (I hope this is comprehensible.)
> However, I haven't found a way to "fix" this without breaking the API
> in that way, so it seems inevitable to me to leave this problem as a
> restriction. We can still avoid the problematic behavior by explicitly
> declaring a remote tableoid column (like the "key" column option of
> oracle_fdw).
>
> CREATE FOREIGN TABLE ft1 (rtoid oid, a int, blah, blah) SERVER sv
> OPTIONS (remote_tableoid 'rtoid', table_name 'lt1');
>
> Of course, the proposed fix will work if we allow this somewhat
> illegitimate use of the local tableoid. But that seems likely to
> lead to a series of frequent changes to the same feature.
>
> Thoughts?


I am not suggesting committing 0003 from my patch set, only 0001 and
0002, which just raise an error when multiple rows get updated where
only one row is expected to be updated.
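For illustration, a guard of this kind could look roughly like the following. This is not taken from the 0001/0002 patches, whose contents are not shown in this thread; it is a sketch that assumes the affected-row count arrives as a decimal string, which is the form libpq's PQcmdTuples returns (an empty string when no count applies). The DemoModifyStatus type and demo_check_affected_rows are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

typedef enum
{
    DEMO_MODIFY_OK,         /* exactly one remote row was affected */
    DEMO_MODIFY_NO_ROW,     /* nothing matched */
    DEMO_MODIFY_TOO_MANY,   /* more than one row matched: must raise an error */
    DEMO_MODIFY_BAD_COUNT   /* count string missing or malformed */
} DemoModifyStatus;

/*
 * Classify the affected-row count of a remote UPDATE/DELETE that was
 * meant to hit a single row.
 */
static DemoModifyStatus
demo_check_affected_rows(const char *cmd_tuples)
{
    char *end;
    long  n;

    assert(cmd_tuples != NULL);
    errno = 0;
    n = strtol(cmd_tuples, &end, 10);
    if (errno != 0 || end == cmd_tuples || *end != '\0' || n < 0)
        return DEMO_MODIFY_BAD_COUNT;
    if (n == 0)
        return DEMO_MODIFY_NO_ROW;
    if (n > 1)
        return DEMO_MODIFY_TOO_MANY;
    return DEMO_MODIFY_OK;
}
```

In the scenario from the start of this thread, the remote UPDATE would report two affected rows, so a caller would see DEMO_MODIFY_TOO_MANY and could abort the remote transaction instead of silently changing both partitions.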

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: Problem while updating a foreign table pointing to a partitioned table on foreign server

From
Kyotaro HORIGUCHI
Date:
Hello.

At Fri, 1 Jun 2018 10:21:39 -0400, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote in
<CAFjFpRdraYcQnD4tKzNuP1uP6L-gnizi4HLU_UA=28Q2M4zoDA@mail.gmail.com>
> I am not suggesting committing 0003 from my patch set, only 0001 and
> 0002, which just raise an error when multiple rows get updated where
> only one row is expected to be updated.

I reconsidered Tom's suggestion and found a way to fix this
problem without changing the FDW API.


To make use of PARAM_EXECs here, the attached PoC patch does the
following things. No changes on the core side.

- postgresAddForeignUpdateTargets is no longer useful, so it is
  no longer registered in the FDW routine in the attached patch.

- postgresGetForeignRelSize registers the table oid and ctid columns
  in attrs_used and in a new member, param_attrs, on updates.

- postgresGetForeignPlan assigns two PARAM_EXECs for the two
  values, then remembers the paramids in fdw_private.

- postgresPlanForeignModify searches for the parameters and
  remembers their paramids.


After that, doing the following things fixes the issue.

- make_tuple_from_result_row receives the remote table oid and
  stores it in the returned tuples.

- postgresIterateForeignScan stores the values into the remembered
  parameters.

- postgresExecForeignUpdate/Delete read the parameters and
  use them to identify the remote target rows accurately.
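The effect of the last step on the remote statement can be sketched as follows: the first two parameters carry the remote tableoid and ctid, and the SET list starts numbering at $3. This is a simplified illustration rather than the patch's deparseUpdateSql. demo_deparse_update is a hypothetical helper, and it does no identifier quoting, which real deparsing code must do.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "UPDATE rel SET c1 = $3, ... WHERE tableoid = $1 AND ctid = $2".
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int
demo_deparse_update(char *buf, size_t buflen, const char *relname,
                    const char **cols, int ncols)
{
    size_t used;
    int    n;
    int    i;
    int    pindex = 3;      /* $1 is tableoid and $2 is ctid */

    assert(buf != NULL && ncols > 0 && strlen(relname) > 0);

    n = snprintf(buf, buflen, "UPDATE %s SET ", relname);
    if (n < 0 || (size_t) n >= buflen)
        return -1;
    used = (size_t) n;

    for (i = 0; i < ncols; i++)
    {
        n = snprintf(buf + used, buflen - used, "%s%s = $%d",
                     i > 0 ? ", " : "", cols[i], pindex++);
        if (n < 0 || (size_t) n >= buflen - used)
            return -1;
        used += (size_t) n;
    }

    n = snprintf(buf + used, buflen - used,
                 " WHERE tableoid = $1 AND ctid = $2");
    if (n < 0 || (size_t) n >= buflen - used)
        return -1;
    return 0;
}
```

For the example at the top of the thread, calling it with relation "public.plt" and the single column "b" yields a statement whose WHERE clause pins down one partition and one row, instead of matching every partition that happens to hold a row at the same ctid.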


It fails in some join-pushdown cases, since it doesn't add the tid
columns to the join tlist.  I suppose that build_tlist_to_deparse
needs some change, but I'll look into it further tomorrow.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index d272719ff4..503e705c4c 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -1107,11 +1107,17 @@ deparseTargetList(StringInfo buf,
                   bool qualify_col,
                   List **retrieved_attrs)
 {
+    static int    check_attrs[4];
+    static char *check_attr_names[] = {"ctid", "oid", "tableoid"};
     TupleDesc    tupdesc = RelationGetDescr(rel);
     bool        have_wholerow;
     bool        first;
     int            i;
 
+    check_attrs[0] = SelfItemPointerAttributeNumber;
+    check_attrs[1] = ObjectIdAttributeNumber;
+    check_attrs[2] = TableOidAttributeNumber;
+    check_attrs[3] = FirstLowInvalidHeapAttributeNumber;
     *retrieved_attrs = NIL;
 
     /* If there's a whole-row reference, we'll need all the columns. */
@@ -1143,41 +1149,27 @@ deparseTargetList(StringInfo buf,
         }
     }
 
-    /*
-     * Add ctid and oid if needed.  We currently don't support retrieving any
-     * other system columns.
-     */
-    if (bms_is_member(SelfItemPointerAttributeNumber - FirstLowInvalidHeapAttributeNumber,
-                      attrs_used))
+    for (i = 0 ; check_attrs[i] != FirstLowInvalidHeapAttributeNumber ; i++)
     {
-        if (!first)
-            appendStringInfoString(buf, ", ");
-        else if (is_returning)
-            appendStringInfoString(buf, " RETURNING ");
-        first = false;
+        int    attr = check_attrs[i];
+        char *attr_name = check_attr_names[i];
 
-        if (qualify_col)
-            ADD_REL_QUALIFIER(buf, rtindex);
-        appendStringInfoString(buf, "ctid");
+        /* Add system columns if needed. */
+        if (bms_is_member(attr - FirstLowInvalidHeapAttributeNumber,
+                          attrs_used))
+        {
+            if (!first)
+                appendStringInfoString(buf, ", ");
+            else if (is_returning)
+                appendStringInfoString(buf, " RETURNING ");
+            first = false;
 
-        *retrieved_attrs = lappend_int(*retrieved_attrs,
-                                       SelfItemPointerAttributeNumber);
-    }
-    if (bms_is_member(ObjectIdAttributeNumber - FirstLowInvalidHeapAttributeNumber,
-                      attrs_used))
-    {
-        if (!first)
-            appendStringInfoString(buf, ", ");
-        else if (is_returning)
-            appendStringInfoString(buf, " RETURNING ");
-        first = false;
+            if (qualify_col)
+                ADD_REL_QUALIFIER(buf, rtindex);
+            appendStringInfoString(buf, attr_name);
 
-        if (qualify_col)
-            ADD_REL_QUALIFIER(buf, rtindex);
-        appendStringInfoString(buf, "oid");
-
-        *retrieved_attrs = lappend_int(*retrieved_attrs,
-                                       ObjectIdAttributeNumber);
+            *retrieved_attrs = lappend_int(*retrieved_attrs, attr);
+        }
     }
 
     /* Don't generate bad syntax if no undropped columns */
@@ -1725,7 +1717,7 @@ deparseUpdateSql(StringInfo buf, RangeTblEntry *rte,
     deparseRelation(buf, rel);
     appendStringInfoString(buf, " SET ");
 
-    pindex = 2;                    /* ctid is always the first param */
+    pindex = 3;            /* tableoid and ctid are always the first two params */
     first = true;
     foreach(lc, targetAttrs)
     {
@@ -1739,7 +1731,7 @@ deparseUpdateSql(StringInfo buf, RangeTblEntry *rte,
         appendStringInfo(buf, " = $%d", pindex);
         pindex++;
     }
-    appendStringInfoString(buf, " WHERE ctid = $1");
+    appendStringInfoString(buf, " WHERE tableoid = $1 AND ctid = $2");
 
     deparseReturningList(buf, rte, rtindex, rel,
                          rel->trigdesc && rel->trigdesc->trig_update_after_row,
@@ -1855,7 +1847,7 @@ deparseDeleteSql(StringInfo buf, RangeTblEntry *rte,
 {
     appendStringInfoString(buf, "DELETE FROM ");
     deparseRelation(buf, rel);
-    appendStringInfoString(buf, " WHERE ctid = $1");
+    appendStringInfoString(buf, " WHERE tableoid = $1 AND ctid = $2");
 
     deparseReturningList(buf, rte, rtindex, rel,
                          rel->trigdesc && rel->trigdesc->trig_delete_after_row,
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 78b0f43ca8..7557d9add7 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -73,7 +73,9 @@ enum FdwScanPrivateIndex
      * String describing join i.e. names of relations being joined and types
      * of join, added when the scan is join
      */
-    FdwScanPrivateRelations
+    FdwScanPrivateRelations,
+
+    FdwScanTupleIdParamIds
 };
 
 /*
@@ -95,7 +97,8 @@ enum FdwModifyPrivateIndex
     /* has-returning flag (as an integer Value node) */
     FdwModifyPrivateHasReturning,
     /* Integer list of attribute numbers retrieved by RETURNING */
-    FdwModifyPrivateRetrievedAttrs
+    FdwModifyPrivateRetrievedAttrs,
+    FdwModifyPrivateTidParams
 };
 
 /*
@@ -156,6 +159,8 @@ typedef struct PgFdwScanState
     MemoryContext temp_cxt;        /* context for per-tuple temporary data */
 
     int            fetch_size;        /* number of tuples per fetch */
+
+    int           *tid_params;
 } PgFdwScanState;
 
 /*
@@ -178,6 +183,7 @@ typedef struct PgFdwModifyState
 
     /* info about parameters for prepared statement */
     AttrNumber    ctidAttno;        /* attnum of input resjunk ctid column */
+    int            *tid_params;
     int            p_nums;            /* number of parameters to transmit */
     FmgrInfo   *p_flinfo;        /* output conversion functions for them */
 
@@ -293,9 +299,6 @@ static void postgresBeginForeignScan(ForeignScanState *node, int eflags);
 static TupleTableSlot *postgresIterateForeignScan(ForeignScanState *node);
 static void postgresReScanForeignScan(ForeignScanState *node);
 static void postgresEndForeignScan(ForeignScanState *node);
-static void postgresAddForeignUpdateTargets(Query *parsetree,
-                                RangeTblEntry *target_rte,
-                                Relation target_relation);
 static List *postgresPlanForeignModify(PlannerInfo *root,
                           ModifyTable *plan,
                           Index resultRelation,
@@ -388,9 +391,11 @@ static PgFdwModifyState *create_foreign_modify(EState *estate,
                       char *query,
                       List *target_attrs,
                       bool has_returning,
-                      List *retrieved_attrs);
+                      List *retrieved_attrs,
+                      int *tid_params);
 static void prepare_foreign_modify(PgFdwModifyState *fmstate);
 static const char **convert_prep_stmt_params(PgFdwModifyState *fmstate,
+                         Oid tableoid,
                          ItemPointer tupleid,
                          TupleTableSlot *slot);
 static void store_returning_result(PgFdwModifyState *fmstate,
@@ -471,7 +476,7 @@ postgres_fdw_handler(PG_FUNCTION_ARGS)
     routine->EndForeignScan = postgresEndForeignScan;
 
     /* Functions for updating foreign tables */
-    routine->AddForeignUpdateTargets = postgresAddForeignUpdateTargets;
+    routine->AddForeignUpdateTargets = NULL;
     routine->PlanForeignModify = postgresPlanForeignModify;
     routine->BeginForeignModify = postgresBeginForeignModify;
     routine->ExecForeignInsert = postgresExecForeignInsert;
@@ -595,6 +600,26 @@ postgresGetForeignRelSize(PlannerInfo *root,
                        &fpinfo->attrs_used);
     }
 
+    /*
+     * ctid and tableoid are required for UPDATE and DELETE.
+     */
+    if (root->parse->commandType == CMD_UPDATE ||
+        root->parse->commandType == CMD_DELETE)
+    {
+        fpinfo->param_attrs =
+            bms_add_member(fpinfo->param_attrs,
+                           SelfItemPointerAttributeNumber -
+                           FirstLowInvalidHeapAttributeNumber);
+
+        fpinfo->param_attrs =
+            bms_add_member(fpinfo->param_attrs,
+                           TableOidAttributeNumber -
+                           FirstLowInvalidHeapAttributeNumber);
+
+        fpinfo->attrs_used =
+            bms_add_members(fpinfo->attrs_used, fpinfo->param_attrs);
+    }
+
     /*
      * Compute the selectivity and cost of the local_conds, so we don't have
      * to do it over again for each path.  The best we can do for these
@@ -1116,6 +1141,61 @@ postgresGetForeignPaths(PlannerInfo *root,
     }
 }
 
+/*
+ * Select a PARAM_EXEC number to identify the given Var as a parameter for
+ * the current subquery, or for a nestloop's inner scan.
+ * If the Var already has a param in the current context, return that one.
+ */
+static int
+assign_param_for_var(PlannerInfo *root, Var *var)
+{
+    ListCell   *ppl;
+    PlannerParamItem *pitem;
+    Index        levelsup;
+
+    /* Find the query level the Var belongs to */
+    for (levelsup = var->varlevelsup; levelsup > 0; levelsup--)
+        root = root->parent_root;
+
+    /* If there's already a matching PlannerParamItem there, just use it */
+    foreach(ppl, root->plan_params)
+    {
+        pitem = (PlannerParamItem *) lfirst(ppl);
+        if (IsA(pitem->item, Var))
+        {
+            Var           *pvar = (Var *) pitem->item;
+
+            /*
+             * This comparison must match _equalVar(), except for ignoring
+             * varlevelsup.  Note that _equalVar() ignores the location.
+             */
+            if (pvar->varno == var->varno &&
+                pvar->varattno == var->varattno &&
+                pvar->vartype == var->vartype &&
+                pvar->vartypmod == var->vartypmod &&
+                pvar->varcollid == var->varcollid &&
+                pvar->varnoold == var->varnoold &&
+                pvar->varoattno == var->varoattno)
+                return pitem->paramId;
+        }
+    }
+
+    /* Nope, so make a new one */
+    var = copyObject(var);
+    var->varlevelsup = 0;
+
+    pitem = makeNode(PlannerParamItem);
+    pitem->item = (Node *) var;
+    pitem->paramId = list_length(root->glob->paramExecTypes);
+    root->glob->paramExecTypes = lappend_oid(root->glob->paramExecTypes,
+                                             var->vartype);
+
+    root->plan_params = lappend(root->plan_params, pitem);
+
+    return pitem->paramId;
+}
+
+
 /*
  * postgresGetForeignPlan
  *        Create ForeignScan plan node which implements selected best path
@@ -1287,6 +1367,32 @@ postgresGetForeignPlan(PlannerInfo *root,
     if (IS_JOIN_REL(foreignrel) || IS_UPPER_REL(foreignrel))
         fdw_private = lappend(fdw_private,
                               makeString(fpinfo->relation_name->data));
+    if (!bms_is_empty(fpinfo->param_attrs))
+    {
+        int *paramids = palloc(sizeof(int) * 2);
+        Var    *v;
+
+        if (list_length(fdw_private) == 3)
+            fdw_private = lappend(fdw_private, makeString(""));
+
+        v = makeNode(Var);
+        v->varno = foreignrel->relid;
+        v->vartype = OIDOID;
+        v->vartypmod = -1;
+        v->varcollid = InvalidOid;
+        v->varattno = TableOidAttributeNumber;
+        paramids[0] = assign_param_for_var(root, v);
+
+        v = makeNode(Var);
+        v->varno = foreignrel->relid;
+        v->vartype = TIDOID;
+        v->vartypmod = -1;
+        v->varcollid = InvalidOid;
+        v->varattno = SelfItemPointerAttributeNumber;
+        paramids[1] = assign_param_for_var(root, v);
+
+        fdw_private = lappend(fdw_private, paramids);
+    }
 
     /*
      * Create the ForeignScan node for the given relation.
@@ -1368,6 +1474,9 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
                                                  FdwScanPrivateRetrievedAttrs);
     fsstate->fetch_size = intVal(list_nth(fsplan->fdw_private,
                                           FdwScanPrivateFetchSize));
+    if (list_length(fsplan->fdw_private) > FdwScanTupleIdParamIds)
+        fsstate->tid_params =
+            (int *) list_nth(fsplan->fdw_private, FdwScanTupleIdParamIds);
 
     /* Create contexts for batches of tuples and per-tuple temp workspace. */
     fsstate->batch_cxt = AllocSetContextCreate(estate->es_query_cxt,
@@ -1418,6 +1527,8 @@ postgresIterateForeignScan(ForeignScanState *node)
 {
     PgFdwScanState *fsstate = (PgFdwScanState *) node->fdw_state;
     TupleTableSlot *slot = node->ss.ss_ScanTupleSlot;
+    EState *estate = node->ss.ps.state;
+    HeapTuple        tup;
 
     /*
      * If this is the first call after Begin or ReScan, we need to create the
@@ -1439,10 +1550,28 @@ postgresIterateForeignScan(ForeignScanState *node)
             return ExecClearTuple(slot);
     }
 
+    tup = fsstate->tuples[fsstate->next_tuple++];
+    if (fsstate->tid_params != NULL)
+    {
+        ParamExecData *prm;
+        ItemPointer      itemp;
+
+        /* set toid */
+        prm = &(estate->es_param_exec_vals[fsstate->tid_params[0]]);
+        prm->value = ObjectIdGetDatum(tup->t_tableOid);
+        /* set ctid */
+        prm = &(estate->es_param_exec_vals[fsstate->tid_params[1]]);
+        itemp = (ItemPointer) palloc(sizeof(ItemPointerData));
+        ItemPointerSet(itemp,
+                       ItemPointerGetBlockNumberNoCheck(&tup->t_self),
+                       ItemPointerGetOffsetNumberNoCheck(&tup->t_self));
+        prm->value = PointerGetDatum(itemp);
+    }
+    
     /*
      * Return the next tuple.
      */
-    ExecStoreTuple(fsstate->tuples[fsstate->next_tuple++],
+    ExecStoreTuple(tup,
                    slot,
                    InvalidBuffer,
                    false);
@@ -1530,41 +1659,41 @@ postgresEndForeignScan(ForeignScanState *node)
     /* MemoryContexts will be deleted automatically. */
 }
 
-/*
- * postgresAddForeignUpdateTargets
- *        Add resjunk column(s) needed for update/delete on a foreign table
- */
-static void
-postgresAddForeignUpdateTargets(Query *parsetree,
-                                RangeTblEntry *target_rte,
-                                Relation target_relation)
+static int
+find_param_for_var(PlannerInfo *root, Var *var)
 {
-    Var           *var;
-    const char *attrname;
-    TargetEntry *tle;
+    ListCell   *ppl;
+    PlannerParamItem *pitem;
+    Index        levelsup;
 
-    /*
-     * In postgres_fdw, what we need is the ctid, same as for a regular table.
-     */
+    /* Find the query level the Var belongs to */
+    for (levelsup = var->varlevelsup; levelsup > 0; levelsup--)
+        root = root->parent_root;
 
-    /* Make a Var representing the desired value */
-    var = makeVar(parsetree->resultRelation,
-                  SelfItemPointerAttributeNumber,
-                  TIDOID,
-                  -1,
-                  InvalidOid,
-                  0);
+    /* If there's already a matching PlannerParamItem there, just use it */
+    foreach(ppl, root->plan_params)
+    {
+        pitem = (PlannerParamItem *) lfirst(ppl);
+        if (IsA(pitem->item, Var))
+        {
+            Var           *pvar = (Var *) pitem->item;
 
-    /* Wrap it in a resjunk TLE with the right name ... */
-    attrname = "ctid";
+            /*
+             * This comparison must match _equalVar(), except for ignoring
+             * varlevelsup.  Note that _equalVar() ignores the location.
+             */
+            if (pvar->varno == var->varno &&
+                pvar->varattno == var->varattno &&
+                pvar->vartype == var->vartype &&
+                pvar->vartypmod == var->vartypmod &&
+                pvar->varcollid == var->varcollid &&
+                pvar->varnoold == var->varnoold &&
+                pvar->varoattno == var->varoattno)
+                return pitem->paramId;
+        }
+    }
 
-    tle = makeTargetEntry((Expr *) var,
-                          list_length(parsetree->targetList) + 1,
-                          pstrdup(attrname),
-                          true);
-
-    /* ... and add it to the query's targetlist */
-    parsetree->targetList = lappend(parsetree->targetList, tle);
+    return -1;
 }
 
 /*
@@ -1585,6 +1714,7 @@ postgresPlanForeignModify(PlannerInfo *root,
     List       *returningList = NIL;
     List       *retrieved_attrs = NIL;
     bool        doNothing = false;
+    int *paramids = NULL;
 
     initStringInfo(&sql);
 
@@ -1630,6 +1760,28 @@ postgresPlanForeignModify(PlannerInfo *root,
         }
     }
 
+    if (operation == CMD_UPDATE || operation == CMD_DELETE)
+    {
+        Var    *v;
+
+        paramids = palloc(sizeof(int) * 2);
+        v = makeNode(Var);
+        v->varno = resultRelation;
+        v->vartype = OIDOID;
+        v->vartypmod = -1;
+        v->varcollid = InvalidOid;
+        v->varattno = TableOidAttributeNumber;
+        paramids[0] = find_param_for_var(root, v);
+        if (paramids[0] < 0)
+            elog(ERROR, "ERROR 1");
+
+        v->vartype = TIDOID;
+        v->varattno = SelfItemPointerAttributeNumber;
+        paramids[1] = find_param_for_var(root, v);
+        if (paramids[1] < 0)
+            elog(ERROR, "ERROR 2");
+    }
+
     /*
      * Extract the relevant RETURNING list if any.
      */
@@ -1679,10 +1831,11 @@ postgresPlanForeignModify(PlannerInfo *root,
      * Build the fdw_private list that will be available to the executor.
      * Items in the list must match enum FdwModifyPrivateIndex, above.
      */
-    return list_make4(makeString(sql.data),
+    return list_make5(makeString(sql.data),
                       targetAttrs,
                       makeInteger((retrieved_attrs != NIL)),
-                      retrieved_attrs);
+                      retrieved_attrs,
+                      paramids);
 }
 
 /*
@@ -1702,6 +1855,7 @@ postgresBeginForeignModify(ModifyTableState *mtstate,
     bool        has_returning;
     List       *retrieved_attrs;
     RangeTblEntry *rte;
+    int           *tid_params;
 
     /*
      * Do nothing in EXPLAIN (no ANALYZE) case.  resultRelInfo->ri_FdwState
@@ -1719,6 +1873,7 @@ postgresBeginForeignModify(ModifyTableState *mtstate,
                                     FdwModifyPrivateHasReturning));
     retrieved_attrs = (List *) list_nth(fdw_private,
                                         FdwModifyPrivateRetrievedAttrs);
+    tid_params = (int *) list_nth(fdw_private, FdwModifyPrivateTidParams);
 
     /* Find RTE. */
     rte = rt_fetch(resultRelInfo->ri_RangeTableIndex,
@@ -1733,7 +1888,8 @@ postgresBeginForeignModify(ModifyTableState *mtstate,
                                     query,
                                     target_attrs,
                                     has_returning,
-                                    retrieved_attrs);
+                                    retrieved_attrs,
+                                    tid_params);
 
     resultRelInfo->ri_FdwState = fmstate;
 }
@@ -1758,7 +1914,7 @@ postgresExecForeignInsert(EState *estate,
         prepare_foreign_modify(fmstate);
 
     /* Convert parameters needed by prepared statement to text form */
-    p_values = convert_prep_stmt_params(fmstate, NULL, slot);
+    p_values = convert_prep_stmt_params(fmstate, InvalidOid, NULL, slot);
 
     /*
      * Execute the prepared statement.
@@ -1813,28 +1969,31 @@ postgresExecForeignUpdate(EState *estate,
                           TupleTableSlot *planSlot)
 {
     PgFdwModifyState *fmstate = (PgFdwModifyState *) resultRelInfo->ri_FdwState;
-    Datum        datum;
-    bool        isNull;
+    Datum        toiddatum, ctiddatum;
     const char **p_values;
     PGresult   *res;
     int            n_rows;
+    int        *tid_params = fmstate->tid_params;
+    ParamExecData *prm;
 
     /* Set up the prepared statement on the remote server, if we didn't yet */
     if (!fmstate->p_name)
         prepare_foreign_modify(fmstate);
 
+    Assert(tid_params);
+    /* Get the tableoid that was passed up as an exec param */
+    prm = &(estate->es_param_exec_vals[tid_params[0]]);
+    toiddatum = prm->value;
+
     /* Get the ctid that was passed up as a resjunk column */
-    datum = ExecGetJunkAttribute(planSlot,
-                                 fmstate->ctidAttno,
-                                 &isNull);
-    /* shouldn't ever get a null result... */
-    if (isNull)
-        elog(ERROR, "ctid is NULL");
+    prm = &(estate->es_param_exec_vals[tid_params[1]]);
+    ctiddatum = prm->value;
 
     /* Convert parameters needed by prepared statement to text form */
     p_values = convert_prep_stmt_params(fmstate,
-                                        (ItemPointer) DatumGetPointer(datum),
-                                        slot);
+                                    DatumGetObjectId(toiddatum),
+                                    (ItemPointer) DatumGetPointer(ctiddatum),
+                                    slot);
 
     /*
      * Execute the prepared statement.
@@ -1889,28 +2048,32 @@ postgresExecForeignDelete(EState *estate,
                           TupleTableSlot *planSlot)
 {
     PgFdwModifyState *fmstate = (PgFdwModifyState *) resultRelInfo->ri_FdwState;
-    Datum        datum;
-    bool        isNull;
+    Datum        toiddatum, ctiddatum;
     const char **p_values;
     PGresult   *res;
     int            n_rows;
+    int        *tid_params = fmstate->tid_params;
+    ParamExecData *prm;
 
     /* Set up the prepared statement on the remote server, if we didn't yet */
     if (!fmstate->p_name)
         prepare_foreign_modify(fmstate);
 
+    Assert(tid_params);
+
+    /* Get the tableoid that was passed up as an exec param */
+    prm = &(estate->es_param_exec_vals[tid_params[0]]);
+    toiddatum = prm->value;
+
     /* Get the ctid that was passed up as a resjunk column */
-    datum = ExecGetJunkAttribute(planSlot,
-                                 fmstate->ctidAttno,
-                                 &isNull);
-    /* shouldn't ever get a null result... */
-    if (isNull)
-        elog(ERROR, "ctid is NULL");
+    prm = &(estate->es_param_exec_vals[tid_params[1]]);
+    ctiddatum = prm->value;
 
     /* Convert parameters needed by prepared statement to text form */
     p_values = convert_prep_stmt_params(fmstate,
-                                        (ItemPointer) DatumGetPointer(datum),
-                                        NULL);
+                                    DatumGetObjectId(toiddatum),
+                                    (ItemPointer) DatumGetPointer(ctiddatum),
+                                    NULL);
 
     /*
      * Execute the prepared statement.
@@ -2058,7 +2221,8 @@ postgresBeginForeignInsert(ModifyTableState *mtstate,
                                     sql.data,
                                     targetAttrs,
                                     retrieved_attrs != NIL,
-                                    retrieved_attrs);
+                                    retrieved_attrs,
+                                    NULL);
 
     resultRelInfo->ri_FdwState = fmstate;
 }
@@ -3286,7 +3450,8 @@ create_foreign_modify(EState *estate,
                       char *query,
                       List *target_attrs,
                       bool has_returning,
-                      List *retrieved_attrs)
+                      List *retrieved_attrs,
+                      int *tid_params)
 {
     PgFdwModifyState *fmstate;
     Relation    rel = resultRelInfo->ri_RelationDesc;
@@ -3333,7 +3498,7 @@ create_foreign_modify(EState *estate,
         fmstate->attinmeta = TupleDescGetAttInMetadata(tupdesc);
 
     /* Prepare for output conversion of parameters used in prepared stmt. */
-    n_params = list_length(fmstate->target_attrs) + 1;
+    n_params = list_length(fmstate->target_attrs) + 2;
     fmstate->p_flinfo = (FmgrInfo *) palloc0(sizeof(FmgrInfo) * n_params);
     fmstate->p_nums = 0;
 
@@ -3341,13 +3506,14 @@ create_foreign_modify(EState *estate,
     {
         Assert(subplan != NULL);
 
-        /* Find the ctid resjunk column in the subplan's result */
-        fmstate->ctidAttno = ExecFindJunkAttributeInTlist(subplan->targetlist,
-                                                          "ctid");
-        if (!AttributeNumberIsValid(fmstate->ctidAttno))
-            elog(ERROR, "could not find junk ctid column");
+        fmstate->tid_params = tid_params;
 
-        /* First transmittable parameter will be ctid */
+        /* First transmittable parameter will be table oid */
+        getTypeOutputInfo(OIDOID, &typefnoid, &isvarlena);
+        fmgr_info(typefnoid, &fmstate->p_flinfo[fmstate->p_nums]);
+        fmstate->p_nums++;
+
+        /* Second transmittable parameter will be ctid */
         getTypeOutputInfo(TIDOID, &typefnoid, &isvarlena);
         fmgr_info(typefnoid, &fmstate->p_flinfo[fmstate->p_nums]);
         fmstate->p_nums++;
@@ -3430,6 +3596,7 @@ prepare_foreign_modify(PgFdwModifyState *fmstate)
  */
 static const char **
 convert_prep_stmt_params(PgFdwModifyState *fmstate,
+                         Oid tableoid,
                          ItemPointer tupleid,
                          TupleTableSlot *slot)
 {
@@ -3441,10 +3608,13 @@ convert_prep_stmt_params(PgFdwModifyState *fmstate,
 
     p_values = (const char **) palloc(sizeof(char *) * fmstate->p_nums);
 
-    /* 1st parameter should be ctid, if it's in use */
-    if (tupleid != NULL)
+    /* First two parameters should be tableoid and ctid, if they're in use */
+    if (tableoid != InvalidOid)
     {
         /* don't need set_transmission_modes for TID output */
+        p_values[pindex] = OutputFunctionCall(&fmstate->p_flinfo[pindex],
+                                              ObjectIdGetDatum(tableoid));
+        pindex++;
         p_values[pindex] = OutputFunctionCall(&fmstate->p_flinfo[pindex],
                                               PointerGetDatum(tupleid));
         pindex++;
@@ -5549,6 +5719,7 @@ make_tuple_from_result_row(PGresult *res,
     bool       *nulls;
     ItemPointer ctid = NULL;
     Oid            oid = InvalidOid;
+    Oid            toid = InvalidOid;
     ConversionLocation errpos;
     ErrorContextCallback errcallback;
     MemoryContext oldcontext;
@@ -5642,6 +5813,17 @@ make_tuple_from_result_row(PGresult *res,
                 oid = DatumGetObjectId(datum);
             }
         }
+        else if (i == TableOidAttributeNumber)
+        {
+            /* table oid */
+            if (valstr != NULL)
+            {
+                Datum        datum;
+
+                datum = DirectFunctionCall1(oidin, CStringGetDatum(valstr));
+                toid = DatumGetObjectId(datum);
+            }
+        }
         errpos.cur_attno = 0;
 
         j++;
@@ -5691,6 +5873,9 @@ make_tuple_from_result_row(PGresult *res,
     if (OidIsValid(oid))
         HeapTupleSetOid(tuple, oid);
 
+    if (OidIsValid(toid))
+        tuple->t_tableOid = toid;
+
     /* Clean up */
     MemoryContextReset(temp_context);
 
diff --git a/contrib/postgres_fdw/postgres_fdw.h b/contrib/postgres_fdw/postgres_fdw.h
index a5d4011e8d..39e5581125 100644
--- a/contrib/postgres_fdw/postgres_fdw.h
+++ b/contrib/postgres_fdw/postgres_fdw.h
@@ -108,6 +108,8 @@ typedef struct PgFdwRelationInfo
      * representing the relation.
      */
     int            relation_index;
+
+    Bitmapset  *param_attrs;            /* attrs required for modification */
 } PgFdwRelationInfo;
 
 /* in postgres_fdw.c */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index da7f52cab0..60a6fa849d 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -1193,6 +1193,8 @@ typedef struct ScanState
     Relation    ss_currentRelation;
     HeapScanDesc ss_currentScanDesc;
     TupleTableSlot *ss_ScanTupleSlot;
+    int            ntuple_infos;
+    Datum          tuple_info[];
 } ScanState;
 
 /* ----------------

Re: Problem while updating a foreign table pointing to a partitioned table on foreign server

From
Kyotaro HORIGUCHI
Date:
At Mon, 04 Jun 2018 20:58:28 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote in
<20180604.205828.208262556.horiguchi.kyotaro@lab.ntt.co.jp>
> Hello.
> 
> At Fri, 1 Jun 2018 10:21:39 -0400, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote in
<CAFjFpRdraYcQnD4tKzNuP1uP6L-gnizi4HLU_UA=28Q2M4zoDA@mail.gmail.com>
> > I am not suggesting to commit 0003 in my patch set, but just 0001 and
> > 0002 which just raise an error when multiple rows get updated when
> > only one row is expected to be updated.
> 
> I reconsidered Tom's suggestion and found a way to fix this
> problem avoiding FDW-API change.

The patch I just sent contains changes to execnodes.h, which are
useless.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
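
For readers following the thread, the effect of the changed WHERE clause in the patch below can be illustrated outside PostgreSQL. This is a minimal C sketch, not code from the patch: `RemoteRowId`, `sample_rows`, both counting functions, and the OIDs 16385 and 16388 are hypothetical names and values chosen only for illustration. It models the two rows from the original report, which live in different partitions but share ctid (0,1), and counts how many rows each qualification would match.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a remote row's identity; not a PostgreSQL type. */
typedef struct RemoteRowId
{
    uint32_t    tableoid;   /* OID of the partition that stores the row */
    uint32_t    block;      /* block number part of ctid */
    uint16_t    offset;     /* offset number part of ctid */
} RemoteRowId;

/*
 * Two rows as in the report: one in each partition, both at ctid (0,1).
 * The OIDs are made up for this example.
 */
const RemoteRowId sample_rows[] = {
    {16385, 0, 1},          /* row stored in plt_p1 */
    {16388, 0, 1},          /* row stored in plt_p2 */
};

/* Count the rows that "WHERE ctid = $1" would match. */
size_t
count_ctid_matches(const RemoteRowId *rows, size_t nrows,
                   uint32_t block, uint16_t offset)
{
    size_t      n = 0;

    assert(rows != NULL || nrows == 0);
    for (size_t i = 0; i < nrows; i++)
    {
        if (rows[i].block == block && rows[i].offset == offset)
            n++;
    }
    return n;
}

/* Count the rows that "WHERE tableoid = $1 AND ctid = $2" would match. */
size_t
count_tableoid_ctid_matches(const RemoteRowId *rows, size_t nrows,
                            uint32_t tableoid,
                            uint32_t block, uint16_t offset)
{
    size_t      n = 0;

    assert(rows != NULL || nrows == 0);
    for (size_t i = 0; i < nrows; i++)
    {
        if (rows[i].tableoid == tableoid &&
            rows[i].block == block &&
            rows[i].offset == offset)
            n++;
    }
    return n;
}
```

With `sample_rows`, `count_ctid_matches(sample_rows, 2, 0, 1)` returns 2, which is the double update reported at the top of the thread, while `count_tableoid_ctid_matches(sample_rows, 2, 16385, 0, 1)` returns 1.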


diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index d272719ff4..503e705c4c 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -1107,11 +1107,17 @@ deparseTargetList(StringInfo buf,
                   bool qualify_col,
                   List **retrieved_attrs)
 {
+    static int    check_attrs[4];
+    static char *check_attr_names[] = {"ctid", "oid", "tableoid"};
     TupleDesc    tupdesc = RelationGetDescr(rel);
     bool        have_wholerow;
     bool        first;
     int            i;
 
+    check_attrs[0] = SelfItemPointerAttributeNumber;
+    check_attrs[1] = ObjectIdAttributeNumber;
+    check_attrs[2] = TableOidAttributeNumber;
+    check_attrs[3] = FirstLowInvalidHeapAttributeNumber;
     *retrieved_attrs = NIL;
 
     /* If there's a whole-row reference, we'll need all the columns. */
@@ -1143,41 +1149,27 @@ deparseTargetList(StringInfo buf,
         }
     }
 
-    /*
-     * Add ctid and oid if needed.  We currently don't support retrieving any
-     * other system columns.
-     */
-    if (bms_is_member(SelfItemPointerAttributeNumber - FirstLowInvalidHeapAttributeNumber,
-                      attrs_used))
+    for (i = 0 ; check_attrs[i] != FirstLowInvalidHeapAttributeNumber ; i++)
     {
-        if (!first)
-            appendStringInfoString(buf, ", ");
-        else if (is_returning)
-            appendStringInfoString(buf, " RETURNING ");
-        first = false;
+        int    attr = check_attrs[i];
+        char *attr_name = check_attr_names[i];
 
-        if (qualify_col)
-            ADD_REL_QUALIFIER(buf, rtindex);
-        appendStringInfoString(buf, "ctid");
+        /* Add system columns if needed. */
+        if (bms_is_member(attr - FirstLowInvalidHeapAttributeNumber,
+                          attrs_used))
+        {
+            if (!first)
+                appendStringInfoString(buf, ", ");
+            else if (is_returning)
+                appendStringInfoString(buf, " RETURNING ");
+            first = false;
 
-        *retrieved_attrs = lappend_int(*retrieved_attrs,
-                                       SelfItemPointerAttributeNumber);
-    }
-    if (bms_is_member(ObjectIdAttributeNumber - FirstLowInvalidHeapAttributeNumber,
-                      attrs_used))
-    {
-        if (!first)
-            appendStringInfoString(buf, ", ");
-        else if (is_returning)
-            appendStringInfoString(buf, " RETURNING ");
-        first = false;
+            if (qualify_col)
+                ADD_REL_QUALIFIER(buf, rtindex);
+            appendStringInfoString(buf, attr_name);
 
-        if (qualify_col)
-            ADD_REL_QUALIFIER(buf, rtindex);
-        appendStringInfoString(buf, "oid");
-
-        *retrieved_attrs = lappend_int(*retrieved_attrs,
-                                       ObjectIdAttributeNumber);
+            *retrieved_attrs = lappend_int(*retrieved_attrs, attr);
+        }
     }
 
     /* Don't generate bad syntax if no undropped columns */
@@ -1725,7 +1717,7 @@ deparseUpdateSql(StringInfo buf, RangeTblEntry *rte,
     deparseRelation(buf, rel);
     appendStringInfoString(buf, " SET ");
 
-    pindex = 2;                    /* ctid is always the first param */
+    pindex = 3;            /* tableoid and ctid are always the first two params */
     first = true;
     foreach(lc, targetAttrs)
     {
@@ -1739,7 +1731,7 @@ deparseUpdateSql(StringInfo buf, RangeTblEntry *rte,
         appendStringInfo(buf, " = $%d", pindex);
         pindex++;
     }
-    appendStringInfoString(buf, " WHERE ctid = $1");
+    appendStringInfoString(buf, " WHERE tableoid = $1 AND ctid = $2");
 
     deparseReturningList(buf, rte, rtindex, rel,
                          rel->trigdesc && rel->trigdesc->trig_update_after_row,
@@ -1855,7 +1847,7 @@ deparseDeleteSql(StringInfo buf, RangeTblEntry *rte,
 {
     appendStringInfoString(buf, "DELETE FROM ");
     deparseRelation(buf, rel);
-    appendStringInfoString(buf, " WHERE ctid = $1");
+    appendStringInfoString(buf, " WHERE tableoid = $1 AND ctid = $2");
 
     deparseReturningList(buf, rte, rtindex, rel,
                          rel->trigdesc && rel->trigdesc->trig_delete_after_row,
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 78b0f43ca8..7557d9add7 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -73,7 +73,9 @@ enum FdwScanPrivateIndex
      * String describing join i.e. names of relations being joined and types
      * of join, added when the scan is join
      */
-    FdwScanPrivateRelations
+    FdwScanPrivateRelations,
+
+    FdwScanTupleIdParamIds
 };
 
 /*
@@ -95,7 +97,8 @@ enum FdwModifyPrivateIndex
     /* has-returning flag (as an integer Value node) */
     FdwModifyPrivateHasReturning,
     /* Integer list of attribute numbers retrieved by RETURNING */
-    FdwModifyPrivateRetrievedAttrs
+    FdwModifyPrivateRetrievedAttrs,
+    FdwModifyPrivateTidParams
 };
 
 /*
@@ -156,6 +159,8 @@ typedef struct PgFdwScanState
     MemoryContext temp_cxt;        /* context for per-tuple temporary data */
 
     int            fetch_size;        /* number of tuples per fetch */
+
+    int           *tid_params;
 } PgFdwScanState;
 
 /*
@@ -178,6 +183,7 @@ typedef struct PgFdwModifyState
 
     /* info about parameters for prepared statement */
     AttrNumber    ctidAttno;        /* attnum of input resjunk ctid column */
+    int            *tid_params;
     int            p_nums;            /* number of parameters to transmit */
     FmgrInfo   *p_flinfo;        /* output conversion functions for them */
 
@@ -293,9 +299,6 @@ static void postgresBeginForeignScan(ForeignScanState *node, int eflags);
 static TupleTableSlot *postgresIterateForeignScan(ForeignScanState *node);
 static void postgresReScanForeignScan(ForeignScanState *node);
 static void postgresEndForeignScan(ForeignScanState *node);
-static void postgresAddForeignUpdateTargets(Query *parsetree,
-                                RangeTblEntry *target_rte,
-                                Relation target_relation);
 static List *postgresPlanForeignModify(PlannerInfo *root,
                           ModifyTable *plan,
                           Index resultRelation,
@@ -388,9 +391,11 @@ static PgFdwModifyState *create_foreign_modify(EState *estate,
                       char *query,
                       List *target_attrs,
                       bool has_returning,
-                      List *retrieved_attrs);
+                      List *retrieved_attrs,
+                      int *tid_params);
 static void prepare_foreign_modify(PgFdwModifyState *fmstate);
 static const char **convert_prep_stmt_params(PgFdwModifyState *fmstate,
+                         Oid tableoid,
                          ItemPointer tupleid,
                          TupleTableSlot *slot);
 static void store_returning_result(PgFdwModifyState *fmstate,
@@ -471,7 +476,7 @@ postgres_fdw_handler(PG_FUNCTION_ARGS)
     routine->EndForeignScan = postgresEndForeignScan;
 
     /* Functions for updating foreign tables */
-    routine->AddForeignUpdateTargets = postgresAddForeignUpdateTargets;
+    routine->AddForeignUpdateTargets = NULL;
     routine->PlanForeignModify = postgresPlanForeignModify;
     routine->BeginForeignModify = postgresBeginForeignModify;
     routine->ExecForeignInsert = postgresExecForeignInsert;
@@ -595,6 +600,26 @@ postgresGetForeignRelSize(PlannerInfo *root,
                        &fpinfo->attrs_used);
     }
 
+    /*
+     * ctid and tableoid are required for UPDATE and DELETE.
+     */
+    if (root->parse->commandType == CMD_UPDATE ||
+        root->parse->commandType == CMD_DELETE)
+    {
+        fpinfo->param_attrs =
+            bms_add_member(fpinfo->param_attrs,
+                           SelfItemPointerAttributeNumber -
+                           FirstLowInvalidHeapAttributeNumber);
+
+        fpinfo->param_attrs =
+            bms_add_member(fpinfo->param_attrs,
+                           TableOidAttributeNumber -
+                           FirstLowInvalidHeapAttributeNumber);
+
+        fpinfo->attrs_used =
+            bms_add_members(fpinfo->attrs_used, fpinfo->param_attrs);
+    }
+
     /*
      * Compute the selectivity and cost of the local_conds, so we don't have
      * to do it over again for each path.  The best we can do for these
@@ -1116,6 +1141,61 @@ postgresGetForeignPaths(PlannerInfo *root,
     }
 }
 
+/*
+ * Select a PARAM_EXEC number to identify the given Var as a parameter for
+ * the current subquery, or for a nestloop's inner scan.
+ * If the Var already has a param in the current context, return that one.
+ */
+static int
+assign_param_for_var(PlannerInfo *root, Var *var)
+{
+    ListCell   *ppl;
+    PlannerParamItem *pitem;
+    Index        levelsup;
+
+    /* Find the query level the Var belongs to */
+    for (levelsup = var->varlevelsup; levelsup > 0; levelsup--)
+        root = root->parent_root;
+
+    /* If there's already a matching PlannerParamItem there, just use it */
+    foreach(ppl, root->plan_params)
+    {
+        pitem = (PlannerParamItem *) lfirst(ppl);
+        if (IsA(pitem->item, Var))
+        {
+            Var           *pvar = (Var *) pitem->item;
+
+            /*
+             * This comparison must match _equalVar(), except for ignoring
+             * varlevelsup.  Note that _equalVar() ignores the location.
+             */
+            if (pvar->varno == var->varno &&
+                pvar->varattno == var->varattno &&
+                pvar->vartype == var->vartype &&
+                pvar->vartypmod == var->vartypmod &&
+                pvar->varcollid == var->varcollid &&
+                pvar->varnoold == var->varnoold &&
+                pvar->varoattno == var->varoattno)
+                return pitem->paramId;
+        }
+    }
+
+    /* Nope, so make a new one */
+    var = copyObject(var);
+    var->varlevelsup = 0;
+
+    pitem = makeNode(PlannerParamItem);
+    pitem->item = (Node *) var;
+    pitem->paramId = list_length(root->glob->paramExecTypes);
+    root->glob->paramExecTypes = lappend_oid(root->glob->paramExecTypes,
+                                             var->vartype);
+
+    root->plan_params = lappend(root->plan_params, pitem);
+
+    return pitem->paramId;
+}
+
+
 /*
  * postgresGetForeignPlan
  *        Create ForeignScan plan node which implements selected best path
@@ -1287,6 +1367,32 @@ postgresGetForeignPlan(PlannerInfo *root,
     if (IS_JOIN_REL(foreignrel) || IS_UPPER_REL(foreignrel))
         fdw_private = lappend(fdw_private,
                               makeString(fpinfo->relation_name->data));
+    if (!bms_is_empty(fpinfo->param_attrs))
+    {
+        int *paramids = palloc(sizeof(int) * 2);
+        Var    *v;
+
+        if (list_length(fdw_private) == 3)
+            fdw_private = lappend(fdw_private, makeString(""));
+
+        v = makeNode(Var);
+        v->varno = foreignrel->relid;
+        v->vartype = OIDOID;
+        v->vartypmod = -1;
+        v->varcollid = InvalidOid;
+        v->varattno = TableOidAttributeNumber;
+        paramids[0] = assign_param_for_var(root, v);
+
+        v = makeNode(Var);
+        v->varno = foreignrel->relid;
+        v->vartype = TIDOID;
+        v->vartypmod = -1;
+        v->varcollid = InvalidOid;
+        v->varattno = SelfItemPointerAttributeNumber;
+        paramids[1] = assign_param_for_var(root, v);
+
+        fdw_private = lappend(fdw_private, paramids);
+    }
 
     /*
      * Create the ForeignScan node for the given relation.
@@ -1368,6 +1474,9 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
                                                  FdwScanPrivateRetrievedAttrs);
     fsstate->fetch_size = intVal(list_nth(fsplan->fdw_private,
                                           FdwScanPrivateFetchSize));
+    if (list_length(fsplan->fdw_private) > FdwScanTupleIdParamIds)
+        fsstate->tid_params =
+            (int *) list_nth(fsplan->fdw_private, FdwScanTupleIdParamIds);
 
     /* Create contexts for batches of tuples and per-tuple temp workspace. */
     fsstate->batch_cxt = AllocSetContextCreate(estate->es_query_cxt,
@@ -1418,6 +1527,8 @@ postgresIterateForeignScan(ForeignScanState *node)
 {
     PgFdwScanState *fsstate = (PgFdwScanState *) node->fdw_state;
     TupleTableSlot *slot = node->ss.ss_ScanTupleSlot;
+    EState *estate = node->ss.ps.state;
+    HeapTuple        tup;
 
     /*
      * If this is the first call after Begin or ReScan, we need to create the
@@ -1439,10 +1550,28 @@ postgresIterateForeignScan(ForeignScanState *node)
             return ExecClearTuple(slot);
     }
 
+    tup = fsstate->tuples[fsstate->next_tuple++];
+    if (fsstate->tid_params != NULL)
+    {
+        ParamExecData *prm;
+        ItemPointer      itemp;
+
+        /* set toid */
+        prm = &(estate->es_param_exec_vals[fsstate->tid_params[0]]);
+        prm->value = ObjectIdGetDatum(tup->t_tableOid);
+        /* set ctid */
+        prm = &(estate->es_param_exec_vals[fsstate->tid_params[1]]);
+        itemp = (ItemPointer) palloc(sizeof(ItemPointerData));
+        ItemPointerSet(itemp,
+                       ItemPointerGetBlockNumberNoCheck(&tup->t_self),
+                       ItemPointerGetOffsetNumberNoCheck(&tup->t_self));
+        prm->value = PointerGetDatum(itemp);
+    }
+    
     /*
      * Return the next tuple.
      */
-    ExecStoreTuple(fsstate->tuples[fsstate->next_tuple++],
+    ExecStoreTuple(tup,
                    slot,
                    InvalidBuffer,
                    false);
@@ -1530,41 +1659,41 @@ postgresEndForeignScan(ForeignScanState *node)
     /* MemoryContexts will be deleted automatically. */
 }
 
-/*
- * postgresAddForeignUpdateTargets
- *        Add resjunk column(s) needed for update/delete on a foreign table
- */
-static void
-postgresAddForeignUpdateTargets(Query *parsetree,
-                                RangeTblEntry *target_rte,
-                                Relation target_relation)
+static int
+find_param_for_var(PlannerInfo *root, Var *var)
 {
-    Var           *var;
-    const char *attrname;
-    TargetEntry *tle;
+    ListCell   *ppl;
+    PlannerParamItem *pitem;
+    Index        levelsup;
 
-    /*
-     * In postgres_fdw, what we need is the ctid, same as for a regular table.
-     */
+    /* Find the query level the Var belongs to */
+    for (levelsup = var->varlevelsup; levelsup > 0; levelsup--)
+        root = root->parent_root;
 
-    /* Make a Var representing the desired value */
-    var = makeVar(parsetree->resultRelation,
-                  SelfItemPointerAttributeNumber,
-                  TIDOID,
-                  -1,
-                  InvalidOid,
-                  0);
+    /* If there's already a matching PlannerParamItem there, just use it */
+    foreach(ppl, root->plan_params)
+    {
+        pitem = (PlannerParamItem *) lfirst(ppl);
+        if (IsA(pitem->item, Var))
+        {
+            Var           *pvar = (Var *) pitem->item;
 
-    /* Wrap it in a resjunk TLE with the right name ... */
-    attrname = "ctid";
+            /*
+             * This comparison must match _equalVar(), except for ignoring
+             * varlevelsup.  Note that _equalVar() ignores the location.
+             */
+            if (pvar->varno == var->varno &&
+                pvar->varattno == var->varattno &&
+                pvar->vartype == var->vartype &&
+                pvar->vartypmod == var->vartypmod &&
+                pvar->varcollid == var->varcollid &&
+                pvar->varnoold == var->varnoold &&
+                pvar->varoattno == var->varoattno)
+                return pitem->paramId;
+        }
+    }
 
-    tle = makeTargetEntry((Expr *) var,
-                          list_length(parsetree->targetList) + 1,
-                          pstrdup(attrname),
-                          true);
-
-    /* ... and add it to the query's targetlist */
-    parsetree->targetList = lappend(parsetree->targetList, tle);
+    return -1;
 }
 
 /*
@@ -1585,6 +1714,7 @@ postgresPlanForeignModify(PlannerInfo *root,
     List       *returningList = NIL;
     List       *retrieved_attrs = NIL;
     bool        doNothing = false;
+    int *paramids = NULL;
 
     initStringInfo(&sql);
 
@@ -1630,6 +1760,28 @@ postgresPlanForeignModify(PlannerInfo *root,
         }
     }
 
+    if (operation == CMD_UPDATE || operation == CMD_DELETE)
+    {
+        Var    *v;
+
+        paramids = palloc(sizeof(int) * 2);
+        v = makeNode(Var);
+        v->varno = resultRelation;
+        v->vartype = OIDOID;
+        v->vartypmod = -1;
+        v->varcollid = InvalidOid;
+        v->varattno = TableOidAttributeNumber;
+        paramids[0] = find_param_for_var(root, v);
+        if (paramids[0] < 0)
+            elog(ERROR, "ERROR 1");
+
+        v->vartype = TIDOID;
+        v->varattno = SelfItemPointerAttributeNumber;
+        paramids[1] = find_param_for_var(root, v);
+        if (paramids[1] < 0)
+            elog(ERROR, "ERROR 2");
+    }
+
     /*
      * Extract the relevant RETURNING list if any.
      */
@@ -1679,10 +1831,11 @@ postgresPlanForeignModify(PlannerInfo *root,
      * Build the fdw_private list that will be available to the executor.
      * Items in the list must match enum FdwModifyPrivateIndex, above.
      */
-    return list_make4(makeString(sql.data),
+    return list_make5(makeString(sql.data),
                       targetAttrs,
                       makeInteger((retrieved_attrs != NIL)),
-                      retrieved_attrs);
+                      retrieved_attrs,
+                      paramids);
 }
 
 /*
@@ -1702,6 +1855,7 @@ postgresBeginForeignModify(ModifyTableState *mtstate,
     bool        has_returning;
     List       *retrieved_attrs;
     RangeTblEntry *rte;
+    int           *tid_params;
 
     /*
      * Do nothing in EXPLAIN (no ANALYZE) case.  resultRelInfo->ri_FdwState
@@ -1719,6 +1873,7 @@ postgresBeginForeignModify(ModifyTableState *mtstate,
                                     FdwModifyPrivateHasReturning));
     retrieved_attrs = (List *) list_nth(fdw_private,
                                         FdwModifyPrivateRetrievedAttrs);
+    tid_params = (int *) list_nth(fdw_private, FdwModifyPrivateTidParams);
 
     /* Find RTE. */
     rte = rt_fetch(resultRelInfo->ri_RangeTableIndex,
@@ -1733,7 +1888,8 @@ postgresBeginForeignModify(ModifyTableState *mtstate,
                                     query,
                                     target_attrs,
                                     has_returning,
-                                    retrieved_attrs);
+                                    retrieved_attrs,
+                                    tid_params);
 
     resultRelInfo->ri_FdwState = fmstate;
 }
@@ -1758,7 +1914,7 @@ postgresExecForeignInsert(EState *estate,
         prepare_foreign_modify(fmstate);
 
     /* Convert parameters needed by prepared statement to text form */
-    p_values = convert_prep_stmt_params(fmstate, NULL, slot);
+    p_values = convert_prep_stmt_params(fmstate, InvalidOid, NULL, slot);
 
     /*
      * Execute the prepared statement.
@@ -1813,28 +1969,31 @@ postgresExecForeignUpdate(EState *estate,
                           TupleTableSlot *planSlot)
 {
     PgFdwModifyState *fmstate = (PgFdwModifyState *) resultRelInfo->ri_FdwState;
-    Datum        datum;
-    bool        isNull;
+    Datum        toiddatum, ctiddatum;
     const char **p_values;
     PGresult   *res;
     int            n_rows;
+    int        *tid_params = fmstate->tid_params;
+    ParamExecData *prm;
 
     /* Set up the prepared statement on the remote server, if we didn't yet */
     if (!fmstate->p_name)
         prepare_foreign_modify(fmstate);
 
+    Assert(tid_params);
+    /* Get the tableoid that was passed up as an exec param */
+    prm = &(estate->es_param_exec_vals[tid_params[0]]);
+    toiddatum = prm->value;
+
     /* Get the ctid that was passed up as a resjunk column */
-    datum = ExecGetJunkAttribute(planSlot,
-                                 fmstate->ctidAttno,
-                                 &isNull);
-    /* shouldn't ever get a null result... */
-    if (isNull)
-        elog(ERROR, "ctid is NULL");
+    prm = &(estate->es_param_exec_vals[tid_params[1]]);
+    ctiddatum = prm->value;
 
     /* Convert parameters needed by prepared statement to text form */
     p_values = convert_prep_stmt_params(fmstate,
-                                        (ItemPointer) DatumGetPointer(datum),
-                                        slot);
+                                    DatumGetObjectId(toiddatum),
+                                    (ItemPointer) DatumGetPointer(ctiddatum),
+                                    slot);
 
     /*
      * Execute the prepared statement.
@@ -1889,28 +2048,32 @@ postgresExecForeignDelete(EState *estate,
                           TupleTableSlot *planSlot)
 {
     PgFdwModifyState *fmstate = (PgFdwModifyState *) resultRelInfo->ri_FdwState;
-    Datum        datum;
-    bool        isNull;
+    Datum        toiddatum, ctiddatum;
     const char **p_values;
     PGresult   *res;
     int            n_rows;
+    int        *tid_params = fmstate->tid_params;
+    ParamExecData *prm;
 
     /* Set up the prepared statement on the remote server, if we didn't yet */
     if (!fmstate->p_name)
         prepare_foreign_modify(fmstate);
 
+    Assert(tid_params);
+
+    /* Get the tableoid that was passed up as an exec param */
+    prm = &(estate->es_param_exec_vals[tid_params[0]]);
+    toiddatum = prm->value;
+
     /* Get the ctid that was passed up as a resjunk column */
-    datum = ExecGetJunkAttribute(planSlot,
-                                 fmstate->ctidAttno,
-                                 &isNull);
-    /* shouldn't ever get a null result... */
-    if (isNull)
-        elog(ERROR, "ctid is NULL");
+    prm = &(estate->es_param_exec_vals[tid_params[1]]);
+    ctiddatum = prm->value;
 
     /* Convert parameters needed by prepared statement to text form */
     p_values = convert_prep_stmt_params(fmstate,
-                                        (ItemPointer) DatumGetPointer(datum),
-                                        NULL);
+                                    DatumGetObjectId(toiddatum),
+                                    (ItemPointer) DatumGetPointer(ctiddatum),
+                                    NULL);
 
     /*
      * Execute the prepared statement.
@@ -2058,7 +2221,8 @@ postgresBeginForeignInsert(ModifyTableState *mtstate,
                                     sql.data,
                                     targetAttrs,
                                     retrieved_attrs != NIL,
-                                    retrieved_attrs);
+                                    retrieved_attrs,
+                                    NULL);
 
     resultRelInfo->ri_FdwState = fmstate;
 }
@@ -3286,7 +3450,8 @@ create_foreign_modify(EState *estate,
                       char *query,
                       List *target_attrs,
                       bool has_returning,
-                      List *retrieved_attrs)
+                      List *retrieved_attrs,
+                      int *tid_params)
 {
     PgFdwModifyState *fmstate;
     Relation    rel = resultRelInfo->ri_RelationDesc;
@@ -3333,7 +3498,7 @@ create_foreign_modify(EState *estate,
         fmstate->attinmeta = TupleDescGetAttInMetadata(tupdesc);
 
     /* Prepare for output conversion of parameters used in prepared stmt. */
-    n_params = list_length(fmstate->target_attrs) + 1;
+    n_params = list_length(fmstate->target_attrs) + 2;
     fmstate->p_flinfo = (FmgrInfo *) palloc0(sizeof(FmgrInfo) * n_params);
     fmstate->p_nums = 0;
 
@@ -3341,13 +3506,14 @@ create_foreign_modify(EState *estate,
     {
         Assert(subplan != NULL);
 
-        /* Find the ctid resjunk column in the subplan's result */
-        fmstate->ctidAttno = ExecFindJunkAttributeInTlist(subplan->targetlist,
-                                                          "ctid");
-        if (!AttributeNumberIsValid(fmstate->ctidAttno))
-            elog(ERROR, "could not find junk ctid column");
+        fmstate->tid_params = tid_params;
 
-        /* First transmittable parameter will be ctid */
+        /* First transmittable parameter will be table oid */
+        getTypeOutputInfo(OIDOID, &typefnoid, &isvarlena);
+        fmgr_info(typefnoid, &fmstate->p_flinfo[fmstate->p_nums]);
+        fmstate->p_nums++;
+
+        /* Second transmittable parameter will be ctid */
         getTypeOutputInfo(TIDOID, &typefnoid, &isvarlena);
         fmgr_info(typefnoid, &fmstate->p_flinfo[fmstate->p_nums]);
         fmstate->p_nums++;
@@ -3430,6 +3596,7 @@ prepare_foreign_modify(PgFdwModifyState *fmstate)
  */
 static const char **
 convert_prep_stmt_params(PgFdwModifyState *fmstate,
+                         Oid tableoid,
                          ItemPointer tupleid,
                          TupleTableSlot *slot)
 {
@@ -3441,10 +3608,13 @@ convert_prep_stmt_params(PgFdwModifyState *fmstate,
 
     p_values = (const char **) palloc(sizeof(char *) * fmstate->p_nums);
 
-    /* 1st parameter should be ctid, if it's in use */
-    if (tupleid != NULL)
+    /* First two parameters should be tableoid and ctid, if it's in use */
+    if (tableoid != InvalidOid)
     {
         /* don't need set_transmission_modes for TID output */
+        p_values[pindex] = OutputFunctionCall(&fmstate->p_flinfo[pindex],
+                                              ObjectIdGetDatum(tableoid));
+        pindex++;
         p_values[pindex] = OutputFunctionCall(&fmstate->p_flinfo[pindex],
                                               PointerGetDatum(tupleid));
         pindex++;
@@ -5549,6 +5719,7 @@ make_tuple_from_result_row(PGresult *res,
     bool       *nulls;
     ItemPointer ctid = NULL;
     Oid            oid = InvalidOid;
+    Oid            toid = InvalidOid;
     ConversionLocation errpos;
     ErrorContextCallback errcallback;
     MemoryContext oldcontext;
@@ -5642,6 +5813,17 @@ make_tuple_from_result_row(PGresult *res,
                 oid = DatumGetObjectId(datum);
             }
         }
+        else if (i == TableOidAttributeNumber)
+        {
+            /* table oid */
+            if (valstr != NULL)
+            {
+                Datum        datum;
+
+                datum = DirectFunctionCall1(oidin, CStringGetDatum(valstr));
+                toid = DatumGetObjectId(datum);
+            }
+        }
         errpos.cur_attno = 0;
 
         j++;
@@ -5691,6 +5873,9 @@ make_tuple_from_result_row(PGresult *res,
     if (OidIsValid(oid))
         HeapTupleSetOid(tuple, oid);
 
+    if (OidIsValid(toid))
+        tuple->t_tableOid = toid;
+
     /* Clean up */
     MemoryContextReset(temp_context);
 
diff --git a/contrib/postgres_fdw/postgres_fdw.h b/contrib/postgres_fdw/postgres_fdw.h
index a5d4011e8d..39e5581125 100644
--- a/contrib/postgres_fdw/postgres_fdw.h
+++ b/contrib/postgres_fdw/postgres_fdw.h
@@ -108,6 +108,8 @@ typedef struct PgFdwRelationInfo
      * representing the relation.
      */
     int            relation_index;
+
+    Bitmapset  *param_attrs;            /* attrs required for modification */
 } PgFdwRelationInfo;
 
 /* in postgres_fdw.c */

Re: Problem while updating a foreign table pointing to a partitioned table on foreign server

From
Kyotaro HORIGUCHI
Date:
Hello.

At Mon, 04 Jun 2018 20:58:28 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote in
<20180604.205828.208262556.horiguchi.kyotaro@lab.ntt.co.jp>
> It fails on some join-pushdown cases since it doesn't add tid
> columns to join tlist.  I suppose that build_tlist_to_deparse
> needs something but I'll consider further tomorrow.

I made it work, with a few exceptions, and then ran into a
problem.  PARAM_EXEC doesn't work at all when a Sort node sits
between the ForeignUpdate and the ForeignScan.

=====
explain (verbose, costs off)
update bar set f2 = f2 + 100
from
  ( select f1 from foo union all select f1+3 from foo ) ss
where bar.f1 = ss.f1;
                                  QUERY PLAN
-----------------------------------------------------------------------------
 Update on public.bar
   Update on public.bar
   Foreign Update on public.bar2
     Remote SQL: UPDATE public.loct2 SET f2 = $3 WHERE tableoid = $1 AND ctid = $2
...
   ->  Merge Join
         Output: bar2.f1, (bar2.f2 + 100), bar2.f3, (ROW(foo.f1))
         Merge Cond: (bar2.f1 = foo.f1)
         ->  Sort
               Output: bar2.f1, bar2.f2, bar2.f3, bar2.tableoid, bar2.ctid
               Sort Key: bar2.f1
               ->  Foreign Scan on public.bar2
                     Output: bar2.f1, bar2.f2, bar2.f3, bar2.tableoid, bar2.ctid
                     Remote SQL: SELECT f1, f2, f3, ctid, tableoid FROM public.loct2 FOR UPDATE
=====

Even if this worked fine, it could not be back-patched.  We need
either extra storage that moves together with the tuples, or a
way to prevent sorts and similar nodes from being inserted there.


At Fri, 1 Jun 2018 10:21:39 -0400, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote in
<CAFjFpRdraYcQnD4tKzNuP1uP6L-gnizi4HLU_UA=28Q2M4zoDA@mail.gmail.com>
> I am not suggesting to commit 0003 in my patch set, but just 0001 and
> 0002 which just raise an error when multiple rows get updated when
> only one row is expected to be updated.

So I agree with committing at least those two, in order to
prevent silently doing the wrong thing.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index d272719ff4..bff216f29d 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -1049,9 +1049,16 @@ deparseSelectSql(List *tlist, bool is_subquery, List **retrieved_attrs,
          * can use NoLock here.
          */
         Relation    rel = heap_open(rte->relid, NoLock);
+        Bitmapset   *attrs = fpinfo->attrs_used;
+
+        if (root->parse->commandType != CMD_UPDATE &&
+            root->parse->commandType != CMD_DELETE)
+            attrs = bms_del_member(bms_copy(attrs),
+                                   TableOidAttributeNumber -
+                                   FirstLowInvalidHeapAttributeNumber);
 
         deparseTargetList(buf, rte, foreignrel->relid, rel, false,
-                          fpinfo->attrs_used, false, retrieved_attrs);
+                          attrs, false, retrieved_attrs);
         heap_close(rel, NoLock);
     }
 }
@@ -1107,11 +1114,17 @@ deparseTargetList(StringInfo buf,
                   bool qualify_col,
                   List **retrieved_attrs)
 {
+    static int    check_attrs[4];
+    static char *check_attr_names[] = {"ctid", "oid", "tableoid"};
     TupleDesc    tupdesc = RelationGetDescr(rel);
     bool        have_wholerow;
     bool        first;
     int            i;
 
+    check_attrs[0] = SelfItemPointerAttributeNumber;
+    check_attrs[1] = ObjectIdAttributeNumber;
+    check_attrs[2] = TableOidAttributeNumber;
+    check_attrs[3] = FirstLowInvalidHeapAttributeNumber;
     *retrieved_attrs = NIL;
 
     /* If there's a whole-row reference, we'll need all the columns. */
@@ -1143,13 +1156,16 @@ deparseTargetList(StringInfo buf,
         }
     }
 
-    /*
-     * Add ctid and oid if needed.  We currently don't support retrieving any
-     * other system columns.
-     */
-    if (bms_is_member(SelfItemPointerAttributeNumber - FirstLowInvalidHeapAttributeNumber,
-                      attrs_used))
+    for (i = 0 ; check_attrs[i] != FirstLowInvalidHeapAttributeNumber ; i++)
     {
+        int    attr = check_attrs[i];
+        char *attr_name = check_attr_names[i];
+
+        /* Add system columns if needed. */
+        if (!bms_is_member(attr - FirstLowInvalidHeapAttributeNumber,
+                           attrs_used))
+            continue;
+
         if (!first)
             appendStringInfoString(buf, ", ");
         else if (is_returning)
@@ -1158,26 +1174,9 @@ deparseTargetList(StringInfo buf,
 
         if (qualify_col)
             ADD_REL_QUALIFIER(buf, rtindex);
-        appendStringInfoString(buf, "ctid");
+        appendStringInfoString(buf, attr_name);
 
-        *retrieved_attrs = lappend_int(*retrieved_attrs,
-                                       SelfItemPointerAttributeNumber);
-    }
-    if (bms_is_member(ObjectIdAttributeNumber - FirstLowInvalidHeapAttributeNumber,
-                      attrs_used))
-    {
-        if (!first)
-            appendStringInfoString(buf, ", ");
-        else if (is_returning)
-            appendStringInfoString(buf, " RETURNING ");
-        first = false;
-
-        if (qualify_col)
-            ADD_REL_QUALIFIER(buf, rtindex);
-        appendStringInfoString(buf, "oid");
-
-        *retrieved_attrs = lappend_int(*retrieved_attrs,
-                                       ObjectIdAttributeNumber);
+        *retrieved_attrs = lappend_int(*retrieved_attrs, attr);
     }
 
     /* Don't generate bad syntax if no undropped columns */
@@ -1725,7 +1724,7 @@ deparseUpdateSql(StringInfo buf, RangeTblEntry *rte,
     deparseRelation(buf, rel);
     appendStringInfoString(buf, " SET ");
 
-    pindex = 2;                    /* ctid is always the first param */
+    pindex = 3;            /* tableoid and ctid are always the first two params */
     first = true;
     foreach(lc, targetAttrs)
     {
@@ -1739,7 +1738,7 @@ deparseUpdateSql(StringInfo buf, RangeTblEntry *rte,
         appendStringInfo(buf, " = $%d", pindex);
         pindex++;
     }
-    appendStringInfoString(buf, " WHERE ctid = $1");
+    appendStringInfoString(buf, " WHERE tableoid = $1 AND ctid = $2");
 
     deparseReturningList(buf, rte, rtindex, rel,
                          rel->trigdesc && rel->trigdesc->trig_update_after_row,
@@ -1855,7 +1854,7 @@ deparseDeleteSql(StringInfo buf, RangeTblEntry *rte,
 {
     appendStringInfoString(buf, "DELETE FROM ");
     deparseRelation(buf, rel);
-    appendStringInfoString(buf, " WHERE ctid = $1");
+    appendStringInfoString(buf, " WHERE tableoid = $1 AND ctid = $2");
 
     deparseReturningList(buf, rte, rtindex, rel,
                          rel->trigdesc && rel->trigdesc->trig_delete_after_row,
@@ -1951,8 +1950,13 @@ deparseReturningList(StringInfo buf, RangeTblEntry *rte,
          */
         pull_varattnos((Node *) returningList, rtindex,
                        &attrs_used);
+
+        attrs_used = bms_del_member(attrs_used,
+                                    TableOidAttributeNumber -
+                                    FirstLowInvalidHeapAttributeNumber);
     }
 
+
     if (attrs_used != NULL)
         deparseTargetList(buf, rte, rtindex, rel, true, attrs_used, false,
                           retrieved_attrs);
@@ -2066,6 +2070,12 @@ deparseColumnRef(StringInfo buf, int varno, int varattno, RangeTblEntry *rte,
             ADD_REL_QUALIFIER(buf, varno);
         appendStringInfoString(buf, "oid");
     }
+    else if (varattno == TableOidAttributeNumber)
+    {
+        if (qualify_col)
+            ADD_REL_QUALIFIER(buf, varno);
+        appendStringInfoString(buf, "tableoid");
+    }
     else if (varattno < 0)
     {
         /*
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 78b0f43ca8..e574d7f51b 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -73,7 +73,10 @@ enum FdwScanPrivateIndex
      * String describing join i.e. names of relations being joined and types
      * of join, added when the scan is join
      */
-    FdwScanPrivateRelations
+    FdwScanPrivateRelations,
+
+    /* Integer list of ids of EXEC_PARAM */
+    FdwScanTupleIdParamIds
 };
 
 /*
@@ -95,7 +98,9 @@ enum FdwModifyPrivateIndex
     /* has-returning flag (as an integer Value node) */
     FdwModifyPrivateHasReturning,
     /* Integer list of attribute numbers retrieved by RETURNING */
-    FdwModifyPrivateRetrievedAttrs
+    FdwModifyPrivateRetrievedAttrs,
+    /* Integer list of paramid for tableoid and ctid of source tuple */
+    FdwModifyPrivateTidParams
 };
 
 /*
@@ -156,6 +161,8 @@ typedef struct PgFdwScanState
     MemoryContext temp_cxt;        /* context for per-tuple temporary data */
 
     int            fetch_size;        /* number of tuples per fetch */
+
+    int           *tid_params;        /* EXEC_PARAM id for tuple identifier */
 } PgFdwScanState;
 
 /*
@@ -177,7 +184,7 @@ typedef struct PgFdwModifyState
     List       *retrieved_attrs;    /* attr numbers retrieved by RETURNING */
 
     /* info about parameters for prepared statement */
-    AttrNumber    ctidAttno;        /* attnum of input resjunk ctid column */
+    int            *tid_params;    /* EXEC_PARAM ids for tuple identifier */
     int            p_nums;            /* number of parameters to transmit */
     FmgrInfo   *p_flinfo;        /* output conversion functions for them */
 
@@ -293,9 +300,6 @@ static void postgresBeginForeignScan(ForeignScanState *node, int eflags);
 static TupleTableSlot *postgresIterateForeignScan(ForeignScanState *node);
 static void postgresReScanForeignScan(ForeignScanState *node);
 static void postgresEndForeignScan(ForeignScanState *node);
-static void postgresAddForeignUpdateTargets(Query *parsetree,
-                                RangeTblEntry *target_rte,
-                                Relation target_relation);
 static List *postgresPlanForeignModify(PlannerInfo *root,
                           ModifyTable *plan,
                           Index resultRelation,
@@ -388,9 +392,11 @@ static PgFdwModifyState *create_foreign_modify(EState *estate,
                       char *query,
                       List *target_attrs,
                       bool has_returning,
-                      List *retrieved_attrs);
+                      List *retrieved_attrs,
+                      int *tid_params);
 static void prepare_foreign_modify(PgFdwModifyState *fmstate);
 static const char **convert_prep_stmt_params(PgFdwModifyState *fmstate,
+                         Oid tableoid,
                          ItemPointer tupleid,
                          TupleTableSlot *slot);
 static void store_returning_result(PgFdwModifyState *fmstate,
@@ -451,6 +457,7 @@ static void merge_fdw_options(PgFdwRelationInfo *fpinfo,
                   const PgFdwRelationInfo *fpinfo_o,
                   const PgFdwRelationInfo *fpinfo_i);
 
+static List *add_tidcols_to_tlist(List *org, Index varno);
 
 /*
  * Foreign-data wrapper handler function: return a struct with pointers
@@ -471,7 +478,6 @@ postgres_fdw_handler(PG_FUNCTION_ARGS)
     routine->EndForeignScan = postgresEndForeignScan;
 
     /* Functions for updating foreign tables */
-    routine->AddForeignUpdateTargets = postgresAddForeignUpdateTargets;
     routine->PlanForeignModify = postgresPlanForeignModify;
     routine->BeginForeignModify = postgresBeginForeignModify;
     routine->ExecForeignInsert = postgresExecForeignInsert;
@@ -595,6 +601,39 @@ postgresGetForeignRelSize(PlannerInfo *root,
                        &fpinfo->attrs_used);
     }
 
+    /*
+     * ctid and tableoid are required for target relation of UPDATE and
+     * DELETE. Join relations are handled elsewhere.
+     */
+    if (root->parse->resultRelation == baserel->relid &&
+        (root->parse->commandType == CMD_UPDATE ||
+         root->parse->commandType == CMD_DELETE))
+    {
+        Var *v;
+
+        v = makeVar(baserel->relid,
+                    TableOidAttributeNumber,
+                    OIDOID, -1, InvalidOid, 0);
+        add_new_column_to_pathtarget(baserel->reltarget, (Expr *) v);
+        v = makeVar(baserel->relid,
+                    SelfItemPointerAttributeNumber,
+                    TIDOID, -1, InvalidOid, 0);
+        add_new_column_to_pathtarget(baserel->reltarget, (Expr *) v);
+
+        fpinfo->param_attrs =
+            bms_add_member(fpinfo->param_attrs,
+                           SelfItemPointerAttributeNumber -
+                           FirstLowInvalidHeapAttributeNumber);
+
+        fpinfo->param_attrs =
+            bms_add_member(fpinfo->param_attrs,
+                           TableOidAttributeNumber -
+                           FirstLowInvalidHeapAttributeNumber);
+
+        fpinfo->attrs_used =
+            bms_add_members(fpinfo->attrs_used, fpinfo->param_attrs);
+    }
+
     /*
      * Compute the selectivity and cost of the local_conds, so we don't have
      * to do it over again for each path.  The best we can do for these
@@ -1116,6 +1155,94 @@ postgresGetForeignPaths(PlannerInfo *root,
     }
 }
 
+/* Find the id of the PARAM_EXEC that matches the given var */
+static int
+find_param_for_var(PlannerInfo *root, Var *var)
+{
+    ListCell   *ppl;
+    PlannerParamItem *pitem;
+    Index        levelsup;
+
+    /* Find the query level the Var belongs to */
+    for (levelsup = var->varlevelsup; levelsup > 0; levelsup--)
+        root = root->parent_root;
+
+    /* If there's already a matching PlannerParamItem there, just use it */
+    foreach(ppl, root->plan_params)
+    {
+        pitem = (PlannerParamItem *) lfirst(ppl);
+        if (IsA(pitem->item, Var))
+        {
+            Var           *pvar = (Var *) pitem->item;
+
+            /*
+             * This comparison must match _equalVar(), except for ignoring
+             * varlevelsup.  Note that _equalVar() ignores the location.
+             */
+            if (pvar->varno == var->varno &&
+                pvar->varattno == var->varattno &&
+                pvar->vartype == var->vartype &&
+                pvar->vartypmod == var->vartypmod &&
+                pvar->varcollid == var->varcollid &&
+                pvar->varnoold == var->varnoold &&
+                pvar->varoattno == var->varoattno)
+                return pitem->paramId;
+        }
+    }
+
+    return -1;
+}
+
+/*
+ * Select a PARAM_EXEC number to identify the given Var as a parameter for
+ * the current subquery, or for a nestloop's inner scan.
+ * If the Var already has a param in the current context, return that one.
+ * (copy of the function in subselect.c)
+ */
+static int
+assign_param_for_var(PlannerInfo *root, Var *var)
+{
+    int                    paramid;
+    PlannerParamItem   *pitem;
+
+    /* Return registered param if any */
+    paramid = find_param_for_var(root, var);
+    if (paramid >= 0)
+        return paramid;
+
+    /* Nope, so make a new one */
+    var = copyObject(var);
+    var->varlevelsup = 0;
+
+    pitem = makeNode(PlannerParamItem);
+    pitem->item = (Node *) var;
+    pitem->paramId = list_length(root->glob->paramExecTypes);
+    root->glob->paramExecTypes = lappend_oid(root->glob->paramExecTypes,
+                                             var->vartype);
+
+    root->plan_params = lappend(root->plan_params, pitem);
+
+    return pitem->paramId;
+}
+
+static List *
+add_tidcols_to_tlist(List *org, Index varno)
+{
+    List   *result = NIL;
+
+    result = list_copy(org);
+
+    result =
+        add_to_flat_tlist(result,
+                          list_make2(makeVar(varno, TableOidAttributeNumber,
+                                             OIDOID, -1, InvalidOid, 0),
+                                     makeVar(varno,
+                                             SelfItemPointerAttributeNumber,
+                                             TIDOID, -1, InvalidOid, 0)));
+
+    return result;
+}
+
 /*
  * postgresGetForeignPlan
  *        Create ForeignScan plan node which implements selected best path
@@ -1136,6 +1263,7 @@ postgresGetForeignPlan(PlannerInfo *root,
     List       *local_exprs = NIL;
     List       *params_list = NIL;
     List       *fdw_scan_tlist = NIL;
+    List       *fdw_return_tlist = NIL;
     List       *fdw_recheck_quals = NIL;
     List       *retrieved_attrs;
     StringInfoData sql;
@@ -1223,8 +1351,8 @@ postgresGetForeignPlan(PlannerInfo *root,
          * locally.
          */
 
-        /* Build the list of columns to be fetched from the foreign server. */
-        fdw_scan_tlist = build_tlist_to_deparse(foreignrel);
+        /* Build the list of columns to be returned to upper node. */
+        fdw_scan_tlist = fdw_return_tlist = build_tlist_to_deparse(foreignrel);
 
         /*
          * Ensure that the outer plan produces a tuple whose descriptor
@@ -1263,6 +1391,17 @@ postgresGetForeignPlan(PlannerInfo *root,
                                                       qual);
             }
         }
+
+        /*
+         * Remote query requires tuple identifiers if this relation involves
+         * the target relation of UPDATE/DELETE commands.
+         */
+        if ((root->parse->commandType == CMD_UPDATE ||
+             root->parse->commandType == CMD_DELETE) &&
+            bms_is_member(root->parse->resultRelation, foreignrel->relids))
+            fdw_scan_tlist = 
+                add_tidcols_to_tlist(fdw_return_tlist,
+                                         root->parse->resultRelation);
     }
 
     /*
@@ -1288,6 +1427,45 @@ postgresGetForeignPlan(PlannerInfo *root,
         fdw_private = lappend(fdw_private,
                               makeString(fpinfo->relation_name->data));
 
+    /*
+     * Prepare EXEC_PARAM for tuple identifier if this relation is the target
+     * relation of the current DELETE/UPDATE query.
+     */
+    if ((root->parse->commandType == CMD_DELETE ||
+         root->parse->commandType == CMD_UPDATE) &&  
+        (scan_relid ?
+         !bms_is_empty(fpinfo->param_attrs) :
+         bms_is_member(root->parse->resultRelation, foreignrel->relids)))
+    {
+        int *paramids = palloc(sizeof(int) * 2);
+        Var    *v;
+        Index    target_relid = scan_relid;
+
+        if (target_relid == 0)
+            target_relid = root->parse->resultRelation;
+
+        if (list_length(fdw_private) == 3)
+            fdw_private = lappend(fdw_private, NULL);
+
+        v = makeNode(Var);
+        v->varno = target_relid;
+        v->vartype = OIDOID;
+        v->vartypmod = -1;
+        v->varcollid = InvalidOid;
+        v->varattno = TableOidAttributeNumber;
+        paramids[0] = assign_param_for_var(root, v);
+
+        v = makeNode(Var);
+        v->varno = target_relid;
+        v->vartype = TIDOID;
+        v->vartypmod = -1;
+        v->varcollid = InvalidOid;
+        v->varattno = SelfItemPointerAttributeNumber;
+        paramids[1] = assign_param_for_var(root, v);
+
+        fdw_private = lappend(fdw_private, paramids);
+    }
+
     /*
      * Create the ForeignScan node for the given relation.
      *
@@ -1300,7 +1478,7 @@ postgresGetForeignPlan(PlannerInfo *root,
                             scan_relid,
                             params_list,
                             fdw_private,
-                            fdw_scan_tlist,
+                            fdw_return_tlist,
                             fdw_recheck_quals,
                             outer_plan);
 }
@@ -1368,6 +1546,9 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
                                                  FdwScanPrivateRetrievedAttrs);
     fsstate->fetch_size = intVal(list_nth(fsplan->fdw_private,
                                           FdwScanPrivateFetchSize));
+    if (list_length(fsplan->fdw_private) > FdwScanTupleIdParamIds)
+        fsstate->tid_params =
+            (int *) list_nth(fsplan->fdw_private, FdwScanTupleIdParamIds);
 
     /* Create contexts for batches of tuples and per-tuple temp workspace. */
     fsstate->batch_cxt = AllocSetContextCreate(estate->es_query_cxt,
@@ -1418,6 +1599,8 @@ postgresIterateForeignScan(ForeignScanState *node)
 {
     PgFdwScanState *fsstate = (PgFdwScanState *) node->fdw_state;
     TupleTableSlot *slot = node->ss.ss_ScanTupleSlot;
+    EState *estate = node->ss.ps.state;
+    HeapTuple        tup;
 
     /*
      * If this is the first call after Begin or ReScan, we need to create the
@@ -1439,10 +1622,30 @@ postgresIterateForeignScan(ForeignScanState *node)
             return ExecClearTuple(slot);
     }
 
+    tup = fsstate->tuples[fsstate->next_tuple++];
+
+    /* Store the remote table oid and ctid into exec parameter if requested */
+    if (fsstate->tid_params != NULL)
+    {
+        ParamExecData *prm;
+        ItemPointer      itemp;
+
+        /* set toid */
+        prm = &(estate->es_param_exec_vals[fsstate->tid_params[0]]);
+        prm->value = ObjectIdGetDatum(tup->t_tableOid);
+        /* set ctid */
+        prm = &(estate->es_param_exec_vals[fsstate->tid_params[1]]);
+        itemp = (ItemPointer) palloc(sizeof(ItemPointerData));
+        ItemPointerSet(itemp,
+                       ItemPointerGetBlockNumberNoCheck(&tup->t_self),
+                       ItemPointerGetOffsetNumberNoCheck(&tup->t_self));
+        prm->value = PointerGetDatum(itemp);
+    }
+
     /*
      * Return the next tuple.
      */
-    ExecStoreTuple(fsstate->tuples[fsstate->next_tuple++],
+    ExecStoreTuple(tup,
                    slot,
                    InvalidBuffer,
                    false);
@@ -1530,43 +1733,6 @@ postgresEndForeignScan(ForeignScanState *node)
     /* MemoryContexts will be deleted automatically. */
 }
 
-/*
- * postgresAddForeignUpdateTargets
- *        Add resjunk column(s) needed for update/delete on a foreign table
- */
-static void
-postgresAddForeignUpdateTargets(Query *parsetree,
-                                RangeTblEntry *target_rte,
-                                Relation target_relation)
-{
-    Var           *var;
-    const char *attrname;
-    TargetEntry *tle;
-
-    /*
-     * In postgres_fdw, what we need is the ctid, same as for a regular table.
-     */
-
-    /* Make a Var representing the desired value */
-    var = makeVar(parsetree->resultRelation,
-                  SelfItemPointerAttributeNumber,
-                  TIDOID,
-                  -1,
-                  InvalidOid,
-                  0);
-
-    /* Wrap it in a resjunk TLE with the right name ... */
-    attrname = "ctid";
-
-    tle = makeTargetEntry((Expr *) var,
-                          list_length(parsetree->targetList) + 1,
-                          pstrdup(attrname),
-                          true);
-
-    /* ... and add it to the query's targetlist */
-    parsetree->targetList = lappend(parsetree->targetList, tle);
-}
-
 /*
  * postgresPlanForeignModify
  *        Plan an insert/update/delete operation on a foreign table
@@ -1630,6 +1796,33 @@ postgresPlanForeignModify(PlannerInfo *root,
         }
     }
 
+    /*
+     * In the non-direct modify cases, the corresponding ForeignScan node must
+     * have stored remote tableoid and ctid as exec parameters
+     */
+    if (operation == CMD_UPDATE || operation == CMD_DELETE)
+    {
+        Var    *v;
+        int *paramids = NULL;
+
+        paramids = palloc(sizeof(int) * 2);
+        v = makeNode(Var);
+        v->varno = resultRelation;
+        v->vartype = OIDOID;
+        v->vartypmod = -1;
+        v->varcollid = InvalidOid;
+        v->varattno = TableOidAttributeNumber;
+        paramids[0] = find_param_for_var(root, v);
+        if (paramids[0] < 0)
+            elog(ERROR, "Tuple ID parameter is not found");
+
+        v->vartype = TIDOID;
+        v->varattno = SelfItemPointerAttributeNumber;
+        paramids[1] = find_param_for_var(root, v);
+        if (paramids[1] < 0)
+            elog(ERROR, "Tuple ID parameter is not found");
+    }
+
     /*
      * Extract the relevant RETURNING list if any.
      */
@@ -1679,10 +1872,11 @@ postgresPlanForeignModify(PlannerInfo *root,
      * Build the fdw_private list that will be available to the executor.
      * Items in the list must match enum FdwModifyPrivateIndex, above.
      */
-    return list_make4(makeString(sql.data),
+    return list_make5(makeString(sql.data),
                       targetAttrs,
                       makeInteger((retrieved_attrs != NIL)),
-                      retrieved_attrs);
+                      retrieved_attrs,
+                      paramids);
 }
 
 /*
@@ -1702,6 +1896,7 @@ postgresBeginForeignModify(ModifyTableState *mtstate,
     bool        has_returning;
     List       *retrieved_attrs;
     RangeTblEntry *rte;
+    int           *tid_params;
 
     /*
      * Do nothing in EXPLAIN (no ANALYZE) case.  resultRelInfo->ri_FdwState
@@ -1719,6 +1914,7 @@ postgresBeginForeignModify(ModifyTableState *mtstate,
                                     FdwModifyPrivateHasReturning));
     retrieved_attrs = (List *) list_nth(fdw_private,
                                         FdwModifyPrivateRetrievedAttrs);
+    tid_params = (int *) list_nth(fdw_private, FdwModifyPrivateTidParams);
 
     /* Find RTE. */
     rte = rt_fetch(resultRelInfo->ri_RangeTableIndex,
@@ -1733,7 +1929,8 @@ postgresBeginForeignModify(ModifyTableState *mtstate,
                                     query,
                                     target_attrs,
                                     has_returning,
-                                    retrieved_attrs);
+                                    retrieved_attrs,
+                                    tid_params);
 
     resultRelInfo->ri_FdwState = fmstate;
 }
@@ -1758,7 +1955,7 @@ postgresExecForeignInsert(EState *estate,
         prepare_foreign_modify(fmstate);
 
     /* Convert parameters needed by prepared statement to text form */
-    p_values = convert_prep_stmt_params(fmstate, NULL, slot);
+    p_values = convert_prep_stmt_params(fmstate, InvalidOid, NULL, slot);
 
     /*
      * Execute the prepared statement.
@@ -1813,28 +2010,31 @@ postgresExecForeignUpdate(EState *estate,
                           TupleTableSlot *planSlot)
 {
     PgFdwModifyState *fmstate = (PgFdwModifyState *) resultRelInfo->ri_FdwState;
-    Datum        datum;
-    bool        isNull;
+    Datum        toiddatum, ctiddatum;
     const char **p_values;
     PGresult   *res;
     int            n_rows;
+    int        *tid_params = fmstate->tid_params;
+    ParamExecData *prm;
 
     /* Set up the prepared statement on the remote server, if we didn't yet */
     if (!fmstate->p_name)
         prepare_foreign_modify(fmstate);
 
-    /* Get the ctid that was passed up as a resjunk column */
-    datum = ExecGetJunkAttribute(planSlot,
-                                 fmstate->ctidAttno,
-                                 &isNull);
-    /* shouldn't ever get a null result... */
-    if (isNull)
-        elog(ERROR, "ctid is NULL");
+    Assert(tid_params);
+    /* Get the tableoid that was passed up as an exec param */
+    prm = &(estate->es_param_exec_vals[tid_params[0]]);
+    toiddatum = prm->value;
+
+    /* Get the ctid that was passed up as an exec param */
+    prm = &(estate->es_param_exec_vals[tid_params[1]]);
+    ctiddatum = prm->value;
 
     /* Convert parameters needed by prepared statement to text form */
     p_values = convert_prep_stmt_params(fmstate,
-                                        (ItemPointer) DatumGetPointer(datum),
-                                        slot);
+                                    DatumGetObjectId(toiddatum),
+                                    (ItemPointer) DatumGetPointer(ctiddatum),
+                                    slot);
 
     /*
      * Execute the prepared statement.
@@ -1889,28 +2089,32 @@ postgresExecForeignDelete(EState *estate,
                           TupleTableSlot *planSlot)
 {
     PgFdwModifyState *fmstate = (PgFdwModifyState *) resultRelInfo->ri_FdwState;
-    Datum        datum;
-    bool        isNull;
+    Datum        toiddatum, ctiddatum;
     const char **p_values;
     PGresult   *res;
     int            n_rows;
+    int        *tid_params = fmstate->tid_params;
+    ParamExecData *prm;
 
     /* Set up the prepared statement on the remote server, if we didn't yet */
     if (!fmstate->p_name)
         prepare_foreign_modify(fmstate);
 
+    Assert(tid_params);
+
+    /* Get the tableoid that was passed up as an exec param */
+    prm = &(estate->es_param_exec_vals[tid_params[0]]);
+    toiddatum = prm->value;
+
     /* Get the ctid that was passed up as a resjunk column */
-    datum = ExecGetJunkAttribute(planSlot,
-                                 fmstate->ctidAttno,
-                                 &isNull);
-    /* shouldn't ever get a null result... */
-    if (isNull)
-        elog(ERROR, "ctid is NULL");
+    prm = &(estate->es_param_exec_vals[tid_params[1]]);
+    ctiddatum = prm->value;
 
     /* Convert parameters needed by prepared statement to text form */
     p_values = convert_prep_stmt_params(fmstate,
-                                        (ItemPointer) DatumGetPointer(datum),
-                                        NULL);
+                                    DatumGetObjectId(toiddatum),
+                                    (ItemPointer) DatumGetPointer(ctiddatum),
+                                    NULL);
 
     /*
      * Execute the prepared statement.
@@ -2058,7 +2262,8 @@ postgresBeginForeignInsert(ModifyTableState *mtstate,
                                     sql.data,
                                     targetAttrs,
                                     retrieved_attrs != NIL,
-                                    retrieved_attrs);
+                                    retrieved_attrs,
+                                    NULL);
 
     resultRelInfo->ri_FdwState = fmstate;
 }
@@ -2561,8 +2766,13 @@ postgresExplainForeignScan(ForeignScanState *node, ExplainState *es)
      */
     if (list_length(fdw_private) > FdwScanPrivateRelations)
     {
-        relations = strVal(list_nth(fdw_private, FdwScanPrivateRelations));
-        ExplainPropertyText("Relations", relations, es);
+        void *v = list_nth(fdw_private, FdwScanPrivateRelations);
+
+        if (v)
+        {
+            relations = strVal(v);
+            ExplainPropertyText("Relations", relations, es);
+        }
     }
 
     /*
@@ -2673,7 +2883,20 @@ estimate_path_cost_size(PlannerInfo *root,

         /* Build the list of columns to be fetched from the foreign server. */
         if (IS_JOIN_REL(foreignrel) || IS_UPPER_REL(foreignrel))
+        {
             fdw_scan_tlist = build_tlist_to_deparse(foreignrel);
+
+            /*
+             * If this foreign relation needs to get remote tableoid and ctid,
+             * count them in costing.
+             */
+            if ((root->parse->commandType == CMD_UPDATE ||
+                 root->parse->commandType == CMD_DELETE) &&
+                bms_is_member(root->parse->resultRelation, foreignrel->relids))
+                fdw_scan_tlist = 
+                    add_tidcols_to_tlist(fdw_scan_tlist,
+                                             root->parse->resultRelation);
+        }
         else
             fdw_scan_tlist = NIL;
 
@@ -3092,7 +3315,6 @@ create_cursor(ForeignScanState *node)
     initStringInfo(&buf);
     appendStringInfo(&buf, "DECLARE c%u CURSOR FOR\n%s",
                      fsstate->cursor_number, fsstate->query);
-
     /*
      * Notice that we pass NULL for paramTypes, thus forcing the remote server
      * to infer types for all parameters.  Since we explicitly cast every
@@ -3286,7 +3508,8 @@ create_foreign_modify(EState *estate,
                       char *query,
                       List *target_attrs,
                       bool has_returning,
-                      List *retrieved_attrs)
+                      List *retrieved_attrs,
+                      int *tid_params)
 {
     PgFdwModifyState *fmstate;
     Relation    rel = resultRelInfo->ri_RelationDesc;
@@ -3333,7 +3556,7 @@ create_foreign_modify(EState *estate,
         fmstate->attinmeta = TupleDescGetAttInMetadata(tupdesc);
 
     /* Prepare for output conversion of parameters used in prepared stmt. */
-    n_params = list_length(fmstate->target_attrs) + 1;
+    n_params = list_length(fmstate->target_attrs) + 2;
     fmstate->p_flinfo = (FmgrInfo *) palloc0(sizeof(FmgrInfo) * n_params);
     fmstate->p_nums = 0;
 
@@ -3341,13 +3564,14 @@ create_foreign_modify(EState *estate,
     {
         Assert(subplan != NULL);
 
-        /* Find the ctid resjunk column in the subplan's result */
-        fmstate->ctidAttno = ExecFindJunkAttributeInTlist(subplan->targetlist,
-                                                          "ctid");
-        if (!AttributeNumberIsValid(fmstate->ctidAttno))
-            elog(ERROR, "could not find junk ctid column");
+        fmstate->tid_params = tid_params;
 
-        /* First transmittable parameter will be ctid */
+        /* First transmittable parameter will be table oid */
+        getTypeOutputInfo(OIDOID, &typefnoid, &isvarlena);
+        fmgr_info(typefnoid, &fmstate->p_flinfo[fmstate->p_nums]);
+        fmstate->p_nums++;
+
+        /* Second transmittable parameter will be ctid */
         getTypeOutputInfo(TIDOID, &typefnoid, &isvarlena);
         fmgr_info(typefnoid, &fmstate->p_flinfo[fmstate->p_nums]);
         fmstate->p_nums++;
@@ -3430,6 +3654,7 @@ prepare_foreign_modify(PgFdwModifyState *fmstate)
  */
 static const char **
 convert_prep_stmt_params(PgFdwModifyState *fmstate,
+                         Oid tableoid,
                          ItemPointer tupleid,
                          TupleTableSlot *slot)
 {
@@ -3441,10 +3666,13 @@ convert_prep_stmt_params(PgFdwModifyState *fmstate,
 
     p_values = (const char **) palloc(sizeof(char *) * fmstate->p_nums);
 
-    /* 1st parameter should be ctid, if it's in use */
-    if (tupleid != NULL)
+    /* First two parameters should be tableoid and ctid, if it's in use */
+    if (tableoid != InvalidOid)
     {
         /* don't need set_transmission_modes for TID output */
+        p_values[pindex] = OutputFunctionCall(&fmstate->p_flinfo[pindex],
+                                              ObjectIdGetDatum(tableoid));
+        pindex++;
         p_values[pindex] = OutputFunctionCall(&fmstate->p_flinfo[pindex],
                                               PointerGetDatum(tupleid));
         pindex++;
@@ -5549,6 +5777,7 @@ make_tuple_from_result_row(PGresult *res,
     bool       *nulls;
     ItemPointer ctid = NULL;
     Oid            oid = InvalidOid;
+    Oid            toid = InvalidOid;
     ConversionLocation errpos;
     ErrorContextCallback errcallback;
     MemoryContext oldcontext;
@@ -5609,10 +5838,9 @@ make_tuple_from_result_row(PGresult *res,
          * Note: we ignore system columns other than ctid and oid in result
          */
         errpos.cur_attno = i;
-        if (i > 0)
+        if (i > 0 && i <= tupdesc->natts)
         {
             /* ordinary column */
-            Assert(i <= tupdesc->natts);
             nulls[i - 1] = (valstr == NULL);
             /* Apply the input function even to nulls, to support domains */
             values[i - 1] = InputFunctionCall(&attinmeta->attinfuncs[i - 1],
@@ -5620,7 +5848,20 @@ make_tuple_from_result_row(PGresult *res,
                                               attinmeta->attioparams[i - 1],
                                               attinmeta->atttypmods[i - 1]);
         }
-        else if (i == SelfItemPointerAttributeNumber)
+        else if (i == TableOidAttributeNumber ||
+                 i == tupdesc->natts + 1)
+        {
+            /* table oid */
+            if (valstr != NULL)
+            {
+                Datum        datum;
+
+                datum = DirectFunctionCall1(oidin, CStringGetDatum(valstr));
+                toid = DatumGetObjectId(datum);
+            }
+        }
+        else if (i == SelfItemPointerAttributeNumber ||
+                 i ==  tupdesc->natts + 2)
         {
             /* ctid */
             if (valstr != NULL)
@@ -5691,6 +5932,9 @@ make_tuple_from_result_row(PGresult *res,
     if (OidIsValid(oid))
         HeapTupleSetOid(tuple, oid);
 
+    if (OidIsValid(toid))
+        tuple->t_tableOid = toid;
+
     /* Clean up */
     MemoryContextReset(temp_context);
 
diff --git a/contrib/postgres_fdw/postgres_fdw.h b/contrib/postgres_fdw/postgres_fdw.h
index a5d4011e8d..39e5581125 100644
--- a/contrib/postgres_fdw/postgres_fdw.h
+++ b/contrib/postgres_fdw/postgres_fdw.h
@@ -108,6 +108,8 @@ typedef struct PgFdwRelationInfo
      * representing the relation.
      */
     int            relation_index;
+
+    Bitmapset  *param_attrs;            /* attrs required for modification */
 } PgFdwRelationInfo;
 
 /* in postgres_fdw.c */

Re: Problem while updating a foreign table pointing to a partitioned table on foreign server

From
Ashutosh Bapat
Date:
On Tue, Jun 5, 2018 at 3:40 PM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> Hello.
>
> At Mon, 04 Jun 2018 20:58:28 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote in <20180604.205828.208262556.horiguchi.kyotaro@lab.ntt.co.jp>
>
>> It fails on some join-pushdown cases since it doesn't add tid
>> columns to join tlist.  I suppose that build_tlist_to_deparse
>> needs something but I'll consider further tomorrow.
>
> I made it work with a few exceptions and bumped.  PARAM_EXEC
> doesn't work at all in a case where Sort exists between
> ForeignUpdate and ForeignScan.
>
> =====
> explain (verbose, costs off)
> update bar set f2 = f2 + 100
> from
>   ( select f1 from foo union all select f1+3 from foo ) ss
> where bar.f1 = ss.f1;
>                                   QUERY PLAN
> -----------------------------------------------------------------------------
>  Update on public.bar
>    Update on public.bar
>    Foreign Update on public.bar2
>      Remote SQL: UPDATE public.loct2 SET f2 = $3 WHERE tableoid = $1 AND ctid = $2
> ...
>    ->  Merge Join
>          Output: bar2.f1, (bar2.f2 + 100), bar2.f3, (ROW(foo.f1))
>          Merge Cond: (bar2.f1 = foo.f1)
>          ->  Sort
>                Output: bar2.f1, bar2.f2, bar2.f3, bar2.tableoid, bar2.ctid
>                Sort Key: bar2.f1
>                ->  Foreign Scan on public.bar2
>                      Output: bar2.f1, bar2.f2, bar2.f3, bar2.tableoid, bar2.ctid
>                      Remote SQL: SELECT f1, f2, f3, ctid, tableoid FROM public.loct2 FOR UPDATE
> =====

What's the problem that you faced?

>
> Even if this worked fine, it cannot be back-patched.  We need
> extra storage that moves together with the tuples, or a way to
> prevent sorts or similar nodes from being inserted there.

I think your approach still has the same problem that it's abusing the
tableOid field in the heap tuple to store tableoid from the remote as
well as local table. That's what Robert and Tom objected to [1], [2]

>
>
> At Fri, 1 Jun 2018 10:21:39 -0400, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote in <CAFjFpRdraYcQnD4tKzNuP1uP6L-gnizi4HLU_UA=28Q2M4zoDA@mail.gmail.com>
>> I am not suggesting to commit 0003 in my patch set, but just 0001 and
>> 0002 which just raise an error when multiple rows get updated when
>> only one row is expected to be updated.
>
> So I agree to commit the two at least in order to prevent doing
> wrong silently.

I haven't heard any committer's opinion on this one yet.

[1] https://www.postgresql.org/message-id/CA+TgmobUHHZiDR=HCU4n30yi9_PE175itTbFK6T8JxzwkRAWAg@mail.gmail.com
[2] https://www.postgresql.org/message-id/7936.1526590932%40sss.pgh.pa.us

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: Problem while updating a foreign table pointing to a partitioned table on foreign server

From
Kyotaro HORIGUCHI
Date:
Thanks for the discussion.

At Thu, 7 Jun 2018 19:16:57 +0530, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote in
<CAFjFpRd+Bz-DwpnwsY_3uFkALmQgDRTdp_DKhxgm1H20dXs=ow@mail.gmail.com>
> On Tue, Jun 5, 2018 at 3:40 PM, Kyotaro HORIGUCHI
> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> > Hello.
> >
> > At Mon, 04 Jun 2018 20:58:28 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote in <20180604.205828.208262556.horiguchi.kyotaro@lab.ntt.co.jp>
> >
> >> It fails on some join-pushdown cases since it doesn't add tid
> >> columns to join tlist.  I suppose that build_tlist_to_deparse
> >> needs something but I'll consider further tomorrow.
> >
> > I made it work with a few exceptions and bumped.  PARAM_EXEC
> > doesn't work at all in a case where Sort exists between
> > ForeignUpdate and ForeignScan.
> >
> > =====
> > explain (verbose, costs off)
> > update bar set f2 = f2 + 100
> > from
> >   ( select f1 from foo union all select f1+3 from foo ) ss
> > where bar.f1 = ss.f1;
> >                                   QUERY PLAN
> > -----------------------------------------------------------------------------
> >  Update on public.bar
> >    Update on public.bar
> >    Foreign Update on public.bar2
> >      Remote SQL: UPDATE public.loct2 SET f2 = $3 WHERE tableoid = $1 AND ctid = $2
> > ...
> >    ->  Merge Join
> >          Output: bar2.f1, (bar2.f2 + 100), bar2.f3, (ROW(foo.f1))
> >          Merge Cond: (bar2.f1 = foo.f1)
> >          ->  Sort
> >                Output: bar2.f1, bar2.f2, bar2.f3, bar2.tableoid, bar2.ctid
> >                Sort Key: bar2.f1
> >                ->  Foreign Scan on public.bar2
> >                      Output: bar2.f1, bar2.f2, bar2.f3, bar2.tableoid, bar2.ctid
> >                      Remote SQL: SELECT f1, f2, f3, ctid, tableoid FROM public.loct2 FOR UPDATE
> > =====
> 
> What's the problem that you faced?

The required condition for PARAM_EXEC to work is that the executor
ensures the correspondence between the setter and the reader of a
param, as ExecNestLoop does. The Sort node breaks the
correspondence between the tuple obtained from the Foreign Scan
and the tuple that ForeignUpdate is updating. Specifically, Foreign
Update updates the first tuple using the tableoid for the last tuple
from the Foreign Scan.
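
The failure mode can be reproduced in a toy model. What follows is only a sketch and not PostgreSQL code: `RemoteRow`, `LocalTuple`, `param_slot`, `scan_next` and the two `tableoid_seen_*` functions are hypothetical names, and a single global variable stands in for the PARAM_EXEC slot.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; none of these are PostgreSQL types. */
typedef struct { int f1; unsigned remote_tableoid; } RemoteRow; /* row as sent by the remote */
typedef struct { int f1; } LocalTuple;  /* tuple flowing up the plan: no remote tableoid */

/* One slot shared by setter and reader, like a single PARAM_EXEC entry. */
static unsigned param_slot;

/* "Foreign Scan": emit the tuple and, as a side effect, overwrite the slot. */
static LocalTuple scan_next(const RemoteRow *rows, int i)
{
    LocalTuple t;

    param_slot = rows[i].remote_tableoid;
    t.f1 = rows[i].f1;
    return t;
}

static int cmp_f1(const void *a, const void *b)
{
    int x = ((const LocalTuple *) a)->f1;
    int y = ((const LocalTuple *) b)->f1;

    return (x > y) - (x < y);
}

/* Scan feeds the modify node directly: the slot still belongs to the current tuple. */
unsigned tableoid_seen_without_sort(const RemoteRow *rows, int n)
{
    assert(n > 0);
    (void) scan_next(rows, 0);
    return param_slot;
}

/* A Sort in between drains the whole scan before emitting its first tuple. */
unsigned tableoid_seen_after_sort(const RemoteRow *rows, int n)
{
    LocalTuple *buf;
    unsigned seen;
    int i;

    assert(n > 0);
    buf = malloc((size_t) n * sizeof(*buf));
    assert(buf != NULL);
    for (i = 0; i < n; i++)
        buf[i] = scan_next(rows, i);    /* slot is overwritten n times */
    qsort(buf, (size_t) n, sizeof(*buf), cmp_f1);

    /* The modify node now receives buf[0], but the slot holds the last row's value. */
    seen = param_slot;
    free(buf);
    return seen;
}
```

With rows (f1 = 1, tableoid 1001), (2, 1002), (3, 1003), the direct path sees 1001 for the first tuple, while the sorted path sees 1003 for that same tuple.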

> > Even if this worked fine, it cannot be back-patched.  We need
> > extra storage that moves together with the tuples, or a way to
> > prevent sorts or similar nodes from being inserted there.
> 
> I think your approach still has the same problem that it's abusing the
> tableOid field in the heap tuple to store tableoid from the remote as
> well as local table. That's what Robert and Tom objected to [1], [2]

That's a misunderstanding. PARAM_EXEC conveys remote tableoids
outside tuples, and each tuple stores the correct (= local)
tableoid.

> > At Fri, 1 Jun 2018 10:21:39 -0400, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote in <CAFjFpRdraYcQnD4tKzNuP1uP6L-gnizi4HLU_UA=28Q2M4zoDA@mail.gmail.com>
> >> I am not suggesting to commit 0003 in my patch set, but just 0001 and
> >> 0002 which just raise an error when multiple rows get updated when
> >> only one row is expected to be updated.
> >
> > So I agree to commit the two at least in order to prevent doing
> > wrong silently.
> 
> I haven't heard any committer's opinion on this one yet.
> 
> [1] https://www.postgresql.org/message-id/CA+TgmobUHHZiDR=HCU4n30yi9_PE175itTbFK6T8JxzwkRAWAg@mail.gmail.com
> [2] https://www.postgresql.org/message-id/7936.1526590932%40sss.pgh.pa.us

Agreed. We need some comments to proceed.

I have demonstrated and actually shown a problem of the
PARAM_EXEC case. (It seems a bit silly that I actually found the
problem after it became almost workable, though..)  If tuples
were not copied, we could use their addresses to identify them,
but in fact they are copied. (We shouldn't do that anyway.)

A. Just detecting and reporting/erroring the problematic case.

B. Giving Sort-like nodes the ability to convert PARAMs into
   junk columns.

C. Adding space for a 64-bit tuple identifier in the tuple header.

D. Somehow preventing tuple-storing nodes like Sort from being
  placed in between. (This would probably break something that works.)


B seems to have the possibility of fixing this, but I don't have a
concrete design for it yet. With C, I see 2 bits of room in infomask2
and we can use one of the bits to indicate that the tuple has an
extra 64-bit tuple identifier. It could be propagated to the desired
place, but I'm not sure it is in acceptable shape.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Problem while updating a foreign table pointing to a partitioned table on foreign server

From
Ashutosh Bapat
Date:
On Tue, Jun 12, 2018 at 8:49 AM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> Thanks for the discussion.
>
> At Thu, 7 Jun 2018 19:16:57 +0530, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote in <CAFjFpRd+Bz-DwpnwsY_3uFkALmQgDRTdp_DKhxgm1H20dXs=ow@mail.gmail.com>
>> On Tue, Jun 5, 2018 at 3:40 PM, Kyotaro HORIGUCHI
>> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
>> > Hello.
>> >
>> > At Mon, 04 Jun 2018 20:58:28 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote in <20180604.205828.208262556.horiguchi.kyotaro@lab.ntt.co.jp>
>> >
>> >> It fails on some join-pushdown cases since it doesn't add tid
>> >> columns to join tlist.  I suppose that build_tlist_to_deparse
>> >> needs something but I'll consider further tomorrow.
>> >
>> > I made it work with a few exceptions and bumped.  PARAM_EXEC
>> > doesn't work at all in a case where Sort exists between
>> > ForeignUpdate and ForeignScan.
>> >
>> > =====
>> > explain (verbose, costs off)
>> > update bar set f2 = f2 + 100
>> > from
>> >   ( select f1 from foo union all select f1+3 from foo ) ss
>> > where bar.f1 = ss.f1;
>> >                                   QUERY PLAN
>> > -----------------------------------------------------------------------------
>> >  Update on public.bar
>> >    Update on public.bar
>> >    Foreign Update on public.bar2
>> >      Remote SQL: UPDATE public.loct2 SET f2 = $3 WHERE tableoid = $1 AND ctid = $2
>> > ...
>> >    ->  Merge Join
>> >          Output: bar2.f1, (bar2.f2 + 100), bar2.f3, (ROW(foo.f1))
>> >          Merge Cond: (bar2.f1 = foo.f1)
>> >          ->  Sort
>> >                Output: bar2.f1, bar2.f2, bar2.f3, bar2.tableoid, bar2.ctid
>> >                Sort Key: bar2.f1
>> >                ->  Foreign Scan on public.bar2
>> >                      Output: bar2.f1, bar2.f2, bar2.f3, bar2.tableoid, bar2.ctid
>> >                      Remote SQL: SELECT f1, f2, f3, ctid, tableoid FROM public.loct2 FOR UPDATE
>> > =====
>>
>> What's the problem that you faced?
>
> The required condition for PARAM_EXEC to work is that the executor
> ensures the correspondence between the setter and the reader of a
> param, as ExecNestLoop does. The Sort node breaks the
> correspondence between the tuple obtained from the Foreign Scan
> and the tuple that ForeignUpdate is updating. Specifically, Foreign
> Update updates the first tuple using the tableoid for the last tuple
> from the Foreign Scan.

Ok. Thanks for the explanation.

>
>> > Even if this worked fine, it cannot be back-patched.  We need
>> > extra storage that moves together with the tuples, or a way to
>> > prevent sorts or similar nodes from being inserted there.
>>
>> I think your approach still has the same problem that it's abusing the
>> tableOid field in the heap tuple to store tableoid from the remote as
>> well as local table. That's what Robert and Tom objected to [1], [2]
>
> That's a misunderstanding. PARAM_EXEC conveys remote tableoids
> outside tuples, and each tuple stores the correct (= local)
> tableoid.

In the patch I saw that we were setting tableoid field of HeapTuple to
the remote table oid somewhere. Hence the comment. I might be wrong.

>
>> > At Fri, 1 Jun 2018 10:21:39 -0400, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote in <CAFjFpRdraYcQnD4tKzNuP1uP6L-gnizi4HLU_UA=28Q2M4zoDA@mail.gmail.com>
>> >> I am not suggesting to commit 0003 in my patch set, but just 0001 and
>> >> 0002 which just raise an error when multiple rows get updated when
>> >> only one row is expected to be updated.
>> >
>> > So I agree to commit the two at least in order to prevent doing
>> > wrong silently.
>>
>> I haven't heard any committer's opinion on this one yet.
>>
>> [1] https://www.postgresql.org/message-id/CA+TgmobUHHZiDR=HCU4n30yi9_PE175itTbFK6T8JxzwkRAWAg@mail.gmail.com
>> [2] https://www.postgresql.org/message-id/7936.1526590932%40sss.pgh.pa.us
>
> Agreed. We need some comments to proceed.
>
> I have demonstrated and actually shown a problem of the
> PARAM_EXEC case. (It seems a bit silly that I actually found the
> problem after it became almost workable, though..)

I think the general idea behind Tom's suggestion is that we have to
use some node other than a Var node when we update the targetlist with
junk columns. He suggested Param since that gives us some place to
store the remote tableoid. But if that's not working, another idea (that
Tom mentioned during our discussion at PGCon) is to invent a new node
type like ForeignTableOid or something like that, which gets deparsed
to "tableoid" and evaluated to the table oid on the foreign server.
That will not work as it is, since postgres_fdw code treats a foreign
table almost like a local table in many ways, e.g. it uses attrs_used to
know which attributes are to be requested from the foreign server,
build_tlist_to_deparse() only pulls Var nodes from the targetlist of
the foreign table, and so on. All of those assumptions will need to change
with this approach. But the good thing is that, because of join and aggregate
push-down, we already have the ability to push arbitrary kinds of nodes
down to the foreign server through the targetlist. We should be able
to leverage that capability. It looks like a lot of change, which
again doesn't seem to be back-portable.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: Problem while updating a foreign table pointing to a partitioned table on foreign server

From
Kyotaro HORIGUCHI
Date:
Hello.

At Fri, 15 Jun 2018 11:19:21 +0530, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote in
<CAFjFpRd+7h7FrZC1NKLfizXJM=bjyKrh8YezZX7ExjpQdi28Tw@mail.gmail.com>
> On Tue, Jun 12, 2018 at 8:49 AM, Kyotaro HORIGUCHI
> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> > Thanks for the discussion.
> >
> > At Thu, 7 Jun 2018 19:16:57 +0530, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote in <CAFjFpRd+Bz-DwpnwsY_3uFkALmQgDRTdp_DKhxgm1H20dXs=ow@mail.gmail.com>
> >> What's the problem that you faced?
> >
> > The required condition for PARAM_EXEC to work is that the executor
> > ensures the correspondence between the setter and the reader of a
> > param, as ExecNestLoop does. The Sort node breaks the
> > correspondence between the tuple obtained from the Foreign Scan
> > and the tuple that ForeignUpdate is updating. Specifically, Foreign
> > Update updates the first tuple using the tableoid for the last tuple
> > from the Foreign Scan.
> 
> Ok. Thanks for the explanation.
> 
> >
> >> > Even if this worked fine, it cannot be back-patched.  We need
> >> > extra storage that moves together with the tuples, or a way to
> >> > prevent sorts or similar nodes from being inserted there.
> >>
> >> I think your approach still has the same problem that it's abusing the
> >> tableOid field in the heap tuple to store tableoid from the remote as
> >> well as local table. That's what Robert and Tom objected to [1], [2]
> >
> > That's a misunderstanding. PARAM_EXEC conveys remote tableoids
> > outside tuples, and each tuple stores the correct (= local)
> > tableoid.
> 
> In the patch I saw that we were setting tableoid field of HeapTuple to
> the remote table oid somewhere. Hence the comment. I might be wrong.

You probably saw that in make_tuple_from_result_row. The patch stores
the real (remote) tableOid in the returned tuples, since I couldn't
find any other usable storage for the per-tuple value. Afterwards the
parameters are set from tup->t_tableOid in postgresIterateForeignScan.

ForeignNext overwrites t_tableOid of returned tuples with the
foreign table's OID if a system column is requested.

> >> > At Fri, 1 Jun 2018 10:21:39 -0400, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote in <CAFjFpRdraYcQnD4tKzNuP1uP6L-gnizi4HLU_UA=28Q2M4zoDA@mail.gmail.com>
> >> >> I am not suggesting to commit 0003 in my patch set, but just 0001 and
> >> >> 0002 which just raise an error when multiple rows get updated when
> >> >> only one row is expected to be updated.
> >> >
> >> > So I agree to commit the two at least in order to prevent doing
> >> > wrong silently.
> >>
> >> I haven't heard any committer's opinion on this one yet.
> >>
> >> [1] https://www.postgresql.org/message-id/CA+TgmobUHHZiDR=HCU4n30yi9_PE175itTbFK6T8JxzwkRAWAg@mail.gmail.com
> >> [2] https://www.postgresql.org/message-id/7936.1526590932%40sss.pgh.pa.us
> >
> > Agreed. We need some comments to proceed.
> >
> > I have demonstrated and actually shown a problem of the
> > PARAM_EXEC case. (It seems a bit silly that I actually found the
> > problem after it became almost workable, though..)
> 
> I think the general idea behind Tom's suggestion is that we have to
> use some node other than a Var node when we update the targetlist with
> junk columns. He suggested Param since that gives us some place to
> store the remote tableoid. But if that's not working, another idea (that
> Tom mentioned during our discussion at PGCon) is to invent a new node
> type like ForeignTableOid or something like that, which gets deparsed
> to "tableoid" and evaluated to the table oid on the foreign server.
> That will not work as it is, since postgres_fdw code treats a foreign
> table almost like a local table in many ways, e.g. it uses attrs_used to

I think treating a foreign table as a local object is right. But
anyway it doesn't work.

> know which attributes are to be requested from the foreign server,
> build_tlist_to_deparse() only pulls Var nodes from the targetlist of
> the foreign table, and so on. All of those assumptions will need to change
> with this approach.

Maybe. I agree.

> But the good thing is that, because of join and aggregate
> push-down, we already have the ability to push arbitrary kinds of nodes
> down to the foreign server through the targetlist. We should be able
> to leverage that capability. It looks like a lot of change, which
> again doesn't seem to be back-portable.

After some struggles, as you know, I agree with that opinion. My
first impression is that giving (physical) base relations (*1) the
ability to have junk attributes is rather straightforward.

Well, is our conclusion here like this?

- For existing versions, check for the erroneous situation and ERROR out.
  (Documentation will be needed.)

- For 12, we try the above thing.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Problem while updating a foreign table pointing to a partitioned table on foreign server

From
Ashutosh Bapat
Date:
On Tue, Jun 26, 2018 at 9:59 AM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
>
>> But the good thing is that, because of join and aggregate
>> push-down, we already have the ability to push arbitrary kinds of nodes
>> down to the foreign server through the targetlist. We should be able
>> to leverage that capability. It looks like a lot of change, which
>> again doesn't seem to be back-portable.
>
> After some struggles, as you know, I agree with that opinion. My
> first impression is that giving (physical) base relations (*1) the
> ability to have junk attributes is rather straightforward.

By giving base relations the ability to have junk attributes, you mean
adding a junk attribute to the targetlist of the DML, something like
postgresAddForeignUpdateTargets(). You seem to be fine with the new
node approach described above. Please confirm.

>
> Well, is our conclusion here like this?
>
> - For existing versions, check for the erroneous situation and ERROR out.
>   (Documentation will be needed.)
>
> - For 12, we try the above thing.

I think we have to see how invasive the fix is, and whether it's
back-portable. If it's back-portable, we back-port it and the problem
is fixed in previous versions as well. If not, we fix previous
versions to ERROR out instead of corrupting the database.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: Problem while updating a foreign table pointing to a partitioned table on foreign server

From
Etsuro Fujita
Date:
(2018/06/12 12:19), Kyotaro HORIGUCHI wrote:
> I have demonstrated and actually shown a problem of the
> PARAM_EXEC case.

> A. Just detecting and reporting/erroring the problematic case.
>
> B. Giving to Sort-like nodes an ability to convert PARAMS into
>     junk columns.
>
> C. Adding a space for 64bit tuple identifier in a tuple header.
>
> D. Somehow inhibiting tuple-storing node like Sort between. (This
>    should break something working.)
>
>
> B seems to have possibility to fix this but I haven't have a
> concrete design of it.

I'm just wondering whether we could modify the planner (or executor) so 
that Params can propagate up to the ModifyTable node through all joins 
like Vars/PHVs.

Best regards,
Etsuro Fujita


Re: Problem while updating a foreign table pointing to a partitioned table on foreign server

From
Kyotaro HORIGUCHI
Date:
Hello, thank you for the comment.

At Wed, 01 Aug 2018 21:21:57 +0900, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote in
<5B61A5E5.6010707@lab.ntt.co.jp>
> (2018/06/12 12:19), Kyotaro HORIGUCHI wrote:
> > I have demonstrated and actually shown a problem of the
> > PARAM_EXEC case.
> 
> > A. Just detecting and reporting/erroring the problematic case.
> >
> > B. Giving to Sort-like nodes an ability to convert PARAMS into
> >     junk columns.
> >
> > C. Adding a space for 64bit tuple identifier in a tuple header.
> >
> > D. Somehow inhibiting tuple-storing node like Sort between. (This
> >    should break something working.)
> >
> >
> > B seems to have possibility to fix this but I haven't have a
> > concrete design of it.
> 
> I'm just wondering whether we could modify the planner (or executor)
> so that Params can propagate up to the ModifyTable node through all
> joins like Vars/PHVs.

Yeah, it's mentioned somewhere upthread. The largest obstacle in
my view is the fact that the tuple descriptor for an RTE_RELATION
baserel is tied to the relation definition. So we need to
separate the two in order to use "(junk) Vars/PHVs" for that
purpose. The four options above are based on the premise of
PARAM_EXEC.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Problem while updating a foreign table pointing to a partitioned table on foreign server

From
Kyotaro HORIGUCHI
Date:
Hello.

At Tue, 26 Jun 2018 10:19:45 +0530, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote in
<CAFjFpRdZeiOwW+Ahj2xKACdmtirC8HzwLFNgGn4=dSsLpP8ADw@mail.gmail.com>
> On Tue, Jun 26, 2018 at 9:59 AM, Kyotaro HORIGUCHI
> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> >
> >> But good thing is because of join and aggregate
> >> push-down we already have ability to push arbitrary kinds of nodes
> >> down to the foreign server through the targetlist. We should be able
> >> to leverage that capability. It looks like a lot of change, which
> >> again doesn't seem to be back-portable.
> >
> > After some struggles as you know, I agree to the opinion. As my
> > first impression, giving (physical) base relations (*1) an
> > ability to have junk attribute is rather straightforward.
> 
> By giving base relations an ability to have junk attribute you mean to
> add junk attribute in the targetlist of DML, something like
> postgresAddForeignUpdateTargets(). You seem to be fine with the new

Maybe.

> node approach described above. Just confirm.

A something-like-but-other-than-Var node? I'm not sure it is needed,
because whatever node we add to the relation tlist, we must add
its counterpart to the relation descriptor. And if we do that,
a Var is enough to point to it. (Am I understanding correctly?)

> >
> > Well, is our conclusion here like this?
> >
> > - For existing versions, check the erroneous situation and ERROR out.
> >   (documentation will be needed.)
> >
> > - For 12, we try the above thing.
> 
> I think we have to see how invasive the fix is, and whether it's
> back-portable. If it's back-portable, we back-port it and the problem
> is fixed in previous versions as well. If not, we fix previous
> versions to ERROR out instead of corrupting the database.

Mmm. OK, I'll try to make a patch. Please wait a while.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Problem while updating a foreign table pointing to a partitioned table on foreign server

From
Ashutosh Bapat
Date:
On Fri, Aug 3, 2018 at 9:43 AM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
>
> Something-like-but-other-than-Var node? I'm not sure it is needed,
> because whatever node we add to the relation-tlist, we must add
> the correspondence to the relation descriptor. And if we do that,
> a Var works to point it. (Am I correctly understanding?)
>

The purpose of the non-Var node is to avoid adding the attribute to
the relation descriptor. The idea is to create a new node that acts
as a placeholder for the table OID or row ID (whatever) to be fetched
from the foreign server. I don't understand why you think it needs to
be added to the relation descriptor.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: Problem while updating a foreign table pointing to a partitioned table on foreign server

From
Kyotaro HORIGUCHI
Date:
Hello. Please find the attached.

At Fri, 3 Aug 2018 11:48:38 +0530, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote in
<CAFjFpRcF-j+B8W8o-wrvOguA0=r8SJ-rCrzWAnHT2V66NxGfFQ@mail.gmail.com>
> The purpose of non-Var node is to avoid adding the attribute to
> relation descriptor. Idea is to create a new node, which will act as a
> place holder for table oid or row id (whatever) to be fetched from the
> foreign server. I don't understand why do you think we need it to be
> added to the relation descriptor.

I chose to expand the tuple descriptor to hold the junk column added
to foreign relations. It might be better to have a new member in
ForeignScan, but I avoided that so that we can back-patch it.

What the patch does:

- This abuses ForeignScan->fdw_scan_tlist to store the additional
  junk columns for a foreign scan on a simple relation (that is,
  not a join).

  Several places seem to assume that fdw_scan_tlist may be used
  for a foreign scan on a simple relation, but I didn't find that
  this actually happens. This lets us avoid adding new data members
  to core data structures. A separate member would be preferable
  for the new version.

- The remote OID request is added to the targetlist as a non-system
  junk column. get_relation_info expands the per-column storage
  when creating the RelOptInfo so that the additional junk columns
  can be handled.

- ExecInitForeignScan is changed so that it expands the created
  tuple descriptor if it finds junk columns stored in
  fdw_scan_tlist, so that make_tuple_from_result_row can store
  them. (ExecEvalWholeRowVar needed to be modified so that it
  ignores the expanded portion of the tuple descriptor.)

I'm not sure whether the following points are valid.

- If fdw_scan_tlist is used for simple relation scans, this would
  break that case. (ExecInitForeignScan, set_foreignscan_references)

- I'm using the name "tableoid" for the junk column, but that name
  can also be used in a user query. The two point to different
  targets, so it doesn't matter at all, except that it makes the
  EXPLAIN output a bit confusing.

- The EXPLAIN code has no clue about the name of the added junk
  column. It is shown as <added_junk> in EXPLAIN output.

| Update on public.fp
|   Remote SQL: UPDATE public.p SET b = $3 WHERE tableoid = $1 AND ctid = $2
|   ->  Foreign Scan on public.fp
|         Output: a, (b + 1), "<added_junk>", ctid
|         Filter: (random() <= '1'::double precision)
|         Remote SQL: SELECT a, b, tableoid AS __remote_tableoid, ctid
|                     FROM public.p WHERE ((a = 0)) FOR UPDATE
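
To make the effect of the extra qualification concrete, here is a
minimal sketch that can be run directly on the remote server. It
assumes an inheritance setup like the one in the attached regression
test (p1 with children c1 and c2, one row each, so both rows sit at
ctid (0,1)). The literal 'c1'::regclass only stands in for the OID that
the foreign scan would fetch; it is an illustration, not the statement
postgres_fdw actually sends.

```sql
-- Assumed setup, mirroring the regression test: two children of p1,
-- one row each, so the two rows share the same ctid.
CREATE TABLE p1 (a int, b int);
CREATE TABLE c1 () INHERITS (p1);
CREATE TABLE c2 () INHERITS (p1);
INSERT INTO c1 VALUES (0, 1);
INSERT INTO c2 VALUES (1, 1);

-- ctid alone is ambiguous across the hierarchy; (tableoid, ctid) is not.
SELECT tableoid::regclass, ctid, * FROM p1;

-- Qualifying on ctid only matches a row in every child that has a
-- tuple at (0,1), which is the bug reported at the top of this thread.
BEGIN;
UPDATE p1 SET b = b + 1 WHERE ctid = '(0,1)';
ROLLBACK;

-- Adding the tableoid qualification restricts the update to the child
-- the row was actually read from.
UPDATE p1 SET b = b + 1
 WHERE tableoid = 'c1'::regclass AND ctid = '(0,1)';
```

The ctid-only UPDATE is wrapped in BEGIN/ROLLBACK so that it does not
move the rows before the second statement runs.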

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From fe660a5ab953d68a671861479fce0b3e60a57cd8 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Wed, 8 Aug 2018 12:14:58 +0900
Subject: [PATCH 2/2] Regression test for update/delete on foreign partitioned
 table

Add test for foreign update on remote partitioned tables.
---
 contrib/postgres_fdw/expected/postgres_fdw.out | 221 ++++++++++++++++---------
 contrib/postgres_fdw/sql/postgres_fdw.sql      |  27 +++
 2 files changed, 167 insertions(+), 81 deletions(-)

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index f5498c62bd..9ae329ab4f 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -5497,15 +5497,15 @@ INSERT INTO ft2 (c1,c2,c3)
   SELECT id, id % 10, to_char(id, 'FM00000') FROM generate_series(2001, 2010) id;
 EXPLAIN (verbose, costs off)
 UPDATE ft2 SET c3 = 'bar' WHERE postgres_fdw_abs(c1) > 2000 RETURNING *;            -- can't be pushed down
-                                                QUERY PLAN                                                
-----------------------------------------------------------------------------------------------------------
+                                                         QUERY PLAN
+----------------------------------------------------------------------------------------------------------------------------
  Update on public.ft2
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: UPDATE "S 1"."T 1" SET c3 = $2 WHERE ctid = $1 RETURNING "C 1", c2, c3, c4, c5, c6, c7, c8
+   Remote SQL: UPDATE "S 1"."T 1" SET c3 = $3 WHERE tableoid = $1 AND ctid = $2 RETURNING "C 1", c2, c3, c4, c5, c6, c7, c8
    ->  Foreign Scan on public.ft2
-         Output: c1, c2, NULL::integer, 'bar'::text, c4, c5, c6, c7, c8, ctid
+         Output: c1, c2, NULL::integer, 'bar'::text, c4, c5, c6, c7, c8, "<added_junk>", ctid
          Filter: (postgres_fdw_abs(ft2.c1) > 2000)
-         Remote SQL: SELECT "C 1", c2, c4, c5, c6, c7, c8, ctid FROM "S 1"."T 1" FOR UPDATE
+         Remote SQL: SELECT "C 1", c2, c4, c5, c6, c7, c8, tableoid, ctid FROM "S 1"."T 1" FOR UPDATE
 (7 rows)
 
 UPDATE ft2 SET c3 = 'bar' WHERE postgres_fdw_abs(c1) > 2000 RETURNING *;
@@ -5532,13 +5532,13 @@ UPDATE ft2 SET c3 = 'baz'

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Update on public.ft2
    Output: ft2.c1, ft2.c2, ft2.c3, ft2.c4, ft2.c5, ft2.c6, ft2.c7, ft2.c8, ft4.c1, ft4.c2, ft4.c3, ft5.c1, ft5.c2, ft5.c3
-   Remote SQL: UPDATE "S 1"."T 1" SET c3 = $2 WHERE ctid = $1 RETURNING "C 1", c2, c3, c4, c5, c6, c7, c8
+   Remote SQL: UPDATE "S 1"."T 1" SET c3 = $3 WHERE tableoid = $1 AND ctid = $2 RETURNING "C 1", c2, c3, c4, c5, c6, c7, c8
    ->  Nested Loop
-         Output: ft2.c1, ft2.c2, NULL::integer, 'baz'::text, ft2.c4, ft2.c5, ft2.c6, ft2.c7, ft2.c8, ft2.ctid, ft4.*, ft5.*, ft4.c1, ft4.c2, ft4.c3, ft5.c1, ft5.c2, ft5.c3
+         Output: ft2.c1, ft2.c2, NULL::integer, 'baz'::text, ft2.c4, ft2.c5, ft2.c6, ft2.c7, ft2.c8, ft2."<added_junk>", ft2.ctid, ft4.*, ft5.*, ft4.c1, ft4.c2, ft4.c3, ft5.c1, ft5.c2, ft5.c3
          Join Filter: (ft2.c2 === ft4.c1)
          ->  Foreign Scan on public.ft2
-               Output: ft2.c1, ft2.c2, ft2.c4, ft2.c5, ft2.c6, ft2.c7, ft2.c8, ft2.ctid
-               Remote SQL: SELECT "C 1", c2, c4, c5, c6, c7, c8, ctid FROM "S 1"."T 1" WHERE (("C 1" > 2000)) FOR UPDATE
+               Output: ft2.c1, ft2.c2, ft2.c4, ft2.c5, ft2.c6, ft2.c7, ft2.c8, ft2."<added_junk>", ft2.ctid
+               Remote SQL: SELECT "C 1", c2, c4, c5, c6, c7, c8, tableoid, ctid FROM "S 1"."T 1" WHERE (("C 1" > 2000)) FOR UPDATE
          ->  Foreign Scan
                Output: ft4.*, ft4.c1, ft4.c2, ft4.c3, ft5.*, ft5.c1, ft5.c2, ft5.c3
                Relations: (public.ft4) INNER JOIN (public.ft5)
@@ -5570,24 +5570,24 @@ DELETE FROM ft2
   USING ft4 INNER JOIN ft5 ON (ft4.c1 === ft5.c1)
   WHERE ft2.c1 > 2000 AND ft2.c2 = ft4.c1
   RETURNING ft2.c1, ft2.c2, ft2.c3;       -- can't be pushed down
-                                             QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+                                                   QUERY PLAN
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Delete on public.ft2
    Output: ft2.c1, ft2.c2, ft2.c3
-   Remote SQL: DELETE FROM "S 1"."T 1" WHERE ctid = $1 RETURNING "C 1", c2, c3
+   Remote SQL: DELETE FROM "S 1"."T 1" WHERE tableoid = $1 AND ctid = $2 RETURNING "C 1", c2, c3
    ->  Foreign Scan
-         Output: ft2.ctid, ft4.*, ft5.*
+         Output: ft2."<added_junk>", ft2.ctid, ft4.*, ft5.*
          Filter: (ft4.c1 === ft5.c1)
          Relations: ((public.ft2) INNER JOIN (public.ft4)) INNER JOIN (public.ft5)
-         Remote SQL: SELECT r1.ctid, CASE WHEN (r2.*)::text IS NOT NULL THEN ROW(r2.c1, r2.c2, r2.c3) END, CASE WHEN (r3.*)::text IS NOT NULL THEN ROW(r3.c1, r3.c2, r3.c3) END, r2.c1, r3.c1 FROM (("S 1"."T 1" r1 INNER JOIN "S 1"."T 3" r2 ON (((r1.c2 = r2.c1)) AND ((r1."C 1" > 2000)))) INNER JOIN "S 1"."T 4" r3 ON (TRUE)) FOR UPDATE OF r1
+         Remote SQL: SELECT r1.tableoid, r1.ctid, CASE WHEN (r2.*)::text IS NOT NULL THEN ROW(r2.c1, r2.c2, r2.c3) END, CASE WHEN (r3.*)::text IS NOT NULL THEN ROW(r3.c1, r3.c2, r3.c3) END, r2.c1, r3.c1 FROM (("S 1"."T 1" r1 INNER JOIN "S 1"."T 3" r2 ON (((r1.c2 = r2.c1)) AND ((r1."C 1" > 2000)))) INNER JOIN "S 1"."T 4" r3 ON (TRUE)) FOR UPDATE OF r1
          ->  Nested Loop
-               Output: ft2.ctid, ft4.*, ft5.*, ft4.c1, ft5.c1
+               Output: ft2."<added_junk>", ft2.ctid, ft4.*, ft5.*, ft4.c1, ft5.c1
                ->  Nested Loop
-                     Output: ft2.ctid, ft4.*, ft4.c1
+                     Output: ft2."<added_junk>", ft2.ctid, ft4.*, ft4.c1
                      Join Filter: (ft2.c2 = ft4.c1)
                      ->  Foreign Scan on public.ft2
-                           Output: ft2.ctid, ft2.c2
-                           Remote SQL: SELECT c2, ctid FROM "S 1"."T 1" WHERE (("C 1" > 2000)) FOR UPDATE
+                           Output: ft2."<added_junk>", ft2.ctid, ft2.c2
+                           Remote SQL: SELECT c2, tableoid, ctid FROM "S 1"."T 1" WHERE (("C 1" > 2000)) FOR UPDATE
                      ->  Foreign Scan on public.ft4
                            Output: ft4.*, ft4.c1
                            Remote SQL: SELECT c1, c2, c3 FROM "S 1"."T 3"
@@ -6229,13 +6229,13 @@ SELECT * FROM foreign_tbl;
 
 EXPLAIN (VERBOSE, COSTS OFF)
 UPDATE rw_view SET b = b + 5;
-                                      QUERY PLAN                                       
----------------------------------------------------------------------------------------
+                                            QUERY PLAN                                            
+--------------------------------------------------------------------------------------------------
  Update on public.foreign_tbl
-   Remote SQL: UPDATE public.base_tbl SET b = $2 WHERE ctid = $1 RETURNING a, b
+   Remote SQL: UPDATE public.base_tbl SET b = $3 WHERE tableoid = $1 AND ctid = $2 RETURNING a, b
    ->  Foreign Scan on public.foreign_tbl
-         Output: foreign_tbl.a, (foreign_tbl.b + 5), foreign_tbl.ctid
-         Remote SQL: SELECT a, b, ctid FROM public.base_tbl WHERE ((a < b)) FOR UPDATE
+         Output: foreign_tbl.a, (foreign_tbl.b + 5), foreign_tbl."<added_junk>", foreign_tbl.ctid
+         Remote SQL: SELECT a, b, tableoid, ctid FROM public.base_tbl WHERE ((a < b)) FOR UPDATE
 (5 rows)
 
 UPDATE rw_view SET b = b + 5; -- should fail
@@ -6243,13 +6243,13 @@ ERROR:  new row violates check option for view "rw_view"
 DETAIL:  Failing row contains (20, 20).
 EXPLAIN (VERBOSE, COSTS OFF)
 UPDATE rw_view SET b = b + 15;
-                                      QUERY PLAN                                       
----------------------------------------------------------------------------------------
+                                            QUERY PLAN                                             
+---------------------------------------------------------------------------------------------------
  Update on public.foreign_tbl
-   Remote SQL: UPDATE public.base_tbl SET b = $2 WHERE ctid = $1 RETURNING a, b
+   Remote SQL: UPDATE public.base_tbl SET b = $3 WHERE tableoid = $1 AND ctid = $2 RETURNING a, b
    ->  Foreign Scan on public.foreign_tbl
-         Output: foreign_tbl.a, (foreign_tbl.b + 15), foreign_tbl.ctid
-         Remote SQL: SELECT a, b, ctid FROM public.base_tbl WHERE ((a < b)) FOR UPDATE
+         Output: foreign_tbl.a, (foreign_tbl.b + 15), foreign_tbl."<added_junk>", foreign_tbl.ctid
+         Remote SQL: SELECT a, b, tableoid, ctid FROM public.base_tbl WHERE ((a < b)) FOR UPDATE
 (5 rows)
 
 UPDATE rw_view SET b = b + 15; -- ok
@@ -6316,14 +6316,14 @@ SELECT * FROM foreign_tbl;
 
 EXPLAIN (VERBOSE, COSTS OFF)
 UPDATE rw_view SET b = b + 5;
-                                       QUERY PLAN                                       
-----------------------------------------------------------------------------------------
+                                             QUERY PLAN                                              
+-----------------------------------------------------------------------------------------------------
  Update on public.parent_tbl
    Foreign Update on public.foreign_tbl
-     Remote SQL: UPDATE public.child_tbl SET b = $2 WHERE ctid = $1 RETURNING a, b
+     Remote SQL: UPDATE public.child_tbl SET b = $3 WHERE tableoid = $1 AND ctid = $2 RETURNING a, b
    ->  Foreign Scan on public.foreign_tbl
-         Output: foreign_tbl.a, (foreign_tbl.b + 5), foreign_tbl.ctid
-         Remote SQL: SELECT a, b, ctid FROM public.child_tbl WHERE ((a < b)) FOR UPDATE
+         Output: foreign_tbl.a, (foreign_tbl.b + 5), foreign_tbl."<added_junk>", foreign_tbl.ctid
+         Remote SQL: SELECT a, b, tableoid, ctid FROM public.child_tbl WHERE ((a < b)) FOR UPDATE
 (6 rows)
 
 UPDATE rw_view SET b = b + 5; -- should fail
@@ -6331,14 +6331,14 @@ ERROR:  new row violates check option for view "rw_view"
 DETAIL:  Failing row contains (20, 20).
 EXPLAIN (VERBOSE, COSTS OFF)
 UPDATE rw_view SET b = b + 15;
-                                       QUERY PLAN                                       
-----------------------------------------------------------------------------------------
+                                             QUERY PLAN                                              
+-----------------------------------------------------------------------------------------------------
  Update on public.parent_tbl
    Foreign Update on public.foreign_tbl
-     Remote SQL: UPDATE public.child_tbl SET b = $2 WHERE ctid = $1 RETURNING a, b
+     Remote SQL: UPDATE public.child_tbl SET b = $3 WHERE tableoid = $1 AND ctid = $2 RETURNING a, b
    ->  Foreign Scan on public.foreign_tbl
-         Output: foreign_tbl.a, (foreign_tbl.b + 15), foreign_tbl.ctid
-         Remote SQL: SELECT a, b, ctid FROM public.child_tbl WHERE ((a < b)) FOR UPDATE
+         Output: foreign_tbl.a, (foreign_tbl.b + 15), foreign_tbl."<added_junk>", foreign_tbl.ctid
+         Remote SQL: SELECT a, b, tableoid, ctid FROM public.child_tbl WHERE ((a < b)) FOR UPDATE
 (6 rows)
 
 UPDATE rw_view SET b = b + 15; -- ok
@@ -6808,13 +6808,13 @@ BEFORE UPDATE ON rem1
 FOR EACH ROW EXECUTE PROCEDURE trigger_data(23,'skidoo');
 EXPLAIN (verbose, costs off)
 UPDATE rem1 set f2 = '';          -- can't be pushed down
-                             QUERY PLAN                              
----------------------------------------------------------------------
+                                   QUERY PLAN                                   
+--------------------------------------------------------------------------------
  Update on public.rem1
-   Remote SQL: UPDATE public.loc1 SET f2 = $2 WHERE ctid = $1
+   Remote SQL: UPDATE public.loc1 SET f2 = $3 WHERE tableoid = $1 AND ctid = $2
    ->  Foreign Scan on public.rem1
-         Output: f1, ''::text, ctid, rem1.*
-         Remote SQL: SELECT f1, f2, ctid FROM public.loc1 FOR UPDATE
+         Output: f1, ''::text, "<added_junk>", ctid, rem1.*
+         Remote SQL: SELECT f1, f2, tableoid, ctid FROM public.loc1 FOR UPDATE
 (5 rows)
 
 EXPLAIN (verbose, costs off)
@@ -6832,13 +6832,13 @@ AFTER UPDATE ON rem1
 FOR EACH ROW EXECUTE PROCEDURE trigger_data(23,'skidoo');
 EXPLAIN (verbose, costs off)
 UPDATE rem1 set f2 = '';          -- can't be pushed down
-                                  QUERY PLAN                                   
--------------------------------------------------------------------------------
+                                           QUERY PLAN                                            
+-------------------------------------------------------------------------------------------------
  Update on public.rem1
-   Remote SQL: UPDATE public.loc1 SET f2 = $2 WHERE ctid = $1 RETURNING f1, f2
+   Remote SQL: UPDATE public.loc1 SET f2 = $3 WHERE tableoid = $1 AND ctid = $2 RETURNING f1, f2
    ->  Foreign Scan on public.rem1
-         Output: f1, ''::text, ctid, rem1.*
-         Remote SQL: SELECT f1, f2, ctid FROM public.loc1 FOR UPDATE
+         Output: f1, ''::text, "<added_junk>", ctid, rem1.*
+         Remote SQL: SELECT f1, f2, tableoid, ctid FROM public.loc1 FOR UPDATE
 (5 rows)
 
 EXPLAIN (verbose, costs off)
@@ -6866,13 +6866,13 @@ UPDATE rem1 set f2 = '';          -- can be pushed down
 
 EXPLAIN (verbose, costs off)
 DELETE FROM rem1;                 -- can't be pushed down
-                             QUERY PLAN                              
----------------------------------------------------------------------
+                                  QUERY PLAN                                   
+-------------------------------------------------------------------------------
  Delete on public.rem1
-   Remote SQL: DELETE FROM public.loc1 WHERE ctid = $1
+   Remote SQL: DELETE FROM public.loc1 WHERE tableoid = $1 AND ctid = $2
    ->  Foreign Scan on public.rem1
-         Output: ctid, rem1.*
-         Remote SQL: SELECT f1, f2, ctid FROM public.loc1 FOR UPDATE
+         Output: "<added_junk>", ctid, rem1.*
+         Remote SQL: SELECT f1, f2, tableoid, ctid FROM public.loc1 FOR UPDATE
 (5 rows)
 
 DROP TRIGGER trig_row_before_delete ON rem1;
@@ -6890,13 +6890,13 @@ UPDATE rem1 set f2 = '';          -- can be pushed down
 
 EXPLAIN (verbose, costs off)
 DELETE FROM rem1;                 -- can't be pushed down
-                               QUERY PLAN                               
-------------------------------------------------------------------------
+                                        QUERY PLAN                                        
+------------------------------------------------------------------------------------------
  Delete on public.rem1
-   Remote SQL: DELETE FROM public.loc1 WHERE ctid = $1 RETURNING f1, f2
+   Remote SQL: DELETE FROM public.loc1 WHERE tableoid = $1 AND ctid = $2 RETURNING f1, f2
    ->  Foreign Scan on public.rem1
-         Output: ctid, rem1.*
-         Remote SQL: SELECT f1, f2, ctid FROM public.loc1 FOR UPDATE
+         Output: "<added_junk>", ctid, rem1.*
+         Remote SQL: SELECT f1, f2, tableoid, ctid FROM public.loc1 FOR UPDATE
 (5 rows)
 
 DROP TRIGGER trig_row_after_delete ON rem1;
@@ -7147,12 +7147,12 @@ select * from bar where f1 in (select f1 from foo) for share;
 -- Check UPDATE with inherited target and an inherited source table
 explain (verbose, costs off)
 update bar set f2 = f2 + 100 where f1 in (select f1 from foo);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                    QUERY PLAN                                                    
+------------------------------------------------------------------------------------------------------------------
  Update on public.bar
    Update on public.bar
    Foreign Update on public.bar2
-     Remote SQL: UPDATE public.loct2 SET f2 = $2 WHERE ctid = $1
+     Remote SQL: UPDATE public.loct2 SET f2 = $3 WHERE tableoid = $1 AND ctid = $2
    ->  Hash Join
          Output: bar.f1, (bar.f2 + 100), bar.ctid, foo.ctid, foo.*, foo.tableoid
          Inner Unique: true
@@ -7171,12 +7171,12 @@ update bar set f2 = f2 + 100 where f1 in (select f1 from foo);
                                  Output: foo2.ctid, foo2.*, foo2.tableoid, foo2.f1
                                  Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct1
    ->  Hash Join
-         Output: bar2.f1, (bar2.f2 + 100), bar2.f3, bar2.ctid, foo.ctid, foo.*, foo.tableoid
+         Output: bar2.f1, (bar2.f2 + 100), bar2.f3, bar2."<added_junk>", bar2.ctid, foo.ctid, foo.*, foo.tableoid
          Inner Unique: true
          Hash Cond: (bar2.f1 = foo.f1)
          ->  Foreign Scan on public.bar2
-               Output: bar2.f1, bar2.f2, bar2.f3, bar2.ctid
-               Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct2 FOR UPDATE
+               Output: bar2.f1, bar2.f2, bar2.f3, bar2."<added_junk>", bar2.ctid
+               Remote SQL: SELECT f1, f2, f3, tableoid, ctid FROM public.loct2 FOR UPDATE
          ->  Hash
                Output: foo.ctid, foo.*, foo.tableoid, foo.f1
                ->  HashAggregate
@@ -7208,12 +7208,12 @@ update bar set f2 = f2 + 100
 from
   ( select f1 from foo union all select f1+3 from foo ) ss
 where bar.f1 = ss.f1;
-                                      QUERY PLAN                                      
---------------------------------------------------------------------------------------
+                                            QUERY PLAN                                            
+--------------------------------------------------------------------------------------------------
  Update on public.bar
    Update on public.bar
    Foreign Update on public.bar2
-     Remote SQL: UPDATE public.loct2 SET f2 = $2 WHERE ctid = $1
+     Remote SQL: UPDATE public.loct2 SET f2 = $3 WHERE tableoid = $1 AND ctid = $2
    ->  Hash Join
          Output: bar.f1, (bar.f2 + 100), bar.ctid, (ROW(foo.f1))
          Hash Cond: (foo.f1 = bar.f1)
@@ -7233,14 +7233,14 @@ where bar.f1 = ss.f1;
                ->  Seq Scan on public.bar
                      Output: bar.f1, bar.f2, bar.ctid
    ->  Merge Join
-         Output: bar2.f1, (bar2.f2 + 100), bar2.f3, bar2.ctid, (ROW(foo.f1))
+         Output: bar2.f1, (bar2.f2 + 100), bar2.f3, bar2."<added_junk>", bar2.ctid, (ROW(foo.f1))
          Merge Cond: (bar2.f1 = foo.f1)
          ->  Sort
-               Output: bar2.f1, bar2.f2, bar2.f3, bar2.ctid
+               Output: bar2.f1, bar2.f2, bar2.f3, bar2."<added_junk>", bar2.ctid
                Sort Key: bar2.f1
                ->  Foreign Scan on public.bar2
-                     Output: bar2.f1, bar2.f2, bar2.f3, bar2.ctid
-                     Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct2 FOR UPDATE
+                     Output: bar2.f1, bar2.f2, bar2.f3, bar2."<added_junk>", bar2.ctid
+                     Remote SQL: SELECT f1, f2, f3, tableoid, ctid FROM public.loct2 FOR UPDATE
          ->  Sort
                Output: (ROW(foo.f1)), foo.f1
                Sort Key: foo.f1
@@ -7438,17 +7438,17 @@ AFTER UPDATE OR DELETE ON bar2
 FOR EACH ROW EXECUTE PROCEDURE trigger_data(23,'skidoo');
 explain (verbose, costs off)
 update bar set f2 = f2 + 100;
-                                      QUERY PLAN                                      
---------------------------------------------------------------------------------------
+                                               QUERY PLAN                                               
+--------------------------------------------------------------------------------------------------------
  Update on public.bar
    Update on public.bar
    Foreign Update on public.bar2
-     Remote SQL: UPDATE public.loct2 SET f2 = $2 WHERE ctid = $1 RETURNING f1, f2, f3
+     Remote SQL: UPDATE public.loct2 SET f2 = $3 WHERE tableoid = $1 AND ctid = $2 RETURNING f1, f2, f3
    ->  Seq Scan on public.bar
          Output: bar.f1, (bar.f2 + 100), bar.ctid
    ->  Foreign Scan on public.bar2
-         Output: bar2.f1, (bar2.f2 + 100), bar2.f3, bar2.ctid, bar2.*
-         Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct2 FOR UPDATE
+         Output: bar2.f1, (bar2.f2 + 100), bar2.f3, bar2."<added_junk>", bar2.ctid, bar2.*
+         Remote SQL: SELECT f1, f2, f3, tableoid, ctid FROM public.loct2 FOR UPDATE
 (9 rows)
 
 update bar set f2 = f2 + 100;
@@ -7466,18 +7466,18 @@ NOTICE:  trig_row_after(23, skidoo) AFTER ROW UPDATE ON bar2
 NOTICE:  OLD: (7,277,77),NEW: (7,377,77)
 explain (verbose, costs off)
 delete from bar where f2 < 400;
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                              QUERY PLAN                                               
+-------------------------------------------------------------------------------------------------------
  Delete on public.bar
    Delete on public.bar
    Foreign Delete on public.bar2
-     Remote SQL: DELETE FROM public.loct2 WHERE ctid = $1 RETURNING f1, f2, f3
+     Remote SQL: DELETE FROM public.loct2 WHERE tableoid = $1 AND ctid = $2 RETURNING f1, f2, f3
    ->  Seq Scan on public.bar
          Output: bar.ctid
          Filter: (bar.f2 < 400)
    ->  Foreign Scan on public.bar2
-         Output: bar2.ctid, bar2.*
-         Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct2 WHERE ((f2 < 400)) FOR UPDATE
+         Output: bar2."<added_junk>", bar2.ctid, bar2.*
+         Remote SQL: SELECT f1, f2, f3, tableoid, ctid FROM public.loct2 WHERE ((f2 < 400)) FOR UPDATE
 (10 rows)
 
 delete from bar where f2 < 400;
@@ -7568,6 +7568,65 @@ drop table loct1;
 drop table loct2;
 drop table parent;
 -- ===================================================================
+-- test update foreign partiton table
+-- ===================================================================
+CREATE TABLE p1 (a int, b int);
+CREATE TABLE c1 (LIKE p1) INHERITS (p1);
+NOTICE:  merging column "a" with inherited definition
+NOTICE:  merging column "b" with inherited definition
+CREATE TABLE c2 (LIKE p1) INHERITS (p1);
+NOTICE:  merging column "a" with inherited definition
+NOTICE:  merging column "b" with inherited definition
+CREATE FOREIGN TABLE fp1 (a int, b int)
+ SERVER loopback OPTIONS (table_name 'p1');
+INSERT INTO c1 VALUES (0, 1);
+INSERT INTO c2 VALUES (1, 1);
+SELECT tableoid, ctid, * FROM fp1;
+ tableoid | ctid  | a | b 
+----------+-------+---+---
+    16638 | (0,1) | 0 | 1
+    16638 | (0,1) | 1 | 1
+(2 rows)
+
+-- random() causes non-direct foreign update
+EXPLAIN VERBOSE UPDATE fp1 SET b = b + 1 WHERE a = 0 and random() <= 1;
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
+ Update on public.fp1  (cost=100.00..144.31 rows=3 width=18)
+   Remote SQL: UPDATE public.p1 SET b = $3 WHERE tableoid = $1 AND ctid = $2
+   ->  Foreign Scan on public.fp1  (cost=100.00..144.31 rows=3 width=18)
+         Output: a, (b + 1), "<added_junk>", ctid
+         Filter: (random() <= '1'::double precision)
+         Remote SQL: SELECT a, b, tableoid, ctid FROM public.p1 WHERE ((a = 0)) FOR UPDATE
+(6 rows)
+
+UPDATE fp1 SET b = b + 1 WHERE a = 0 and random() <= 1;
+SELECT tableoid, ctid, * FROM fp1; -- Only one tuple should be updated
+ tableoid | ctid  | a | b 
+----------+-------+---+---
+    16638 | (0,2) | 0 | 2
+    16638 | (0,1) | 1 | 1
+(2 rows)
+
+-- Reset ctid
+TRUNCATE c1;
+TRUNCATE c2;
+INSERT INTO c1 VALUES (0, 1);
+INSERT INTO c2 VALUES (1, 1);
+DELETE FROM fp1 WHERE a = 1 and random() <= 1;
+SELECT tableoid, ctid, * FROM fp1; -- Only one tuple should be deleted
+ tableoid | ctid  | a | b 
+----------+-------+---+---
+    16638 | (0,1) | 0 | 1
+(1 row)
+
+-- cleanup
+DROP FOREIGN TABLE fp1;
+DROP TABLE p1 CASCADE;
+NOTICE:  drop cascades to 2 other objects
+DETAIL:  drop cascades to table c1
+drop cascades to table c2
+-- ===================================================================
 -- test tuple routing for foreign-table partitions
 -- ===================================================================
 -- Test insert tuple routing
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index e1b955f3f0..7b9dc027a0 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -1846,6 +1846,33 @@ drop table loct1;
 drop table loct2;
 drop table parent;
 
+-- ===================================================================
+-- test update foreign partiton table
+-- ===================================================================
+CREATE TABLE p1 (a int, b int);
+CREATE TABLE c1 (LIKE p1) INHERITS (p1);
+CREATE TABLE c2 (LIKE p1) INHERITS (p1);
+CREATE FOREIGN TABLE fp1 (a int, b int)
+ SERVER loopback OPTIONS (table_name 'p1');
+INSERT INTO c1 VALUES (0, 1);
+INSERT INTO c2 VALUES (1, 1);
+SELECT tableoid, ctid, * FROM fp1;
+-- random() causes non-direct foreign update
+EXPLAIN VERBOSE UPDATE fp1 SET b = b + 1 WHERE a = 0 and random() <= 1;
+UPDATE fp1 SET b = b + 1 WHERE a = 0 and random() <= 1;
+SELECT tableoid, ctid, * FROM fp1; -- Only one tuple should be updated
+-- Reset ctid
+TRUNCATE c1;
+TRUNCATE c2;
+INSERT INTO c1 VALUES (0, 1);
+INSERT INTO c2 VALUES (1, 1);
+DELETE FROM fp1 WHERE a = 1 and random() <= 1;
+SELECT tableoid, ctid, * FROM fp1; -- Only one tuple should be deleted
+
+-- cleanup
+DROP FOREIGN TABLE fp1;
+DROP TABLE p1 CASCADE;
+
 -- ===================================================================
 -- test tuple routing for foreign-table partitions
 -- ===================================================================
-- 
2.16.3
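
The regression test above depends on two child tables each holding a row at ctid (0,1). The sketch below is a toy model of that collision, not PostgreSQL code: `RemoteTuple`, `match_by_ctid`, and `match_by_tableoid_and_ctid` are hypothetical names introduced for illustration, and the OIDs in the usage note are made up.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the identity of a row stored on the remote side. */
typedef struct RemoteTuple
{
    unsigned int tableoid;  /* OID of the child table that stores the row */
    unsigned int block;     /* ctid block number */
    unsigned int offset;    /* ctid offset within the block */
} RemoteTuple;

/* Count the rows a qualification like "WHERE ctid = $1" would match. */
size_t
match_by_ctid(const RemoteTuple *rows, size_t nrows,
              unsigned int block, unsigned int offset)
{
    size_t n = 0;

    for (size_t i = 0; i < nrows; i++)
        if (rows[i].block == block && rows[i].offset == offset)
            n++;
    return n;
}

/* Count the rows "WHERE tableoid = $1 AND ctid = $2" would match. */
size_t
match_by_tableoid_and_ctid(const RemoteTuple *rows, size_t nrows,
                           unsigned int tableoid,
                           unsigned int block, unsigned int offset)
{
    size_t n = 0;

    for (size_t i = 0; i < nrows; i++)
        if (rows[i].tableoid == tableoid &&
            rows[i].block == block && rows[i].offset == offset)
            n++;
    return n;
}
```

With one row per child, both at (0,1), the ctid-only lookup matches both rows, while the lookup that also compares tableoid matches exactly one.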

From b7ef61b5fe14392fc2288ebe553d368fe83923d5 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Wed, 8 Aug 2018 12:15:04 +0900
Subject: [PATCH 1/2] Fix foreign update on remote partitioned tables

postgres_fdw's non-direct foreign update used only ctid to identify
the remote tuple in the second update step. When the remote table is
partitioned or has inheritance children, this can update or delete
the wrong rows on the remote side. This patch lets foreign scans use
the remote table OID along with ctid as the remote tuple identifier.
---
 contrib/file_fdw/file_fdw.c            |   2 +-
 contrib/postgres_fdw/deparse.c         | 135 +++++++++++++++------------
 contrib/postgres_fdw/postgres_fdw.c    | 161 +++++++++++++++++++++++++--------
 src/backend/executor/execExprInterp.c  |  41 +++++++--
 src/backend/executor/nodeForeignscan.c |  44 ++++++++-
 src/backend/foreign/foreign.c          |  13 ++-
 src/backend/optimizer/plan/setrefs.c   |   2 +-
 src/backend/optimizer/util/plancat.c   |  41 ++++++++-
 src/backend/utils/adt/ruleutils.c      |   6 +-
 src/include/foreign/foreign.h          |   3 +-
 10 files changed, 330 insertions(+), 118 deletions(-)
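
The parameter numbering this patch gives the remote UPDATE statement ($1 for tableoid, $2 for ctid, SET values from $3 on) can be sketched in isolation. This is a simplified illustration, not the patch's deparseUpdateSql(): `build_update_sql` is a hypothetical helper, and it assumes the relation and column names need no quoting.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write an UPDATE statement into buf whose SET
 * parameters start at $3, because $1 and $2 are reserved for the
 * tableoid and ctid used in the WHERE clause.
 * Returns the statement length, or -1 if buf is too small.
 */
int
build_update_sql(char *buf, size_t buflen, const char *relname,
                 const char *const *cols, int ncols)
{
    int     pindex = 3;     /* tableoid and ctid always precede */
    size_t  used;
    int     n;

    n = snprintf(buf, buflen, "UPDATE %s SET ", relname);
    if (n < 0 || (size_t) n >= buflen)
        return -1;
    used = (size_t) n;

    for (int i = 0; i < ncols; i++)
    {
        n = snprintf(buf + used, buflen - used, "%s%s = $%d",
                     i > 0 ? ", " : "", cols[i], pindex++);
        if (n < 0 || (size_t) n >= buflen - used)
            return -1;
        used += (size_t) n;
    }

    n = snprintf(buf + used, buflen - used,
                 " WHERE tableoid = $1 AND ctid = $2");
    if (n < 0 || (size_t) n >= buflen - used)
        return -1;
    used += (size_t) n;

    return (int) used;
}
```

Called with relation "public.p1" and the single column "b", this produces the same text as the Remote SQL line in the EXPLAIN output of the regression test.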

diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c
index 2cf09aecf6..4c03700191 100644
--- a/contrib/file_fdw/file_fdw.c
+++ b/contrib/file_fdw/file_fdw.c
@@ -453,7 +453,7 @@ get_file_fdw_attribute_options(Oid relid)
         if (attr->attisdropped)
             continue;
 
-        options = GetForeignColumnOptions(relid, attnum);
+        options = GetForeignColumnOptions(relid, attnum, false);
         foreach(lc, options)
         {
             DefElem    *def = (DefElem *) lfirst(lc);
diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 6001f4d25e..9e5b0e3cc0 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -1088,6 +1088,42 @@ deparseFromExpr(List *quals, deparse_expr_cxt *context)
     }
 }
 
+/*
+ * Add one element to the target/returning list if it is in attrs_used.
+ *
+ * If deparsestr is given, just use it. Otherwise resolve the name using rte.
+ */
+static inline void
+deparseAddTargetListItem(StringInfo buf,
+                         List **retrieved_attrs, Bitmapset *attrs_used,
+                         Index rtindex, AttrNumber attnum,
+                         char *deparsestr, RangeTblEntry *rte,
+                         bool is_returning, bool qualify_col,
+                         bool have_wholerow, bool *first)
+{
+    if (!have_wholerow &&
+        !bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber, attrs_used))
+        return;
+
+    if (!*first)
+        appendStringInfoString(buf, ", ");
+    else if (is_returning)
+        appendStringInfoString(buf, " RETURNING ");
+    *first = false;
+
+    if (deparsestr)
+    {
+        if (qualify_col)
+            ADD_REL_QUALIFIER(buf, rtindex);
+
+        appendStringInfoString(buf, deparsestr);
+    }
+    else
+        deparseColumnRef(buf, rtindex, attnum, rte, qualify_col);
+    
+    *retrieved_attrs = lappend_int(*retrieved_attrs, attnum);
+}
+
 /*
  * Emit a target list that retrieves the columns specified in attrs_used.
  * This is used for both SELECT and RETURNING targetlists; the is_returning
@@ -1128,58 +1164,27 @@ deparseTargetList(StringInfo buf,
         if (attr->attisdropped)
             continue;
 
-        if (have_wholerow ||
-            bms_is_member(i - FirstLowInvalidHeapAttributeNumber,
-                          attrs_used))
-        {
-            if (!first)
-                appendStringInfoString(buf, ", ");
-            else if (is_returning)
-                appendStringInfoString(buf, " RETURNING ");
-            first = false;
-
-            deparseColumnRef(buf, rtindex, i, rte, qualify_col);
-
-            *retrieved_attrs = lappend_int(*retrieved_attrs, i);
-        }
+        deparseAddTargetListItem(buf, retrieved_attrs, attrs_used,
+                                 rtindex, i, NULL, rte,
+                                 is_returning, qualify_col, have_wholerow,
+                                 &first);
     }
 
     /*
-     * Add ctid and oid if needed.  We currently don't support retrieving any
-     * other system columns.
+     * Add ctid, oid and tableoid if needed. The attribute name and number are
+     * assigned in postgresAddForeignUpdateTargets.
      */
-    if (bms_is_member(SelfItemPointerAttributeNumber - FirstLowInvalidHeapAttributeNumber,
-                      attrs_used))
-    {
-        if (!first)
-            appendStringInfoString(buf, ", ");
-        else if (is_returning)
-            appendStringInfoString(buf, " RETURNING ");
-        first = false;
-
-        if (qualify_col)
-            ADD_REL_QUALIFIER(buf, rtindex);
-        appendStringInfoString(buf, "ctid");
-
-        *retrieved_attrs = lappend_int(*retrieved_attrs,
-                                       SelfItemPointerAttributeNumber);
-    }
-    if (bms_is_member(ObjectIdAttributeNumber - FirstLowInvalidHeapAttributeNumber,
-                      attrs_used))
-    {
-        if (!first)
-            appendStringInfoString(buf, ", ");
-        else if (is_returning)
-            appendStringInfoString(buf, " RETURNING ");
-        first = false;
-
-        if (qualify_col)
-            ADD_REL_QUALIFIER(buf, rtindex);
-        appendStringInfoString(buf, "oid");
-
-        *retrieved_attrs = lappend_int(*retrieved_attrs,
-                                       ObjectIdAttributeNumber);
-    }
+    deparseAddTargetListItem(buf, retrieved_attrs, attrs_used,
+                             rtindex, tupdesc->natts + 1, "tableoid",
+                             NULL, is_returning, qualify_col, false, &first);
+    
+    deparseAddTargetListItem(buf, retrieved_attrs, attrs_used,
+                             rtindex, SelfItemPointerAttributeNumber, "ctid",
+                             NULL, is_returning, qualify_col, false, &first);
+    
+    deparseAddTargetListItem(buf, retrieved_attrs, attrs_used,
+                             rtindex, ObjectIdAttributeNumber, "oid",
+                             NULL, is_returning, qualify_col, false, &first);
 
     /* Don't generate bad syntax if no undropped columns */
     if (first && !is_returning)
@@ -1728,7 +1733,7 @@ deparseUpdateSql(StringInfo buf, RangeTblEntry *rte,
     deparseRelation(buf, rel);
     appendStringInfoString(buf, " SET ");
 
-    pindex = 2;                    /* ctid is always the first param */
+    pindex = 3;                    /* tableoid and ctid always precede */
     first = true;
     foreach(lc, targetAttrs)
     {
@@ -1742,7 +1747,7 @@ deparseUpdateSql(StringInfo buf, RangeTblEntry *rte,
         appendStringInfo(buf, " = $%d", pindex);
         pindex++;
     }
-    appendStringInfoString(buf, " WHERE ctid = $1");
+    appendStringInfoString(buf, " WHERE tableoid = $1 AND ctid = $2");
 
     deparseReturningList(buf, rte, rtindex, rel,
                          rel->trigdesc && rel->trigdesc->trig_update_after_row,
@@ -1858,7 +1863,7 @@ deparseDeleteSql(StringInfo buf, RangeTblEntry *rte,
 {
     appendStringInfoString(buf, "DELETE FROM ");
     deparseRelation(buf, rel);
-    appendStringInfoString(buf, " WHERE ctid = $1");
+    appendStringInfoString(buf, " WHERE tableoid = $1 AND ctid = $2");
 
     deparseReturningList(buf, rte, rtindex, rel,
                          rel->trigdesc && rel->trigdesc->trig_delete_after_row,
@@ -2033,7 +2038,7 @@ deparseAnalyzeSql(StringInfo buf, Relation rel, List **retrieved_attrs)
 
         /* Use attribute name or column_name option. */
         colname = NameStr(TupleDescAttr(tupdesc, i)->attname);
-        options = GetForeignColumnOptions(relid, i + 1);
+        options = GetForeignColumnOptions(relid, i + 1, false);
 
         foreach(lc, options)
         {
@@ -2160,7 +2165,7 @@ deparseColumnRef(StringInfo buf, int varno, int varattno, RangeTblEntry *rte,
     }
     else
     {
-        char       *colname = NULL;
+        const char *colname = NULL;
         List       *options;
         ListCell   *lc;
 
@@ -2171,7 +2176,7 @@ deparseColumnRef(StringInfo buf, int varno, int varattno, RangeTblEntry *rte,
          * If it's a column of a foreign table, and it has the column_name FDW
          * option, use that value.
          */
-        options = GetForeignColumnOptions(rte->relid, varattno);
+        options = GetForeignColumnOptions(rte->relid, varattno, true);
         foreach(lc, options)
         {
             DefElem    *def = (DefElem *) lfirst(lc);
@@ -2188,11 +2193,29 @@ deparseColumnRef(StringInfo buf, int varno, int varattno, RangeTblEntry *rte,
          * FDW option, use attribute name.
          */
         if (colname == NULL)
-            colname = get_attname(rte->relid, varattno, false);
+            colname = get_attname(rte->relid, varattno, true);
+
+        if (colname == NULL)
+        {
+            /*
+             * This may be the additional junk column; make sure it is.
+             * We must already hold the required lock on the relation.
+             */
+            Relation rel = heap_open(rte->relid, NoLock);
+            int natts = RelationGetNumberOfAttributes(rel);
+            heap_close(rel, NoLock);
+
+            /* XXX: shouldn't we use the same message as get_attname? */
+            if (varattno != natts + 1)
+                elog(ERROR, "name resolution failed for attribute %d of relation %u",
+                     varattno, rte->relid);
+                
+            colname = "tableoid";
+        }
 
         if (qualify_col)
             ADD_REL_QUALIFIER(buf, varno);
-
+        
         appendStringInfoString(buf, quote_identifier(colname));
     }
 }
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 0803c30a48..162fbeed48 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -179,6 +179,7 @@ typedef struct PgFdwModifyState
 
     /* info about parameters for prepared statement */
     AttrNumber    ctidAttno;        /* attnum of input resjunk ctid column */
+    AttrNumber    toidAttno;        /* attnum of input resjunk tableoid column */
     int            p_nums;            /* number of parameters to transmit */
     FmgrInfo   *p_flinfo;        /* output conversion functions for them */
 
@@ -392,6 +393,7 @@ static PgFdwModifyState *create_foreign_modify(EState *estate,
                       List *retrieved_attrs);
 static void prepare_foreign_modify(PgFdwModifyState *fmstate);
 static const char **convert_prep_stmt_params(PgFdwModifyState *fmstate,
+                         Oid tableoid,
                          ItemPointer tupleid,
                          TupleTableSlot *slot);
 static void store_returning_result(PgFdwModifyState *fmstate,
@@ -1140,10 +1142,13 @@ postgresGetForeignPlan(PlannerInfo *root,
     List       *fdw_recheck_quals = NIL;
     List       *retrieved_attrs;
     StringInfoData sql;
-    ListCell   *lc;
 
     if (IS_SIMPLE_REL(foreignrel))
     {
+        Relation frel;
+        int         base_nattrs;
+        ListCell *lc;
+
         /*
          * For base relations, set scan_relid as the relid of the relation.
          */
@@ -1191,6 +1196,29 @@ postgresGetForeignPlan(PlannerInfo *root,
          * should recheck all the remote quals.
          */
         fdw_recheck_quals = remote_exprs;
+
+        /*
+         * We may have put a tableoid junk column into the targetlist. Add the
+         * junk column to fdw_scan_tlist so that core can take care of it.  We
+         * should have only one junk column, but we don't assume that here.
+         */
+        frel = heap_open(foreigntableid, NoLock);
+        base_nattrs = RelationGetNumberOfAttributes(frel);
+        heap_close(frel, NoLock);
+        
+        foreach (lc, root->parse->targetList)
+        {
+            TargetEntry *tle = lfirst_node(TargetEntry, lc);
+            Var *var = (Var *) tle->expr;
+
+            /*
+             * We need only additional non-system junk vars for the scanned
+             * relation here
+             */
+            if (tle->resjunk && IsA(var, Var) &&
+                base_nattrs < var->varattno && var->varno == scan_relid)
+                fdw_scan_tlist = lappend(fdw_scan_tlist, tle);
+        }
     }
     else
     {
@@ -1383,16 +1411,12 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
      * into local representation and error reporting during that process.
      */
     if (fsplan->scan.scanrelid > 0)
-    {
         fsstate->rel = node->ss.ss_currentRelation;
-        fsstate->tupdesc = RelationGetDescr(fsstate->rel);
-    }
     else
-    {
         fsstate->rel = NULL;
-        fsstate->tupdesc = node->ss.ss_ScanTupleSlot->tts_tupleDescriptor;
-    }
 
+    /* We use the tuple descriptor provided by core */
+    fsstate->tupdesc = node->ss.ss_ScanTupleSlot->tts_tupleDescriptor;
     fsstate->attinmeta = TupleDescGetAttInMetadata(fsstate->tupdesc);
 
     /*
@@ -1541,14 +1565,41 @@ postgresAddForeignUpdateTargets(Query *parsetree,
                                 Relation target_relation)
 {
     Var           *var;
-    const char *attrname;
     TargetEntry *tle;
 
     /*
-     * In postgres_fdw, what we need is the ctid, same as for a regular table.
+     * In postgres_fdw, what we need is the tableoid and ctid, same as for a
+     * regular table.
      */
 
-    /* Make a Var representing the desired value */
+    /*
+     * The table OID needs to be retrieved as a non-system junk column in the
+     * returned tuple. We add it as a column after all regular columns.
+     */
+    var = makeVar(parsetree->resultRelation,
+                  RelationGetNumberOfAttributes(target_relation) + 1,
+                  OIDOID,
+                  -1,
+                  InvalidOid,
+                  0);
+
+    /*
+     * Wrap it in a resjunk TLE with a name the FDW can look up later. We
+     * could use an arbitrary resname, since it is not used in the remote
+     * query and this column is not used to join with other relations, so
+     * just use an understandable name. It doesn't seem that we explicitly
+     * free this TLE, but pass a pstrdup'ed string here just in case.
+     */
+    tle = makeTargetEntry((Expr *) var,
+                          list_length(parsetree->targetList) + 1,
+                          pstrdup("tableoid"),
+                          true);
+
+    /* ... and add it to the query's targetlist */
+    parsetree->targetList = lappend(parsetree->targetList, tle);
+
+
+    /* Do the same for ctid */
     var = makeVar(parsetree->resultRelation,
                   SelfItemPointerAttributeNumber,
                   TIDOID,
@@ -1556,15 +1607,11 @@ postgresAddForeignUpdateTargets(Query *parsetree,
                   InvalidOid,
                   0);
 
-    /* Wrap it in a resjunk TLE with the right name ... */
-    attrname = "ctid";
-
     tle = makeTargetEntry((Expr *) var,
                           list_length(parsetree->targetList) + 1,
-                          pstrdup(attrname),
+                          pstrdup("ctid"),
                           true);
 
-    /* ... and add it to the query's targetlist */
     parsetree->targetList = lappend(parsetree->targetList, tle);
 }
 
@@ -1769,7 +1816,7 @@ postgresExecForeignInsert(EState *estate,
         prepare_foreign_modify(fmstate);
 
     /* Convert parameters needed by prepared statement to text form */
-    p_values = convert_prep_stmt_params(fmstate, NULL, slot);
+    p_values = convert_prep_stmt_params(fmstate, InvalidOid, NULL, slot);
 
     /*
      * Execute the prepared statement.
@@ -1824,7 +1871,7 @@ postgresExecForeignUpdate(EState *estate,
                           TupleTableSlot *planSlot)
 {
     PgFdwModifyState *fmstate = (PgFdwModifyState *) resultRelInfo->ri_FdwState;
-    Datum        datum;
+    Datum        toiddatum, ctiddatum;
     bool        isNull;
     const char **p_values;
     PGresult   *res;
@@ -1835,17 +1882,26 @@ postgresExecForeignUpdate(EState *estate,
         prepare_foreign_modify(fmstate);
 
     /* Get the ctid that was passed up as a resjunk column */
-    datum = ExecGetJunkAttribute(planSlot,
-                                 fmstate->ctidAttno,
-                                 &isNull);
+    toiddatum = ExecGetJunkAttribute(planSlot,
+                                     fmstate->toidAttno,
+                                     &isNull);
+    /* shouldn't ever get a null result... */
+    if (isNull)
+        elog(ERROR, "tableoid is NULL");
+
+    /* Get the ctid that was passed up as a resjunk column */
+    ctiddatum = ExecGetJunkAttribute(planSlot,
+                                     fmstate->ctidAttno,
+                                     &isNull);
     /* shouldn't ever get a null result... */
     if (isNull)
         elog(ERROR, "ctid is NULL");
 
     /* Convert parameters needed by prepared statement to text form */
     p_values = convert_prep_stmt_params(fmstate,
-                                        (ItemPointer) DatumGetPointer(datum),
-                                        slot);
+                                    DatumGetObjectId(toiddatum),
+                                    (ItemPointer) DatumGetPointer(ctiddatum),
+                                    slot);
 
     /*
      * Execute the prepared statement.
@@ -1900,7 +1956,7 @@ postgresExecForeignDelete(EState *estate,
                           TupleTableSlot *planSlot)
 {
     PgFdwModifyState *fmstate = (PgFdwModifyState *) resultRelInfo->ri_FdwState;
-    Datum        datum;
+    Datum        ctiddatum, toiddatum;
     bool        isNull;
     const char **p_values;
     PGresult   *res;
@@ -1911,17 +1967,26 @@ postgresExecForeignDelete(EState *estate,
         prepare_foreign_modify(fmstate);
 
     /* Get the ctid that was passed up as a resjunk column */
-    datum = ExecGetJunkAttribute(planSlot,
-                                 fmstate->ctidAttno,
-                                 &isNull);
+    toiddatum = ExecGetJunkAttribute(planSlot,
+                                     fmstate->toidAttno,
+                                     &isNull);
+    /* shouldn't ever get a null result... */
+    if (isNull)
+        elog(ERROR, "tableoid is NULL");
+
+    /* Get the ctid that was passed up as a resjunk column */
+    ctiddatum = ExecGetJunkAttribute(planSlot,
+                                     fmstate->ctidAttno,
+                                     &isNull);
     /* shouldn't ever get a null result... */
     if (isNull)
         elog(ERROR, "ctid is NULL");
 
     /* Convert parameters needed by prepared statement to text form */
     p_values = convert_prep_stmt_params(fmstate,
-                                        (ItemPointer) DatumGetPointer(datum),
-                                        NULL);
+                                    DatumGetObjectId(toiddatum),
+                                    (ItemPointer) DatumGetPointer(ctiddatum),
+                                    NULL);
 
     /*
      * Execute the prepared statement.
@@ -2458,7 +2523,6 @@ postgresBeginDirectModify(ForeignScanState *node, int eflags)
             tupdesc = RelationGetDescr(dmstate->rel);
 
         dmstate->attinmeta = TupleDescGetAttInMetadata(tupdesc);
-
         /*
          * When performing an UPDATE/DELETE .. RETURNING on a join directly,
          * initialize a filter to extract an updated/deleted tuple from a scan
@@ -3345,7 +3409,7 @@ create_foreign_modify(EState *estate,
         fmstate->attinmeta = TupleDescGetAttInMetadata(tupdesc);
 
     /* Prepare for output conversion of parameters used in prepared stmt. */
-    n_params = list_length(fmstate->target_attrs) + 1;
+    n_params = list_length(fmstate->target_attrs) + 2;
     fmstate->p_flinfo = (FmgrInfo *) palloc0(sizeof(FmgrInfo) * n_params);
     fmstate->p_nums = 0;
 
@@ -3353,13 +3417,24 @@ create_foreign_modify(EState *estate,
     {
         Assert(subplan != NULL);
 
+        /* Find the remote tableoid resjunk column in the subplan's result */
+        fmstate->toidAttno = ExecFindJunkAttributeInTlist(subplan->targetlist,
+                                                          "tableoid");
+        if (!AttributeNumberIsValid(fmstate->toidAttno))
+            elog(ERROR, "could not find junk tableoid column");
+
+        /* First transmittable parameter will be table oid */
+        getTypeOutputInfo(OIDOID, &typefnoid, &isvarlena);
+        fmgr_info(typefnoid, &fmstate->p_flinfo[fmstate->p_nums]);
+        fmstate->p_nums++;
+
         /* Find the ctid resjunk column in the subplan's result */
         fmstate->ctidAttno = ExecFindJunkAttributeInTlist(subplan->targetlist,
                                                           "ctid");
         if (!AttributeNumberIsValid(fmstate->ctidAttno))
             elog(ERROR, "could not find junk ctid column");
 
-        /* First transmittable parameter will be ctid */
+        /* Second transmittable parameter will be ctid */
         getTypeOutputInfo(TIDOID, &typefnoid, &isvarlena);
         fmgr_info(typefnoid, &fmstate->p_flinfo[fmstate->p_nums]);
         fmstate->p_nums++;
@@ -3442,6 +3517,7 @@ prepare_foreign_modify(PgFdwModifyState *fmstate)
  */
 static const char **
 convert_prep_stmt_params(PgFdwModifyState *fmstate,
+                         Oid tableoid,
                          ItemPointer tupleid,
                          TupleTableSlot *slot)
 {
@@ -3453,10 +3529,15 @@ convert_prep_stmt_params(PgFdwModifyState *fmstate,
 
     p_values = (const char **) palloc(sizeof(char *) * fmstate->p_nums);
 
-    /* 1st parameter should be ctid, if it's in use */
-    if (tupleid != NULL)
+    /* First two parameters should be tableoid and ctid, if they're in use */
+    if (tableoid != InvalidOid)
     {
+        Assert (tupleid != NULL);
+
         /* don't need set_transmission_modes for TID output */
+        p_values[pindex] = OutputFunctionCall(&fmstate->p_flinfo[pindex],
+                                              ObjectIdGetDatum(tableoid));
+        pindex++;
         p_values[pindex] = OutputFunctionCall(&fmstate->p_flinfo[pindex],
                                               PointerGetDatum(tupleid));
         pindex++;
@@ -3685,8 +3766,8 @@ rebuild_fdw_scan_tlist(ForeignScan *fscan, List *tlist)
         new_tlist = lappend(new_tlist,
                             makeTargetEntry(tle->expr,
                                             list_length(new_tlist) + 1,
-                                            NULL,
-                                            false));
+                                            tle->resname,
+                                            tle->resjunk));
     }
     fscan->fdw_scan_tlist = new_tlist;
 }
@@ -5576,12 +5657,12 @@ make_tuple_from_result_row(PGresult *res,
      */
     oldcontext = MemoryContextSwitchTo(temp_context);
 
-    if (rel)
-        tupdesc = RelationGetDescr(rel);
+    if (fsstate)
+        tupdesc = fsstate->ss.ss_ScanTupleSlot->tts_tupleDescriptor;
     else
     {
-        Assert(fsstate);
-        tupdesc = fsstate->ss.ss_ScanTupleSlot->tts_tupleDescriptor;
+        Assert(rel);
+        tupdesc = RelationGetDescr(rel);
     }
 
     values = (Datum *) palloc0(tupdesc->natts * sizeof(Datum));
@@ -5623,7 +5704,7 @@ make_tuple_from_result_row(PGresult *res,
         errpos.cur_attno = i;
         if (i > 0)
         {
-            /* ordinary column */
+            /* ordinary column and tableoid */
             Assert(i <= tupdesc->natts);
             nulls[i - 1] = (valstr == NULL);
             /* Apply the input function even to nulls, to support domains */
diff --git a/src/backend/executor/execExprInterp.c b/src/backend/executor/execExprInterp.c
index 9d6e25aae5..c4d75c611b 100644
--- a/src/backend/executor/execExprInterp.c
+++ b/src/backend/executor/execExprInterp.c
@@ -3883,14 +3883,39 @@ ExecEvalWholeRowVar(ExprState *state, ExprEvalStep *op, ExprContext *econtext)
             slot_tupdesc = slot->tts_tupleDescriptor;
 
             if (var_tupdesc->natts != slot_tupdesc->natts)
-                ereport(ERROR,
-                        (errcode(ERRCODE_DATATYPE_MISMATCH),
-                         errmsg("table row type and query-specified row type do not match"),
-                         errdetail_plural("Table row contains %d attribute, but query expects %d.",
-                                          "Table row contains %d attributes, but query expects %d.",
-                                          slot_tupdesc->natts,
-                                          slot_tupdesc->natts,
-                                          var_tupdesc->natts)));
+            {
+                bool sane = false;
+
+                /*
+                 * A foreign scan may have added junk columns at the end of
+                 * the tuple. We don't treat that as an inconsistency and just
+                 * ignore them here.
+                 */
+                if (var_tupdesc->natts < slot_tupdesc->natts)
+                {
+                    int i;
+
+                    sane = true;
+                    for (i = var_tupdesc->natts; i < slot_tupdesc->natts ; i++)
+                    {
+                        if (slot_tupdesc->attrs[i].attrelid != 0)
+                        {
+                            sane = false;
+                            break;
+                        }
+                    }
+                }
+
+                if (!sane)
+                    ereport(ERROR,
+                            (errcode(ERRCODE_DATATYPE_MISMATCH),
+                             errmsg("table row type and query-specified row type do not match"),
+                             errdetail_plural("Table row contains %d attribute, but query expects %d.",
+                                              "Table row contains %d attributes, but query expects %d.",
+                                              slot_tupdesc->natts,
+                                              slot_tupdesc->natts,
+                                              var_tupdesc->natts)));
+            }
 
             for (i = 0; i < var_tupdesc->natts; i++)
             {
diff --git a/src/backend/executor/nodeForeignscan.c b/src/backend/executor/nodeForeignscan.c
index a2a28b7ec2..3eaa23194e 100644
--- a/src/backend/executor/nodeForeignscan.c
+++ b/src/backend/executor/nodeForeignscan.c
@@ -172,10 +172,13 @@ ExecInitForeignScan(ForeignScan *node, EState *estate, int eflags)
     }
 
     /*
-     * Determine the scan tuple type.  If the FDW provided a targetlist
-     * describing the scan tuples, use that; else use base relation's rowtype.
+     * Determine the scan tuple type.  If currentRelation is NULL, use the
+     * targetlist provided by the FDW; else use base relation's rowtype.  The
+     * FDW may have provided fdw_scan_tlist for a relation scan; it must
+     * consist only of junk columns, and we extend the tuple descriptor for
+     * the base relation with them.
      */
-    if (node->fdw_scan_tlist != NIL || currentRelation == NULL)
+    if (currentRelation == NULL)
     {
         TupleDesc    scan_tupdesc;
 
@@ -190,6 +193,41 @@ ExecInitForeignScan(ForeignScan *node, EState *estate, int eflags)
 
         /* don't trust FDWs to return tuples fulfilling NOT NULL constraints */
         scan_tupdesc = CreateTupleDescCopy(RelationGetDescr(currentRelation));
+
+        /*
+         * If we have fdw_scan_tlist here, it should consist only of junk
+         * columns.  Extend the tuple descriptor with them so that the FDW can
+         * handle the columns.
+         */
+        if (node->fdw_scan_tlist != NIL)
+        {
+            ListCell *lc;
+            AttrNumber oldnattrs PG_USED_FOR_ASSERTS_ONLY = scan_tupdesc->natts;
+            AttrNumber newnattrs =
+                scan_tupdesc->natts + list_length(node->fdw_scan_tlist);
+
+            scan_tupdesc = (TupleDesc)
+                repalloc(scan_tupdesc,
+                         offsetof(struct tupleDesc, attrs) +
+                         newnattrs * sizeof(FormData_pg_attribute));
+            scan_tupdesc->natts = newnattrs;
+                
+            foreach (lc, node->fdw_scan_tlist)
+            {
+                TargetEntry *tle = lfirst_node(TargetEntry, lc);
+                Var *var = (Var *) tle->expr;
+
+                Assert(IsA(tle->expr, Var) &&
+                           tle->resjunk && var->varattno > oldnattrs);
+                TupleDescInitEntry(scan_tupdesc,
+                                   var->varattno,
+                                   tle->resname,
+                                   var->vartype,
+                                   var->vartypmod,
+                                   0);
+            }                
+        }
+
         ExecInitScanTupleSlot(estate, &scanstate->ss, scan_tupdesc);
         /* Node's targetlist will contain Vars with varno = scanrelid */
         tlistvarno = scanrelid;
diff --git a/src/backend/foreign/foreign.c b/src/backend/foreign/foreign.c
index eac78a5d31..f5c7f7af73 100644
--- a/src/backend/foreign/foreign.c
+++ b/src/backend/foreign/foreign.c
@@ -249,9 +249,12 @@ GetForeignTable(Oid relid)
 /*
  * GetForeignColumnOptions - Get attfdwoptions of given relation/attnum
  * as list of DefElem.
+ *
+ * If no such attribute exists and missing_ok is true, NIL is returned;
+ * otherwise a not-intended-for-user-consumption error is thrown.
  */
 List *
-GetForeignColumnOptions(Oid relid, AttrNumber attnum)
+GetForeignColumnOptions(Oid relid, AttrNumber attnum, bool missing_ok)
 {
     List       *options;
     HeapTuple    tp;
@@ -262,8 +265,12 @@ GetForeignColumnOptions(Oid relid, AttrNumber attnum)
                          ObjectIdGetDatum(relid),
                          Int16GetDatum(attnum));
     if (!HeapTupleIsValid(tp))
-        elog(ERROR, "cache lookup failed for attribute %d of relation %u",
-             attnum, relid);
+    {
+        if (!missing_ok)
+            elog(ERROR, "cache lookup failed for attribute %d of relation %u",
+                 attnum, relid);
+        return NIL;
+    }
     datum = SysCacheGetAttr(ATTNUM,
                             tp,
                             Anum_pg_attribute_attfdwoptions,
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 69dd327f0c..3a0b67508a 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -1194,7 +1194,7 @@ set_foreignscan_references(PlannerInfo *root,
     if (fscan->scan.scanrelid > 0)
         fscan->scan.scanrelid += rtoffset;
 
-    if (fscan->fdw_scan_tlist != NIL || fscan->scan.scanrelid == 0)
+    if (fscan->scan.scanrelid == 0)
     {
         /*
          * Adjust tlist, qual, fdw_exprs, fdw_recheck_quals to reference
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 8369e3ad62..cfcb912bbb 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -33,6 +33,7 @@
 #include "foreign/fdwapi.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
+#include "nodes/nodeFuncs.h"
 #include "optimizer/clauses.h"
 #include "optimizer/cost.h"
 #include "optimizer/plancat.h"
@@ -58,7 +59,8 @@ int            constraint_exclusion = CONSTRAINT_EXCLUSION_PARTITION;
 /* Hook for plugins to get control in get_relation_info() */
 get_relation_info_hook_type get_relation_info_hook = NULL;
 
-
+static AttrNumber tlist_max_attrnum(List *tlist, Index varno,
+                                    AttrNumber relattrnum);
 static void get_relation_foreign_keys(PlannerInfo *root, RelOptInfo *rel,
                           Relation relation, bool inhparent);
 static bool infer_collation_opclass_match(InferenceElem *elem, Relation idxRel,
@@ -76,6 +78,33 @@ static PartitionScheme find_partition_scheme(PlannerInfo *root, Relation rel);
 static void set_baserel_partition_key_exprs(Relation relation,
                                 RelOptInfo *rel);
 
+/*
+ * tlist_max_attrnum
+ *   Find the largest varattno in the targetlist
+ *
+ * FDWs may add junk columns for internal usage. This function finds the
+ * maximum attribute number including such columns. Such additional columns
+ * are always Var so we don't go deeper.
+ */
+
+static AttrNumber
+tlist_max_attrnum(List *tlist, Index varno, AttrNumber relattrnum)
+{
+    AttrNumber    maxattrnum = relattrnum;
+    ListCell   *lc;
+
+    foreach (lc, tlist)
+    {
+        TargetEntry *tle = lfirst_node(TargetEntry, lc);
+        Var            *var = (Var *) tle->expr;
+
+        if (IsA(var, Var) && var->varno == varno && maxattrnum < var->varattno)
+            maxattrnum = var->varattno;
+    }
+
+    return maxattrnum;
+}
+
 /*
  * get_relation_info -
  *      Retrieves catalog information for a given relation.
@@ -112,6 +141,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
     Relation    relation;
     bool        hasindex;
     List       *indexinfos = NIL;
+    AttrNumber  max_attrnum;
 
     /*
      * We need not lock the relation since it was already locked, either by
@@ -126,8 +156,15 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                  errmsg("cannot access temporary or unlogged relations during recovery")));
 
+    max_attrnum = RelationGetNumberOfAttributes(relation);
+
+    /* Foreign table may have expanded this relation with junk columns */
+    if (root->simple_rte_array[varno]->relkind == RELKIND_FOREIGN_TABLE)
+        max_attrnum = tlist_max_attrnum(root->parse->targetList,
+                                        varno, max_attrnum);
+
     rel->min_attr = FirstLowInvalidHeapAttributeNumber + 1;
-    rel->max_attr = RelationGetNumberOfAttributes(relation);
+    rel->max_attr = max_attrnum;
     rel->reltablespace = RelationGetForm(relation)->reltablespace;
 
     Assert(rel->max_attr >= rel->min_attr);
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 03e9a28a63..e3b3f57e66 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -6671,9 +6671,9 @@ get_variable(Var *var, int levelsup, bool istoplevel, deparse_context *context)
     {
         /* Get column name to use from the colinfo struct */
         if (attnum > colinfo->num_cols)
-            elog(ERROR, "invalid attnum %d for relation \"%s\"",
-                 attnum, rte->eref->aliasname);
-        attname = colinfo->colnames[attnum - 1];
+            attname = "<added_junk>";
+        else
+            attname = colinfo->colnames[attnum - 1];
         if (attname == NULL)    /* dropped column? */
             elog(ERROR, "invalid attnum %d for relation \"%s\"",
                  attnum, rte->eref->aliasname);
diff --git a/src/include/foreign/foreign.h b/src/include/foreign/foreign.h
index 3ca12e64d2..5b1fec2be8 100644
--- a/src/include/foreign/foreign.h
+++ b/src/include/foreign/foreign.h
@@ -77,7 +77,8 @@ extern ForeignDataWrapper *GetForeignDataWrapperByName(const char *name,
                             bool missing_ok);
 extern ForeignTable *GetForeignTable(Oid relid);
 
-extern List *GetForeignColumnOptions(Oid relid, AttrNumber attnum);
+extern List *GetForeignColumnOptions(Oid relid, AttrNumber attnum,
+                                     bool missing_ok);
 
 extern Oid    get_foreign_data_wrapper_oid(const char *fdwname, bool missing_ok);
 extern Oid    get_foreign_server_oid(const char *servername, bool missing_ok);
-- 
2.16.3


Re: Problem while updating a foreign table pointing to a partitionedtable on foreign server

From
Etsuro Fujita
Date:
(2018/08/08 17:30), Kyotaro HORIGUCHI wrote:
> Please find the attached.

Thanks for the patch, Horiguchi-san!

> At Fri, 3 Aug 2018 11:48:38 +0530, Ashutosh Bapat<ashutosh.bapat@enterprisedb.com>  wrote
in<CAFjFpRcF-j+B8W8o-wrvOguA0=r8SJ-rCrzWAnHT2V66NxGfFQ@mail.gmail.com>
>> The purpose of non-Var node is to avoid adding the attribute to
>> relation descriptor. Idea is to create a new node, which will act as a
>> place holder for table oid or row id (whatever) to be fetched from the
>> foreign server.

I think so too.

>> I don't understand why do you think we need it to be
>> added to the relation descriptor.

I don't understand that either.

> I chose to expand the tuple descriptor for junk columns added to
> foreign relations. It might be better to have a new member in
> ForeignScan, but I didn't do that so that we can backpatch it.

I've not looked at the patch closely yet, but I'm not sure that it's a 
good idea to expand the tuple descriptor of the target relation on the 
fly so that it contains the remotetableoid as a non-system attribute of 
the target table.  My concern is: is there not any risk in affecting 
some other part of the planner and/or the executor?  (I was a bit 
surprised that the patch passes the regression tests successfully.)

To avoid expanding the tuple descriptor, I'm wondering whether we could 
add a Param representing remotetableoid, not a Var undefined anywhere in 
the system catalogs, as mentioned above?

> What the patch does:
>
> - This abuses ForeignScan->fdw_scan_tlist to store the additional
>    junk columns for a simple foreign relation scan (that is, not a
>    join).

I think that this issue was introduced in 9.3, which added postgres_fdw 
in combination with support for writable foreign tables, but 
fdw_scan_tlist was added to 9.5 as part of join-pushdown infrastructure, 
so my concern is that we might not be able to backpatch your patch to 
9.3 and 9.4.

That's it for now.

Best regards,
Etsuro Fujita


Re: Problem while updating a foreign table pointing to a partitionedtable on foreign server

From
Etsuro Fujita
Date:
(2018/08/09 22:04), Etsuro Fujita wrote:
> (2018/08/08 17:30), Kyotaro HORIGUCHI wrote:
>> I chose to expand the tuple descriptor for junk columns added to
>> foreign relations. It might be better to have a new member in
>> ForeignScan, but I didn't do that so that we can backpatch it.
>
> I've not looked at the patch closely yet, but I'm not sure that it's a
> good idea to expand the tuple descriptor of the target relation on the
> fly so that it contains the remotetableoid as a non-system attribute of
> the target table. My concern is: is there not any risk in affecting some
> other part of the planner and/or the executor? (I was a bit surprised
> that the patch passes the regression tests successfully.)

I spent more time looking at the patch.  ISTM that the patch suppresses 
the effect of the tuple-descriptor expansion well by making 
changes to code in the planner and executor (and ruleutils.c), but I'm 
still not sure that the patch is the right direction to go in, because 
ISTM that expanding the tuple descriptor on the fly might be a wart.

>> What the patch does:
>>
>> - This abuses ForeignScan->fdw_scan_tlist to store the additional
>> junk columns for a simple foreign relation scan (that is, not a
>> join).
>
> I think that this issue was introduced in 9.3, which added postgres_fdw
> in combination with support for writable foreign tables, but
> fdw_scan_tlist was added to 9.5 as part of join-pushdown infrastructure,
> so my concern is that we might not be able to backpatch your patch to
> 9.3 and 9.4.

Another concern about this:

You wrote:
 >    Several places seem to assume that fdw_scan_tlist may be
 >    used for a foreign scan on a simple relation, but I didn't find
 >    that this actually happens.

Yeah, currently, postgres_fdw and file_fdw don't use that list for 
simple foreign table scans, but it could be used to improve the 
efficiency for those scans, as explained in fdwhandler.sgml:

      Another <structname>ForeignScan</structname> field that can be filled by FDWs
      is <structfield>fdw_scan_tlist</structfield>, which describes the tuples returned by
      the FDW for this plan node.  For simple foreign table scans this can be
      set to <literal>NIL</literal>, implying that the returned tuples have the
      row type declared for the foreign table.  A non-<symbol>NIL</symbol> value must be a
      target list (list of <structname>TargetEntry</structname>s) containing Vars and/or
      expressions representing the returned columns.  This might be used, for
      example, to show that the FDW has omitted some columns that it noticed
      won't be needed for the query.  Also, if the FDW can compute expressions
      used by the query more cheaply than can be done locally, it could add
      those expressions to <structfield>fdw_scan_tlist</structfield>. Note that join
      plans (created from paths made by <function>GetForeignJoinPaths</function>) must
      always supply <structfield>fdw_scan_tlist</structfield> to describe the set of
      columns they will return.

You wrote:
 > I'm not sure whether the following points are valid.
 >
 > - If fdw_scan_tlist is used for simple relation scans, this would
 >    break the case. (ExecInitForeignScan,  set_foreignscan_references)

Some FDWs might already use that list for the improved efficiency for 
simple foreign table scans as explained above, so we should avoid 
breaking that.

If we take the Param-based approach suggested by Tom, I suspect there 
would be no need to worry about at least those things, so I'll try to 
update your patch as such, if there are no objections from you (or 
anyone else).

Best regards,
Etsuro Fujita


Re: Problem while updating a foreign table pointing to apartitioned table on foreign server

From
Kyotaro HORIGUCHI
Date:
Fujita-san, thank you for the comment.

At Tue, 14 Aug 2018 20:49:02 +0900, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote in
<5B72C1AE.8010408@lab.ntt.co.jp>
> (2018/08/09 22:04), Etsuro Fujita wrote:
> > (2018/08/08 17:30), Kyotaro HORIGUCHI wrote:
> >> I chose to expand the tuple descriptor for junk columns added to
> >> foreign relations. It might be better to have a new member in
> >> ForeignScan, but I didn't do that so that we can backpatch it.
> >
> > I've not looked at the patch closely yet, but I'm not sure that it's a
> > good idea to expand the tuple descriptor of the target relation on the
> > fly so that it contains the remotetableoid as a non-system attribute
> > of
> > the target table. My concern is: is there not any risk in affecting
> > some
> > other part of the planner and/or the executor? (I was a bit surprised
> > that the patch passes the regression tests successfully.)

Yeah, me too.

> I spent more time looking at the patch.  ISTM that the patch suppresses
> the effect of the tuple-descriptor expansion well by making
> changes to code in the planner and executor (and ruleutils.c), but I'm
> still not sure that the patch is the right direction to go in, because
> ISTM that expanding the tuple descriptor on the fly might be a wart.

The non-Var node seems to me to be the same as PARAM_EXEC, which works
imperfectly for this purpose: tableoid must be in one-to-one
correspondence with a tuple but, unlike with joins, the
correspondence is stirred up by intermediate executor nodes in some
cases.

The expansion should be safe if the expanded descriptor has the
same definitions for the base columns and all the extended columns are
junk. The junk columns should be ignored by unrelated nodes, and
they are passed along safely as long as ForeignModify passes tuples as
is from the underlying ForeignScan to ForeignUpdate/Delete.
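
The safety condition above can be illustrated with a toy model. This is only a sketch and is not PostgreSQL code: ToyAttr, ToyDesc, toy_desc_add_junk() and toy_desc_is_safe_extension() are hypothetical names invented for the illustration, and the real TupleDesc carries far more state. It only checks the two properties stated above: the base columns are unchanged, and every extra column is junk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_ATTRS 16

/* Hypothetical, simplified stand-in for one attribute of a tuple descriptor. */
typedef struct ToyAttr
{
    const char  *name;
    unsigned int typid;     /* type OID, for illustration only */
    bool         junk;      /* true for columns added by the FDW */
} ToyAttr;

/* Hypothetical, simplified stand-in for a tuple descriptor. */
typedef struct ToyDesc
{
    size_t  natts;
    ToyAttr attrs[MAX_ATTRS];
} ToyDesc;

/* Append one junk column and return its 1-based attribute number. */
size_t
toy_desc_add_junk(ToyDesc *desc, const char *name, unsigned int typid)
{
    assert(desc->natts < MAX_ATTRS);
    desc->attrs[desc->natts].name = name;
    desc->attrs[desc->natts].typid = typid;
    desc->attrs[desc->natts].junk = true;
    desc->natts++;
    return desc->natts;
}

/*
 * Return true if "ext" keeps every base column of "base" unchanged and
 * every additional column is junk.
 */
bool
toy_desc_is_safe_extension(const ToyDesc *base, const ToyDesc *ext)
{
    size_t i;

    if (ext->natts < base->natts)
        return false;

    /* Base columns must have identical definitions. */
    for (i = 0; i < base->natts; i++)
    {
        if (strcmp(base->attrs[i].name, ext->attrs[i].name) != 0 ||
            base->attrs[i].typid != ext->attrs[i].typid ||
            ext->attrs[i].junk)
            return false;
    }

    /* Everything beyond the base columns must be junk. */
    for (i = base->natts; i < ext->natts; i++)
    {
        if (!ext->attrs[i].junk)
            return false;
    }
    return true;
}
```

For example, extending a two-column descriptor (a, b) with a junk "tableoid" column passes the check, while changing the type of b does not.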

> >> What the patch does:
> >>
> >> - This abuses ForeignScan->fdw_scan_tlist to store the additional
> >> junk columns for a simple foreign relation scan (that is, not a
> >> join).
> >
> > I think that this issue was introduced in 9.3, which added
> > postgres_fdw
> > in combination with support for writable foreign tables, but
> > fdw_scan_tlist was added to 9.5 as part of join-pushdown
> > infrastructure,
> > so my concern is that we might not be able to backpatch your patch to
> > 9.3 and 9.4.

Right. So I'm thinking that the older versions should just raise an
error in the failure case instead of being made to work somehow. Or we
might be able to use the tableoid in the tuple header without emitting
the local oid to users, but I haven't found a way to do that.

> Another concern about this:
> 
> You wrote:
> >    Several places seem to assume that fdw_scan_tlist may be
> >    used for a foreign scan on a simple relation, but I didn't find
> >    that this actually happens.
> 
> Yeah, currently, postgres_fdw and file_fdw don't use that list for
> simple foreign table scans, but it could be used to improve the
> efficiency for those scans, as explained in fdwhandler.sgml:
> 
>      Another <structname>ForeignScan</structname> field that can be filled by FDWs
>      is <structfield>fdw_scan_tlist</structfield>, which describes the tuples returned by
>      the FDW for this plan node.  For simple foreign table scans this can be
>      set to <literal>NIL</literal>, implying that the returned tuples have the
>      row type declared for the foreign table.  A non-<symbol>NIL</symbol> value must be a
>      target list (list of <structname>TargetEntry</structname>s) containing Vars and/or
>      expressions representing the returned columns.  This might be used, for
>      example, to show that the FDW has omitted some columns that it noticed
>      won't be needed for the query.  Also, if the FDW can compute expressions
>      used by the query more cheaply than can be done locally, it could add
>      those expressions to <structfield>fdw_scan_tlist</structfield>. Note that join
>      plans (created from paths made by <function>GetForeignJoinPaths</function>) must
>      always supply <structfield>fdw_scan_tlist</structfield> to describe the set of
>      columns they will return.

https://www.postgresql.org/docs/devel/static/fdw-planning.html

Hmm. Thanks for the pointer; it seems to need a rewrite. However,
it doesn't seem to work for non-join foreign scans, since the
core ignores it and uses the local table definition. This "tweak"
wouldn't be needed if it worked.

> You wrote:
> > I'm not sure whether the following points are valid.
> >
> > - If fdw_scan_tlist is used for simple relation scans, this would
> >    break the case. (ExecInitForeignScan,  set_foreignscan_references)
> 
> Some FDWs might already use that list for the improved efficiency for
> simple foreign table scans as explained above, so we should avoid
> breaking that.

I considered using fdw_scan_tlist in that way, but the core
assumes that foreign scans with scanrelid > 0 use the relation
descriptor. Do you have any example of that?

> If we take the Param-based approach suggested by Tom, I suspect there
> would be no need to worry about at least those things, so I'll try to
> update your patch as such, if there are no objections from you (or
> anyone else).

Feel free to do that, since I couldn't find a way. I'll give more
consideration to using fdw_scan_tlist in the documented way.


PARAM_EXEC is a single-slot side channel that works only as long as
it is set and read while each tuple is being handled. In this case
postgresExecForeignUpdate/Delete must be called before
postgresIterateForeignScan returns the next tuple. An apparent
failure case for this usage is the join-update case below.

https://www.postgresql.org/message-id/20180605.191032.256535589.horiguchi.kyotaro@lab.ntt.co.jp
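
The failure mode described above can be sketched with a toy model. This is not executor code: ScanRow, param_slot, scan_next(), run_pipelined() and run_buffered() are hypothetical names made up for the illustration. The scan writes each row's remote tableoid into one shared slot; the modify step sees the right value only if it runs before the scan produces the next row. If an intermediate node buffers the rows first (as a join or sort might), only the last tableoid survives.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ROWS 8

/* Hypothetical scanned row: the remote tableoid plus one payload value. */
typedef struct ScanRow
{
    unsigned int tableoid;
    int          value;
} ScanRow;

/* The single-slot side channel (a stand-in for one PARAM_EXEC slot). */
static unsigned int param_slot;

/* Return row i's payload and, as a side effect, overwrite the slot. */
static int
scan_next(const ScanRow *rows, size_t i)
{
    param_slot = rows[i].tableoid;
    return rows[i].value;
}

/*
 * The modify step consumes each row right after the scan returns it.
 * Returns how many rows were paired with their true tableoid.
 */
size_t
run_pipelined(const ScanRow *rows, size_t nrows)
{
    size_t correct = 0;
    size_t i;

    for (i = 0; i < nrows; i++)
    {
        (void) scan_next(rows, i);
        if (param_slot == rows[i].tableoid)
            correct++;
    }
    return correct;
}

/*
 * An intermediate node buffers every row before the modify step runs,
 * so the slot holds only the tableoid of the last scanned row.
 */
size_t
run_buffered(const ScanRow *rows, size_t nrows)
{
    int    buffered[MAX_ROWS];
    size_t correct = 0;
    size_t i;

    assert(nrows <= MAX_ROWS);

    for (i = 0; i < nrows; i++)
        buffered[i] = scan_next(rows, i);

    for (i = 0; i < nrows; i++)
    {
        (void) buffered[i];     /* the row itself arrives intact */
        if (param_slot == rows[i].tableoid)
            correct++;
    }
    return correct;
}
```

With two rows coming from two different remote tables, the pipelined run pairs both rows correctly, while the buffered run pairs only the last one.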


regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Problem while updating a foreign table pointing to apartitioned table on foreign server

From
Kyotaro HORIGUCHI
Date:
Hello.

At Tue, 21 Aug 2018 11:01:32 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote in
<20180821.110132.261184472.horiguchi.kyotaro@lab.ntt.co.jp>
> > You wrote:
> > >    Several places seem to assume that fdw_scan_tlist may be
> > >    used for a foreign scan on a simple relation, but I didn't find
> > >    that this actually happens.
> > 
> > Yeah, currently, postgres_fdw and file_fdw don't use that list for
> > simple foreign table scans, but it could be used to improve the
> > efficiency for those scans, as explained in fdwhandler.sgml:
...
> I'll give more consideration to using fdw_scan_tlist in the
> documented way.

Done. postgres_fdw now generates a full fdw_scan_tlist (as
documented) for foreign relations with junk columns, with a
small change on the core side. It is far less invasive than
the previous version, and I believe that it doesn't harm any
possibly existing use of fdw_scan_tlist on non-join rels (that is,
the case of a subset of relation columns).

The previous patch showed "<added_junk>" instead of "tableoid" in
the Output list of EXPLAIN output, but this one shows it correctly by
referring to rte->eref->colnames. I believe no other FDW has
expanded a foreign relation, even if it uses fdw_scan_tlist for a
ForeignScan on a base relation, so it won't harm them.

One arguable behavior change is about whole-row vars. Currently a
whole-row var refers to the local tuple with all columns, but it is
explicitly fetched as ROW() after this patch is applied. This could be
fixed, but not just now.

Part of 0004-:
-  Output: f1, ''::text, ctid, rem1.*
-  Remote SQL: SELECT f1, f2, ctid FROM public.loc1 FOR UPDATE
+  Output: f1, ''::text, tableoid, ctid, rem1.*
+  Remote SQL: SELECT f1, tableoid, ctid, ROW(f1, f2) FROM public.loc1 FOR UPDATE


Since this uses fdw_scan_tlist, it is theoretically
back-patchable to 9.6. This patch applies on top of the
current master.

Please find the attached four files.

0001-Add-test-for-postgres_fdw-foreign-parition-update.patch

 This should fail for unpatched postgres_fdw. (Just for demonstration)

0002-Core-side-modification-for-PgFDW-foreign-update-fix.patch

 Core-side change which allows fdw_scan_tlist to have extra
 columns that are not defined in the base relation.

0003-Fix-of-foreign-update-bug-of-PgFDW.patch

 Fix of postgres_fdw for this problem.

0004-Regtest-change-for-PgFDW-foreign-update-fix.patch

 Regression test change separated for readability.
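
What the test in 0001 exercises can be sketched with a toy model. This is an illustration only, not postgres_fdw code: RemoteRow, update_by_ctid() and update_by_tableoid_ctid() are hypothetical names. Two remote child tables can each hold a row at the same ctid, so a remote update qualified by ctid alone hits both, while one qualified by (tableoid, ctid) hits exactly one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a row stored on the remote server. */
typedef struct RemoteRow
{
    unsigned int tableoid;  /* remote child table that holds the row */
    unsigned int block;     /* ctid block number */
    unsigned int offset;    /* ctid offset within the block */
    int          a;
    int          b;
} RemoteRow;

/* Mimics "UPDATE ... SET b = $2 WHERE ctid = $1": tableoid is ignored. */
int
update_by_ctid(RemoteRow *rows, size_t nrows,
               unsigned int block, unsigned int offset, int newb)
{
    int    updated = 0;
    size_t i;

    assert(rows != NULL || nrows == 0);

    for (i = 0; i < nrows; i++)
    {
        if (rows[i].block == block && rows[i].offset == offset)
        {
            rows[i].b = newb;
            updated++;
        }
    }
    return updated;
}

/* Same update, but the row is identified by (tableoid, ctid). */
int
update_by_tableoid_ctid(RemoteRow *rows, size_t nrows,
                        unsigned int tableoid,
                        unsigned int block, unsigned int offset, int newb)
{
    int    updated = 0;
    size_t i;

    assert(rows != NULL || nrows == 0);

    for (i = 0; i < nrows; i++)
    {
        if (rows[i].tableoid == tableoid &&
            rows[i].block == block && rows[i].offset == offset)
        {
            rows[i].b = newb;
            updated++;
        }
    }
    return updated;
}
```

With one row at ctid (0,1) in each of two child tables, the ctid-only variant reports two updated rows and the (tableoid, ctid) variant reports one.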

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

From fba71c319d1008f6dc198b8585c41f7ff0a708f1 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Fri, 24 Aug 2018 15:39:14 +0900
Subject: [PATCH 1/4] Add test for postgres_fdw foreign parition update

This adds a test for the failure to update a foreign partitioned table
due to the lack of distinction between remote child tables. This should fail.
---
 contrib/postgres_fdw/expected/postgres_fdw.out | 62 ++++++++++++++++++++++++++
 contrib/postgres_fdw/sql/postgres_fdw.sql      | 30 +++++++++++++
 2 files changed, 92 insertions(+)

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index d912bd9d54..dd4864f006 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -7568,6 +7568,68 @@ drop table loct1;
 drop table loct2;
 drop table parent;
 -- ===================================================================
+-- test update foreign partiton table
+-- ===================================================================
+CREATE TABLE p1 (a int, b int);
+CREATE TABLE c1 (LIKE p1) INHERITS (p1);
+NOTICE:  merging column "a" with inherited definition
+NOTICE:  merging column "b" with inherited definition
+CREATE TABLE c2 (LIKE p1) INHERITS (p1);
+NOTICE:  merging column "a" with inherited definition
+NOTICE:  merging column "b" with inherited definition
+CREATE FOREIGN TABLE fp1 (a int, b int)
+ SERVER loopback OPTIONS (table_name 'p1');
+INSERT INTO c1 VALUES (0, 1);
+INSERT INTO c2 VALUES (1, 1);
+SELECT tableoid::int - (SELECT min(tableoid) FROM fp1)::int AS toiddiff, ctid, * FROM fp1;
+ toiddiff | ctid  | a | b 
+----------+-------+---+---
+        0 | (0,1) | 0 | 1
+        0 | (0,1) | 1 | 1
+(2 rows)
+
+-- random() causes non-direct foreign update
+EXPLAIN (VERBOSE, COSTS OFF)
+     UPDATE fp1 SET b = b + 1 WHERE a = 0 and random() <= 1;
+                                   QUERY PLAN                                    
+---------------------------------------------------------------------------------
+ Update on public.fp1
+   Remote SQL: UPDATE public.p1 SET b = $2 WHERE ctid = $1
+   ->  Foreign Scan on public.fp1
+         Output: a, (b + 1), ctid
+         Filter: (random() <= '1'::double precision)
+         Remote SQL: SELECT a, b, ctid FROM public.p1 WHERE ((a = 0)) FOR UPDATE
+(6 rows)
+
+UPDATE fp1 SET b = b + 1 WHERE a = 0 and random() <= 1;
+-- Only one tuple should be updated
+SELECT tableoid::int - (SELECT min(tableoid) FROM fp1)::int AS toiddiff, ctid, * FROM fp1;
+ toiddiff | ctid  | a | b 
+----------+-------+---+---
+        0 | (0,2) | 0 | 2
+        0 | (0,1) | 1 | 1
+(2 rows)
+
+-- Reset ctid
+TRUNCATE c1;
+TRUNCATE c2;
+INSERT INTO c1 VALUES (0, 1);
+INSERT INTO c2 VALUES (1, 1);
+DELETE FROM fp1 WHERE a = 1 and random() <= 1;
+-- Only one tuple should be deleted
+SELECT tableoid::int - (SELECT min(tableoid) FROM fp1)::int AS toiddiff, ctid, * FROM fp1;
+ toiddiff | ctid  | a | b 
+----------+-------+---+---
+        0 | (0,1) | 0 | 1
+(1 row)
+
+-- cleanup
+DROP FOREIGN TABLE fp1;
+DROP TABLE p1 CASCADE;
+NOTICE:  drop cascades to 2 other objects
+DETAIL:  drop cascades to table c1
+drop cascades to table c2
+-- ===================================================================
 -- test tuple routing for foreign-table partitions
 -- ===================================================================
 -- Test insert tuple routing
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index c0b0dd949b..a821173a90 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -1846,6 +1846,36 @@ drop table loct1;
 drop table loct2;
 drop table parent;
 
+-- ===================================================================
+-- test update foreign partiton table
+-- ===================================================================
+CREATE TABLE p1 (a int, b int);
+CREATE TABLE c1 (LIKE p1) INHERITS (p1);
+CREATE TABLE c2 (LIKE p1) INHERITS (p1);
+CREATE FOREIGN TABLE fp1 (a int, b int)
+ SERVER loopback OPTIONS (table_name 'p1');
+INSERT INTO c1 VALUES (0, 1);
+INSERT INTO c2 VALUES (1, 1);
+SELECT tableoid::int - (SELECT min(tableoid) FROM fp1)::int AS toiddiff, ctid, * FROM fp1;
+-- random() causes non-direct foreign update
+EXPLAIN (VERBOSE, COSTS OFF)
+     UPDATE fp1 SET b = b + 1 WHERE a = 0 and random() <= 1;
+UPDATE fp1 SET b = b + 1 WHERE a = 0 and random() <= 1;
+-- Only one tuple should be updated
+SELECT tableoid::int - (SELECT min(tableoid) FROM fp1)::int AS toiddiff, ctid, * FROM fp1;
+-- Reset ctid
+TRUNCATE c1;
+TRUNCATE c2;
+INSERT INTO c1 VALUES (0, 1);
+INSERT INTO c2 VALUES (1, 1);
+DELETE FROM fp1 WHERE a = 1 and random() <= 1;
+-- Only one tuple should be deleted
+SELECT tableoid::int - (SELECT min(tableoid) FROM fp1)::int AS toiddiff, ctid, * FROM fp1;
+
+-- cleanup
+DROP FOREIGN TABLE fp1;
+DROP TABLE p1 CASCADE;
+
 -- ===================================================================
 -- test tuple routing for foreign-table partitions
 -- ===================================================================
-- 
2.16.3

From 4e4759fbafc6c364b48cb35e8403725c56a69932 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Fri, 24 Aug 2018 13:07:08 +0900
Subject: [PATCH 2/4] Core side modification for PgFDW foreign update fix

Currently core doesn't allow adding a column that is not in the relation
definition to the columns in fdw_scan_tlist. This patch allows that.
---
 src/backend/optimizer/util/plancat.c | 72 +++++++++++++++++++++++++++++++++++-
 src/backend/utils/adt/ruleutils.c    | 35 +++++++++++++++---
 2 files changed, 101 insertions(+), 6 deletions(-)

diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 8369e3ad62..abef30dfd4 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -33,6 +33,7 @@
 #include "foreign/fdwapi.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
+#include "nodes/nodeFuncs.h"
 #include "optimizer/clauses.h"
 #include "optimizer/cost.h"
 #include "optimizer/plancat.h"
@@ -58,7 +59,15 @@ int            constraint_exclusion = CONSTRAINT_EXCLUSION_PARTITION;
 /* Hook for plugins to get control in get_relation_info() */
 get_relation_info_hook_type get_relation_info_hook = NULL;
 
+/* context type for max_varattno() */
+typedef struct
+{
+    AttrNumber  maxattrnum;
+    Index        varno;
+} max_varattno_context;
 
+static bool max_varattno_walker(Node *node, max_varattno_context *context);
+static AttrNumber max_varattno(List *tlist, Index varno);
 static void get_relation_foreign_keys(PlannerInfo *root, RelOptInfo *rel,
                           Relation relation, bool inhparent);
 static bool infer_collation_opclass_match(InferenceElem *elem, Relation idxRel,
@@ -76,6 +85,43 @@ static PartitionScheme find_partition_scheme(PlannerInfo *root, Relation rel);
 static void set_baserel_partition_key_exprs(Relation relation,
                                 RelOptInfo *rel);
 
+/*
+ * max_varattno
+ *   Find the largest varattno in targetlist
+ *
+ * FDWs may add junk columns for internal usage. This function finds the
+ * maximum attribute number in the tlist.
+ */
+static AttrNumber
+max_varattno(List *tlist, Index varno)
+{
+    max_varattno_context context;
+
+    context.maxattrnum = FirstLowInvalidHeapAttributeNumber;
+    context.varno = varno;
+
+    max_varattno_walker((Node*) tlist, &context);
+
+    return context.maxattrnum;
+}
+
+static bool
+max_varattno_walker(Node *node, max_varattno_context *context)
+{
+    if (node == NULL)
+        return false;
+    if (IsA(node, Var))
+    {
+        Var    *var = (Var *) node;
+
+        if (var->varno == context->varno && var->varlevelsup == 0 &&
+            context->maxattrnum < var->varattno)
+            context->maxattrnum = var->varattno;
+        return false;
+    }
+    return expression_tree_walker(node, max_varattno_walker, (void *)context);
+}
+
 /*
  * get_relation_info -
  *      Retrieves catalog information for a given relation.
@@ -112,6 +158,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
     Relation    relation;
     bool        hasindex;
     List       *indexinfos = NIL;
+    AttrNumber  max_attrnum;
 
     /*
      * We need not lock the relation since it was already locked, either by
@@ -126,8 +173,18 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                  errmsg("cannot access temporary or unlogged relations during recovery")));
 
+    max_attrnum = RelationGetNumberOfAttributes(relation);
+
+    /* Foreign table may have expanded this relation with junk columns */
+    if (root->simple_rte_array[varno]->relkind == RELKIND_FOREIGN_TABLE)
+    {
+        AttrNumber maxattno = max_varattno(root->parse->targetList, varno);
+        if (max_attrnum < maxattno)
+            max_attrnum = maxattno;
+    }
+
     rel->min_attr = FirstLowInvalidHeapAttributeNumber + 1;
-    rel->max_attr = RelationGetNumberOfAttributes(relation);
+    rel->max_attr = max_attrnum;
     rel->reltablespace = RelationGetForm(relation)->reltablespace;
 
     Assert(rel->max_attr >= rel->min_attr);
@@ -1575,6 +1632,19 @@ build_physical_tlist(PlannerInfo *root, RelOptInfo *rel)
             relation = heap_open(rte->relid, NoLock);
 
             numattrs = RelationGetNumberOfAttributes(relation);
+
+            /*
+             * Foreign tables may have been expanded with some junk columns.
+             * Punt in that case.
+             */
+            if (numattrs < rel->max_attr)
+            {
+                Assert(root->simple_rte_array[rel->relid]->relkind ==
+                       RELKIND_FOREIGN_TABLE);
+                heap_close(relation, NoLock);
+                break;
+            }
+
             for (attrno = 1; attrno <= numattrs; attrno++)
             {
                 Form_pg_attribute att_tup = TupleDescAttr(relation->rd_att,
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 03e9a28a63..d9d525e896 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -3815,16 +3815,42 @@ set_relation_column_names(deparse_namespace *dpns, RangeTblEntry *rte,
         tupdesc = RelationGetDescr(rel);
 
         ncolumns = tupdesc->natts;
+
+        /* eref may hold names of junk columns  */
+        if (ncolumns < list_length(rte->eref->colnames))
+            ncolumns = list_length(rte->eref->colnames);
+
         real_colnames = (char **) palloc(ncolumns * sizeof(char *));
 
         for (i = 0; i < ncolumns; i++)
         {
-            Form_pg_attribute attr = TupleDescAttr(tupdesc, i);
+            if (i < tupdesc->natts)
+            {
+                Form_pg_attribute attr = TupleDescAttr(tupdesc, i);
 
-            if (attr->attisdropped)
-                real_colnames[i] = NULL;
+                if (attr->attisdropped)
+                    real_colnames[i] = NULL;
+                else
+                    real_colnames[i] = pstrdup(NameStr(attr->attname));
+            }
             else
-                real_colnames[i] = pstrdup(NameStr(attr->attname));
+            {
+                /*
+                 * This column is an extended column, the name of which may
+                 * be stored in eref
+                 */
+                if (i < list_length(rte->eref->colnames))
+                {
+                    char *cname = strVal(list_nth(rte->eref->colnames, i));
+
+                    if (cname[0] == '\0')
+                        real_colnames[i] = NULL;
+                    else
+                        real_colnames[i] = cname;
+                }
+                else
+                    real_colnames[i] = NULL;
+            }
         }
         relation_close(rel, AccessShareLock);
     }
@@ -4152,7 +4178,6 @@ set_join_column_names(deparse_namespace *dpns, RangeTblEntry *rte,
     for (jc = 0; jc < rightcolinfo->num_new_cols; jc++)
     {
         char       *child_colname = rightcolinfo->new_colnames[jc];
-
         if (!rightcolinfo->is_new_col[jc])
         {
             /* Advance ic to next non-dropped old column of right child */
-- 
2.16.3

From 91d6ab8b8597f3df4003e3d946157124b9df1c02 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Fri, 24 Aug 2018 16:17:24 +0900
Subject: [PATCH 3/4] Fix of foreign update bug of PgFDW

postgres_fdw misbehaves when updating a foreign table that points to a
remote partitioned table and direct modify is not used. This is because
postgres_fdw overlooks that two different remote tuples may have the
same ctid in that case. With this patch, it uses the remote tableoid in
addition to ctid to identify a remote tuple.
---
 contrib/postgres_fdw/deparse.c      | 153 +++++++++++--------
 contrib/postgres_fdw/postgres_fdw.c | 291 +++++++++++++++++++++++++++++++-----
 2 files changed, 346 insertions(+), 98 deletions(-)
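
Note on parameter numbering: with this patch the remote UPDATE takes
tableoid as $1 and ctid as $2, so the SET columns start at $3. The
following is a minimal standalone sketch of that numbering, not code from
the patch. build_remote_update() and its arguments are hypothetical names
used only for illustration, and it assumes at least one SET column.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of postgres_fdw: formats a remote UPDATE
 * using the parameter numbering of the patched deparseUpdateSql().
 * $1 is tableoid, $2 is ctid, and the SET columns start at $3.
 * Returns the number of characters written, or -1 if buf is too small.
 */
static int
build_remote_update(char *buf, size_t buflen, const char *relname,
                    const char *const *cols, int ncols)
{
    int     pindex = 3;     /* tableoid and ctid always precede */
    size_t  used;
    int     n;
    int     i;

    n = snprintf(buf, buflen, "UPDATE %s SET ", relname);
    if (n < 0 || (size_t) n >= buflen)
        return -1;
    used = (size_t) n;

    for (i = 0; i < ncols; i++)
    {
        /* Separate assignments with ", " after the first one. */
        n = snprintf(buf + used, buflen - used, "%s%s = $%d",
                     i > 0 ? ", " : "", cols[i], pindex++);
        if (n < 0 || (size_t) n >= buflen - used)
            return -1;
        used += (size_t) n;
    }

    /* Both junk columns are needed to identify one remote tuple. */
    n = snprintf(buf + used, buflen - used,
                 " WHERE tableoid = $1 AND ctid = $2");
    if (n < 0 || (size_t) n >= buflen - used)
        return -1;
    used += (size_t) n;

    return (int) used;
}
```

Called with relname "public.plt" and the single column "b", the sketch
produces the same shape as the Remote SQL lines in the updated regression
output, with tableoid and ctid bound before any column value.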

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 6001f4d25e..6e8cd016a3 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -1037,6 +1037,15 @@ deparseSelectSql(List *tlist, bool is_subquery, List **retrieved_attrs,
          */
         deparseExplicitTargetList(tlist, false, retrieved_attrs, context);
     }
+    else if (tlist != NIL)
+    {
+        /*
+         * The given tlist is the base relation's tlist expanded with junk
+         * columns.
+         */
+        context->params_list = NULL;
+        deparseExplicitTargetList(tlist, false, retrieved_attrs, context);
+    }
     else
     {
         /*
@@ -1088,6 +1097,42 @@ deparseFromExpr(List *quals, deparse_expr_cxt *context)
     }
 }
 
+/*
+ * Adds one element to the target/returning list if it is in attrs_used.
+ *
+ * If deparsestr is given, just use it. Otherwise resolve the name using rte.
+ */
+static inline void
+deparseAddTargetListItem(StringInfo buf,
+                         List **retrieved_attrs, Bitmapset *attrs_used,
+                         Index rtindex, AttrNumber attnum,
+                         char *deparsestr, RangeTblEntry *rte,
+                         bool is_returning, bool qualify_col,
+                         bool have_wholerow, bool *first)
+{
+    if (!have_wholerow &&
+        !bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber, attrs_used))
+        return;
+
+    if (!*first)
+        appendStringInfoString(buf, ", ");
+    else if (is_returning)
+        appendStringInfoString(buf, " RETURNING ");
+    *first = false;
+
+    if (deparsestr)
+    {
+        if (qualify_col)
+            ADD_REL_QUALIFIER(buf, rtindex);
+
+        appendStringInfoString(buf, deparsestr);
+    }
+    else
+        deparseColumnRef(buf, rtindex, attnum, rte, qualify_col);
+
+    *retrieved_attrs = lappend_int(*retrieved_attrs, attnum);
+}
+
 /*
  * Emit a target list that retrieves the columns specified in attrs_used.
  * This is used for both SELECT and RETURNING targetlists; the is_returning
@@ -1128,58 +1173,28 @@ deparseTargetList(StringInfo buf,
         if (attr->attisdropped)
             continue;
 
-        if (have_wholerow ||
-            bms_is_member(i - FirstLowInvalidHeapAttributeNumber,
-                          attrs_used))
-        {
-            if (!first)
-                appendStringInfoString(buf, ", ");
-            else if (is_returning)
-                appendStringInfoString(buf, " RETURNING ");
-            first = false;
-
-            deparseColumnRef(buf, rtindex, i, rte, qualify_col);
-
-            *retrieved_attrs = lappend_int(*retrieved_attrs, i);
-        }
+        deparseAddTargetListItem(buf, retrieved_attrs, attrs_used,
+                                 rtindex, i, NULL, rte,
+                                 is_returning, qualify_col, have_wholerow,
+                                 &first);
     }
 
     /*
-     * Add ctid and oid if needed.  We currently don't support retrieving any
-     * other system columns.
+     * Add ctid, oid and tableoid if needed. The attribute name and number are
+     * assigned in postgresAddForeignUpdateTargets. We currently don't support
+     * retrieving any other system columns.
      */
-    if (bms_is_member(SelfItemPointerAttributeNumber - FirstLowInvalidHeapAttributeNumber,
-                      attrs_used))
-    {
-        if (!first)
-            appendStringInfoString(buf, ", ");
-        else if (is_returning)
-            appendStringInfoString(buf, " RETURNING ");
-        first = false;
-
-        if (qualify_col)
-            ADD_REL_QUALIFIER(buf, rtindex);
-        appendStringInfoString(buf, "ctid");
-
-        *retrieved_attrs = lappend_int(*retrieved_attrs,
-                                       SelfItemPointerAttributeNumber);
-    }
-    if (bms_is_member(ObjectIdAttributeNumber - FirstLowInvalidHeapAttributeNumber,
-                      attrs_used))
-    {
-        if (!first)
-            appendStringInfoString(buf, ", ");
-        else if (is_returning)
-            appendStringInfoString(buf, " RETURNING ");
-        first = false;
-
-        if (qualify_col)
-            ADD_REL_QUALIFIER(buf, rtindex);
-        appendStringInfoString(buf, "oid");
-
-        *retrieved_attrs = lappend_int(*retrieved_attrs,
-                                       ObjectIdAttributeNumber);
-    }
+    deparseAddTargetListItem(buf, retrieved_attrs, attrs_used,
+                             rtindex, tupdesc->natts + 1, "tableoid",
+                             NULL, is_returning, qualify_col, false, &first);
+
+    deparseAddTargetListItem(buf, retrieved_attrs, attrs_used,
+                             rtindex, SelfItemPointerAttributeNumber, "ctid",
+                             NULL, is_returning, qualify_col, false, &first);
+
+    deparseAddTargetListItem(buf, retrieved_attrs, attrs_used,
+                             rtindex, ObjectIdAttributeNumber, "oid",
+                             NULL, is_returning, qualify_col, false, &first);
 
     /* Don't generate bad syntax if no undropped columns */
     if (first && !is_returning)
@@ -1728,7 +1743,7 @@ deparseUpdateSql(StringInfo buf, RangeTblEntry *rte,
     deparseRelation(buf, rel);
     appendStringInfoString(buf, " SET ");
 
-    pindex = 2;                    /* ctid is always the first param */
+    pindex = 3;                    /* tableoid and ctid always precede */
     first = true;
     foreach(lc, targetAttrs)
     {
@@ -1742,7 +1757,7 @@ deparseUpdateSql(StringInfo buf, RangeTblEntry *rte,
         appendStringInfo(buf, " = $%d", pindex);
         pindex++;
     }
-    appendStringInfoString(buf, " WHERE ctid = $1");
+    appendStringInfoString(buf, " WHERE tableoid = $1 AND ctid = $2");
 
     deparseReturningList(buf, rte, rtindex, rel,
                          rel->trigdesc && rel->trigdesc->trig_update_after_row,
@@ -1858,7 +1873,7 @@ deparseDeleteSql(StringInfo buf, RangeTblEntry *rte,
 {
     appendStringInfoString(buf, "DELETE FROM ");
     deparseRelation(buf, rel);
-    appendStringInfoString(buf, " WHERE ctid = $1");
+    appendStringInfoString(buf, " WHERE tableoid = $1 AND ctid = $2");
 
     deparseReturningList(buf, rte, rtindex, rel,
                          rel->trigdesc && rel->trigdesc->trig_delete_after_row,
@@ -2160,9 +2175,11 @@ deparseColumnRef(StringInfo buf, int varno, int varattno, RangeTblEntry *rte,
     }
     else
     {
-        char       *colname = NULL;
+        char *colname = NULL;
         List       *options;
         ListCell   *lc;
+        Relation rel;
+        int natts;
 
         /* varno must not be any of OUTER_VAR, INNER_VAR and INDEX_VAR. */
         Assert(!IS_SPECIAL_VARNO(varno));
@@ -2171,16 +2188,34 @@ deparseColumnRef(StringInfo buf, int varno, int varattno, RangeTblEntry *rte,
          * If it's a column of a foreign table, and it has the column_name FDW
          * option, use that value.
          */
-        options = GetForeignColumnOptions(rte->relid, varattno);
-        foreach(lc, options)
-        {
-            DefElem    *def = (DefElem *) lfirst(lc);
+        rel = heap_open(rte->relid, NoLock);
+        natts = RelationGetNumberOfAttributes(rel);
+        heap_close(rel, NoLock);
 
-            if (strcmp(def->defname, "column_name") == 0)
+        if (rte->relkind == RELKIND_FOREIGN_TABLE)
+        {
+            if (varattno > 0 && varattno <= natts)
             {
-                colname = defGetString(def);
-                break;
+                options = GetForeignColumnOptions(rte->relid, varattno);
+                foreach(lc, options)
+                {
+                    DefElem    *def = (DefElem *) lfirst(lc);
+
+                    if (strcmp(def->defname, "column_name") == 0)
+                    {
+                        colname = defGetString(def);
+                        break;
+                    }
+                }
             }
+            else if (varattno == natts + 1)
+            {
+                /* This should be an additional junk column */
+                colname = "tableoid";
+            }
+            else
+                elog(ERROR, "name resolution failed for attribute %d of relation %u",
+                     varattno, rte->relid);
         }
 
         /*
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 0803c30a48..2148867da8 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -179,6 +179,7 @@ typedef struct PgFdwModifyState
 
     /* info about parameters for prepared statement */
     AttrNumber    ctidAttno;        /* attnum of input resjunk ctid column */
+    AttrNumber    toidAttno;        /* attnum of input resjunk tableoid column */
     int            p_nums;            /* number of parameters to transmit */
     FmgrInfo   *p_flinfo;        /* output conversion functions for them */
 
@@ -283,6 +284,12 @@ static void postgresGetForeignRelSize(PlannerInfo *root,
 static void postgresGetForeignPaths(PlannerInfo *root,
                         RelOptInfo *baserel,
                         Oid foreigntableid);
+static List *generate_scan_tlist_for_relation(PlannerInfo *root,
+                                              RelOptInfo *foreignrel,
+                                              Oid foreigntableoid,
+                                              PgFdwRelationInfo *fpinfo,
+                                              List *tlist,
+                                              List *recheck_quals);
 static ForeignScan *postgresGetForeignPlan(PlannerInfo *root,
                        RelOptInfo *foreignrel,
                        Oid foreigntableid,
@@ -392,6 +399,7 @@ static PgFdwModifyState *create_foreign_modify(EState *estate,
                       List *retrieved_attrs);
 static void prepare_foreign_modify(PgFdwModifyState *fmstate);
 static const char **convert_prep_stmt_params(PgFdwModifyState *fmstate,
+                         Oid tableoid,
                          ItemPointer tupleid,
                          TupleTableSlot *slot);
 static void store_returning_result(PgFdwModifyState *fmstate,
@@ -1117,6 +1125,109 @@ postgresGetForeignPaths(PlannerInfo *root,
     }
 }
 
+/*
+ * generate_scan_tlist_for_relation :
+ *    Constructs fdw_scan_tlist from the following sources.
+ *
+ * We may have appended tableoid and ctid junk columns to the parse
+ * targetlist. In that case we need to give the planner an alternative scan
+ * tlist. This function returns a tlist consisting of the following
+ * attributes, in this order.
+ *
+ * 1. Relation attributes requested by user and needed for recheck
+ *    - fpinfo->attrs_used, fdw_recheck_quals and given tlist.
+ * 2. Junk columns and others in root->processed_tlist which are not added by 1
+ *
+ * If no junk column exists, returns NIL.
+ */
+static List *
+generate_scan_tlist_for_relation(PlannerInfo *root,
+                                 RelOptInfo *foreignrel, Oid foreigntableoid,
+                                 PgFdwRelationInfo *fpinfo,
+                                 List *tlist, List *recheck_quals)
+{
+    Index        frelid = foreignrel->relid;
+    List       *fdw_scan_tlist = NIL;
+    Relation    frel;
+    int            base_nattrs;
+    ListCell   *lc;
+    Bitmapset *attrs = NULL;
+    int attnum;
+
+    /*
+     * RelOptInfo has an expanded number of attributes. Check it against the
+     * base relation's attribute count to determine whether an alternative
+     * scan target list is needed.
+     */
+    frel = heap_open(foreigntableoid, NoLock);
+    base_nattrs = RelationGetNumberOfAttributes(frel);
+    heap_close(frel, NoLock);
+
+    if (base_nattrs == foreignrel->max_attr)
+        return NIL;
+
+    /* We have junk columns. Construct alternative scan target list. */
+
+    /* collect needed relation attributes */
+    attrs = bms_copy(fpinfo->attrs_used);
+    pull_varattnos((Node *)recheck_quals, frelid, &attrs);
+    pull_varattnos((Node *)tlist, frelid, &attrs);
+
+    /* Add relation's attributes  */
+    while ((attnum = bms_first_member(attrs)) >= 0)
+    {
+        TargetEntry *tle;
+        Form_pg_attribute attr;
+        Var *var;
+        char *name = NULL;
+
+        attnum += FirstLowInvalidHeapAttributeNumber;
+        if (attnum < 1)
+            continue;
+        if (attnum > base_nattrs)
+            break;
+
+        attr = TupleDescAttr(frel->rd_att, attnum - 1);
+        if (attr->attisdropped)
+            var = (Var *) makeNullConst(INT4OID, -1, InvalidOid);
+        else
+        {
+            var = makeVar(frelid, attnum,
+                          attr->atttypid, attr->atttypmod,
+                          attr->attcollation, 0);
+            name = pstrdup(NameStr(attr->attname));
+        }
+
+        tle = makeTargetEntry((Expr *)var,
+                              list_length(fdw_scan_tlist) + 1,
+                              name,
+                              false);
+        fdw_scan_tlist = lappend(fdw_scan_tlist, tle);
+    }
+
+    /* Add junk attributes  */
+    foreach (lc, root->processed_tlist)
+    {
+        TargetEntry *tle = lfirst_node(TargetEntry, lc);
+        Var *var = (Var *) tle->expr;
+
+        /*
+         * We aren't interested in non-Vars, Vars of other rels, or base
+         * attributes.
+         */
+        if (IsA(var, Var) && var->varno == frelid &&
+            (var->varattno > base_nattrs || var->varattno < 1))
+        {
+            Assert(tle->resjunk);
+            tle = copyObject(tle);
+            tle->resno = list_length(fdw_scan_tlist) + 1;
+            fdw_scan_tlist = lappend(fdw_scan_tlist, tle);
+        }
+    }
+
+    return fdw_scan_tlist;
+}
+
 /*
  * postgresGetForeignPlan
  *        Create ForeignScan plan node which implements selected best path
@@ -1140,10 +1251,11 @@ postgresGetForeignPlan(PlannerInfo *root,
     List       *fdw_recheck_quals = NIL;
     List       *retrieved_attrs;
     StringInfoData sql;
-    ListCell   *lc;
 
     if (IS_SIMPLE_REL(foreignrel))
     {
+        ListCell *lc;
+
         /*
          * For base relations, set scan_relid as the relid of the relation.
          */
@@ -1191,6 +1303,17 @@ postgresGetForeignPlan(PlannerInfo *root,
          * should recheck all the remote quals.
          */
         fdw_recheck_quals = remote_exprs;
+
+        /*
+         * We may have added tableoid and ctid as junk columns to the
+         * targetlist. Generate fdw_scan_tlist in that case.
+         */
+        fdw_scan_tlist = generate_scan_tlist_for_relation(root,
+                                                          foreignrel,
+                                                          foreigntableid,
+                                                          fpinfo,
+                                                          tlist,
+                                                          fdw_recheck_quals);
     }
     else
     {
@@ -1383,16 +1506,12 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
      * into local representation and error reporting during that process.
      */
     if (fsplan->scan.scanrelid > 0)
-    {
         fsstate->rel = node->ss.ss_currentRelation;
-        fsstate->tupdesc = RelationGetDescr(fsstate->rel);
-    }
     else
-    {
         fsstate->rel = NULL;
-        fsstate->tupdesc = node->ss.ss_ScanTupleSlot->tts_tupleDescriptor;
-    }
 
+    /* Always use the tuple descriptor provided by core */
+    fsstate->tupdesc = node->ss.ss_ScanTupleSlot->tts_tupleDescriptor;
     fsstate->attinmeta = TupleDescGetAttInMetadata(fsstate->tupdesc);
 
     /*
@@ -1543,22 +1662,30 @@ postgresAddForeignUpdateTargets(Query *parsetree,
     Var           *var;
     const char *attrname;
     TargetEntry *tle;
+    int            varattno = RelationGetNumberOfAttributes(target_relation) + 1;
 
     /*
-     * In postgres_fdw, what we need is the ctid, same as for a regular table.
+     * In postgres_fdw, what we need is the tableoid and ctid, same as for a
+     * regular table.
      */
 
-    /* Make a Var representing the desired value */
+    /*
+     * Table OID needs to be retrieved as a non-system junk column in the
+     * returning tuple. We add it as a column after all regular columns.
+     */
+    attrname = "tableoid";
     var = makeVar(parsetree->resultRelation,
-                  SelfItemPointerAttributeNumber,
-                  TIDOID,
+                  varattno++,
+                  OIDOID,
                   -1,
                   InvalidOid,
                   0);
 
-    /* Wrap it in a resjunk TLE with the right name ... */
-    attrname = "ctid";
-
+    /*
+     * Wrap it in a resjunk TLE with a name the FDW can look up later. The TLE
+     * does not appear to be freed explicitly, but give it a pstrdup'ed string
+     * just in case.
+     */
     tle = makeTargetEntry((Expr *) var,
                           list_length(parsetree->targetList) + 1,
                           pstrdup(attrname),
@@ -1566,6 +1693,29 @@ postgresAddForeignUpdateTargets(Query *parsetree,
 
     /* ... and add it to the query's targetlist */
     parsetree->targetList = lappend(parsetree->targetList, tle);
+
+    /* ... also needs to have colname entry */
+    target_rte->eref->colnames =
+        lappend(target_rte->eref->colnames, makeString(pstrdup(attrname)));
+
+
+    /* Do the same for ctid */
+    attrname = "ctid";
+    var = makeVar(parsetree->resultRelation,
+                  SelfItemPointerAttributeNumber,
+                  TIDOID,
+                  -1,
+                  InvalidOid,
+                  0);
+
+    tle = makeTargetEntry((Expr *) var,
+                          list_length(parsetree->targetList) + 1,
+                          pstrdup(attrname),
+                          true);
+
+    parsetree->targetList = lappend(parsetree->targetList, tle);
+    target_rte->eref->colnames =
+        lappend(target_rte->eref->colnames, makeString(pstrdup(attrname)));
 }
 
 /*
@@ -1769,7 +1919,7 @@ postgresExecForeignInsert(EState *estate,
         prepare_foreign_modify(fmstate);
 
     /* Convert parameters needed by prepared statement to text form */
-    p_values = convert_prep_stmt_params(fmstate, NULL, slot);
+    p_values = convert_prep_stmt_params(fmstate, InvalidOid, NULL, slot);
 
     /*
      * Execute the prepared statement.
@@ -1824,7 +1974,7 @@ postgresExecForeignUpdate(EState *estate,
                           TupleTableSlot *planSlot)
 {
     PgFdwModifyState *fmstate = (PgFdwModifyState *) resultRelInfo->ri_FdwState;
-    Datum        datum;
+    Datum        toiddatum, ctiddatum;
     bool        isNull;
     const char **p_values;
     PGresult   *res;
@@ -1835,17 +1985,26 @@ postgresExecForeignUpdate(EState *estate,
         prepare_foreign_modify(fmstate);
 
     /* Get the ctid that was passed up as a resjunk column */
-    datum = ExecGetJunkAttribute(planSlot,
-                                 fmstate->ctidAttno,
-                                 &isNull);
+    toiddatum = ExecGetJunkAttribute(planSlot,
+                                     fmstate->toidAttno,
+                                     &isNull);
+    /* shouldn't ever get a null result... */
+    if (isNull)
+        elog(ERROR, "tableoid is NULL");
+
+    /* Get the ctid that was passed up as a resjunk column */
+    ctiddatum = ExecGetJunkAttribute(planSlot,
+                                     fmstate->ctidAttno,
+                                     &isNull);
     /* shouldn't ever get a null result... */
     if (isNull)
         elog(ERROR, "ctid is NULL");
 
     /* Convert parameters needed by prepared statement to text form */
     p_values = convert_prep_stmt_params(fmstate,
-                                        (ItemPointer) DatumGetPointer(datum),
-                                        slot);
+                                    DatumGetObjectId(toiddatum),
+                                    (ItemPointer) DatumGetPointer(ctiddatum),
+                                    slot);
 
     /*
      * Execute the prepared statement.
@@ -1900,7 +2059,7 @@ postgresExecForeignDelete(EState *estate,
                           TupleTableSlot *planSlot)
 {
     PgFdwModifyState *fmstate = (PgFdwModifyState *) resultRelInfo->ri_FdwState;
-    Datum        datum;
+    Datum        ctiddatum, toiddatum;
     bool        isNull;
     const char **p_values;
     PGresult   *res;
@@ -1911,17 +2070,26 @@ postgresExecForeignDelete(EState *estate,
         prepare_foreign_modify(fmstate);
 
     /* Get the ctid that was passed up as a resjunk column */
-    datum = ExecGetJunkAttribute(planSlot,
-                                 fmstate->ctidAttno,
-                                 &isNull);
+    toiddatum = ExecGetJunkAttribute(planSlot,
+                                     fmstate->toidAttno,
+                                     &isNull);
+    /* shouldn't ever get a null result... */
+    if (isNull)
+        elog(ERROR, "tableoid is NULL");
+
+    /* Get the ctid that was passed up as a resjunk column */
+    ctiddatum = ExecGetJunkAttribute(planSlot,
+                                     fmstate->ctidAttno,
+                                     &isNull);
     /* shouldn't ever get a null result... */
     if (isNull)
         elog(ERROR, "ctid is NULL");
 
     /* Convert parameters needed by prepared statement to text form */
     p_values = convert_prep_stmt_params(fmstate,
-                                        (ItemPointer) DatumGetPointer(datum),
-                                        NULL);
+                                    DatumGetObjectId(toiddatum),
+                                    (ItemPointer) DatumGetPointer(ctiddatum),
+                                    NULL);
 
     /*
      * Execute the prepared statement.
@@ -2303,6 +2471,28 @@ postgresPlanDirectModify(PlannerInfo *root,
                                                    returningList);
     }
 
+    /*
+     * The junk columns in the targetlist are no longer needed for FDW direct
+     * modify. Strip them so that the planner doesn't bother with them.
+     */
+    if (fscan->scan.scanrelid > 0 && fscan->fdw_scan_tlist != NIL)
+    {
+        List *newtlist = NIL;
+        ListCell *lc;
+
+        fscan->fdw_scan_tlist = NIL;
+        foreach (lc, subplan->targetlist)
+        {
+            TargetEntry *tle = lfirst_node(TargetEntry, lc);
+
+            /* once found junk, all the rest are also junk */
+            if (tle->resjunk)
+                continue;
+            newtlist = lappend(newtlist, tle);
+        }
+        subplan->targetlist = newtlist;
+    }
+
     /*
      * Construct the SQL command string.
      */
@@ -2349,7 +2539,7 @@ postgresPlanDirectModify(PlannerInfo *root,
     /*
      * Update the foreign-join-related fields.
      */
-    if (fscan->scan.scanrelid == 0)
+    if (fscan->fdw_scan_tlist != NIL || fscan->scan.scanrelid == 0)
     {
         /* No need for the outer subplan. */
         fscan->scan.plan.lefttree = NULL;
@@ -3345,7 +3535,7 @@ create_foreign_modify(EState *estate,
         fmstate->attinmeta = TupleDescGetAttInMetadata(tupdesc);
 
     /* Prepare for output conversion of parameters used in prepared stmt. */
-    n_params = list_length(fmstate->target_attrs) + 1;
+    n_params = list_length(fmstate->target_attrs) + 2;
     fmstate->p_flinfo = (FmgrInfo *) palloc0(sizeof(FmgrInfo) * n_params);
     fmstate->p_nums = 0;
 
@@ -3353,13 +3543,24 @@ create_foreign_modify(EState *estate,
     {
         Assert(subplan != NULL);
 
+        /* Find the remote tableoid resjunk column in the subplan's result */
+        fmstate->toidAttno = ExecFindJunkAttributeInTlist(subplan->targetlist,
+                                                          "tableoid");
+        if (!AttributeNumberIsValid(fmstate->toidAttno))
+            elog(ERROR, "could not find junk tableoid column");
+
+        /* First transmittable parameter will be table oid */
+        getTypeOutputInfo(OIDOID, &typefnoid, &isvarlena);
+        fmgr_info(typefnoid, &fmstate->p_flinfo[fmstate->p_nums]);
+        fmstate->p_nums++;
+
         /* Find the ctid resjunk column in the subplan's result */
         fmstate->ctidAttno = ExecFindJunkAttributeInTlist(subplan->targetlist,
                                                           "ctid");
         if (!AttributeNumberIsValid(fmstate->ctidAttno))
             elog(ERROR, "could not find junk ctid column");
 
-        /* First transmittable parameter will be ctid */
+        /* Second transmittable parameter will be ctid */
         getTypeOutputInfo(TIDOID, &typefnoid, &isvarlena);
         fmgr_info(typefnoid, &fmstate->p_flinfo[fmstate->p_nums]);
         fmstate->p_nums++;
@@ -3442,6 +3643,7 @@ prepare_foreign_modify(PgFdwModifyState *fmstate)
  */
 static const char **
 convert_prep_stmt_params(PgFdwModifyState *fmstate,
+                         Oid tableoid,
                          ItemPointer tupleid,
                          TupleTableSlot *slot)
 {
@@ -3453,10 +3655,15 @@ convert_prep_stmt_params(PgFdwModifyState *fmstate,
 
     p_values = (const char **) palloc(sizeof(char *) * fmstate->p_nums);
 
-    /* 1st parameter should be ctid, if it's in use */
-    if (tupleid != NULL)
+    /* First two parameters should be tableoid and ctid, if they're in use */
+    if (tableoid != InvalidOid)
     {
+        Assert(tupleid != NULL);
+
         /* don't need set_transmission_modes for TID output */
+        p_values[pindex] = OutputFunctionCall(&fmstate->p_flinfo[pindex],
+                                              ObjectIdGetDatum(tableoid));
+        pindex++;
         p_values[pindex] = OutputFunctionCall(&fmstate->p_flinfo[pindex],
                                               PointerGetDatum(tupleid));
         pindex++;
@@ -3685,8 +3892,8 @@ rebuild_fdw_scan_tlist(ForeignScan *fscan, List *tlist)
         new_tlist = lappend(new_tlist,
                             makeTargetEntry(tle->expr,
                                             list_length(new_tlist) + 1,
-                                            NULL,
-                                            false));
+                                            tle->resname,
+                                            tle->resjunk));
     }
     fscan->fdw_scan_tlist = new_tlist;
 }
@@ -5576,12 +5783,18 @@ make_tuple_from_result_row(PGresult *res,
      */
     oldcontext = MemoryContextSwitchTo(temp_context);
 
-    if (rel)
-        tupdesc = RelationGetDescr(rel);
+    /*
+     * If fdw_scan_tlist is provided for a base relation, use the tuple
+     * descriptor given by the planner.
+     */
+    if (!rel ||
+        (fsstate &&
+         castNode(ForeignScan, fsstate->ss.ps.plan)->fdw_scan_tlist != NULL))
+        tupdesc = fsstate->ss.ss_ScanTupleSlot->tts_tupleDescriptor;
     else
     {
-        Assert(fsstate);
-        tupdesc = fsstate->ss.ss_ScanTupleSlot->tts_tupleDescriptor;
+        Assert(rel);
+        tupdesc = RelationGetDescr(rel);
     }
 
     values = (Datum *) palloc0(tupdesc->natts * sizeof(Datum));
@@ -5623,7 +5836,7 @@ make_tuple_from_result_row(PGresult *res,
         errpos.cur_attno = i;
         if (i > 0)
         {
-            /* ordinary column */
+            /* ordinary column and tableoid */
             Assert(i <= tupdesc->natts);
             nulls[i - 1] = (valstr == NULL);
             /* Apply the input function even to nulls, to support domains */
-- 
2.16.3

From 3a70747cef5235fece5f48ff0c0988353f539e97 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Fri, 24 Aug 2018 16:17:35 +0900
Subject: [PATCH 4/4] Regtest change for PgFDW foreign update fix

---
 contrib/postgres_fdw/expected/postgres_fdw.out | 172 ++++++++++++-------------
 1 file changed, 86 insertions(+), 86 deletions(-)

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index dd4864f006..db19e206e6 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -5497,15 +5497,15 @@ INSERT INTO ft2 (c1,c2,c3)
   SELECT id, id % 10, to_char(id, 'FM00000') FROM generate_series(2001, 2010) id;
 EXPLAIN (verbose, costs off)
 UPDATE ft2 SET c3 = 'bar' WHERE postgres_fdw_abs(c1) > 2000 RETURNING *;            -- can't be pushed down
-                                                QUERY PLAN                                                
-----------------------------------------------------------------------------------------------------------
+                                                         QUERY PLAN
+----------------------------------------------------------------------------------------------------------------------------
  Update on public.ft2
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: UPDATE "S 1"."T 1" SET c3 = $2 WHERE ctid = $1 RETURNING "C 1", c2, c3, c4, c5, c6, c7, c8
+   Remote SQL: UPDATE "S 1"."T 1" SET c3 = $3 WHERE tableoid = $1 AND ctid = $2 RETURNING "C 1", c2, c3, c4, c5, c6, c7, c8
    ->  Foreign Scan on public.ft2
-         Output: c1, c2, NULL::integer, 'bar'::text, c4, c5, c6, c7, c8, ctid
+         Output: c1, c2, NULL::integer, 'bar'::text, c4, c5, c6, c7, c8, tableoid, ctid
          Filter: (postgres_fdw_abs(ft2.c1) > 2000)
-         Remote SQL: SELECT "C 1", c2, c4, c5, c6, c7, c8, ctid FROM "S 1"."T 1" FOR UPDATE
+         Remote SQL: SELECT "C 1", c2, c4, c5, c6, c7, c8, tableoid, ctid FROM "S 1"."T 1" FOR UPDATE
 (7 rows)
 
 UPDATE ft2 SET c3 = 'bar' WHERE postgres_fdw_abs(c1) > 2000 RETURNING *;
@@ -5532,13 +5532,13 @@ UPDATE ft2 SET c3 = 'baz'

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Update on public.ft2
    Output: ft2.c1, ft2.c2, ft2.c3, ft2.c4, ft2.c5, ft2.c6, ft2.c7, ft2.c8, ft4.c1, ft4.c2, ft4.c3, ft5.c1, ft5.c2, ft5.c3
-   Remote SQL: UPDATE "S 1"."T 1" SET c3 = $2 WHERE ctid = $1 RETURNING "C 1", c2, c3, c4, c5, c6, c7, c8
+   Remote SQL: UPDATE "S 1"."T 1" SET c3 = $3 WHERE tableoid = $1 AND ctid = $2 RETURNING "C 1", c2, c3, c4, c5, c6, c7, c8
    ->  Nested Loop
-         Output: ft2.c1, ft2.c2, NULL::integer, 'baz'::text, ft2.c4, ft2.c5, ft2.c6, ft2.c7, ft2.c8, ft2.ctid, ft4.*, ft5.*, ft4.c1, ft4.c2, ft4.c3, ft5.c1, ft5.c2, ft5.c3
+         Output: ft2.c1, ft2.c2, NULL::integer, 'baz'::text, ft2.c4, ft2.c5, ft2.c6, ft2.c7, ft2.c8, ft2.tableoid, ft2.ctid, ft4.*, ft5.*, ft4.c1, ft4.c2, ft4.c3, ft5.c1, ft5.c2, ft5.c3
          Join Filter: (ft2.c2 === ft4.c1)
          ->  Foreign Scan on public.ft2
-               Output: ft2.c1, ft2.c2, ft2.c4, ft2.c5, ft2.c6, ft2.c7, ft2.c8, ft2.ctid
-               Remote SQL: SELECT "C 1", c2, c4, c5, c6, c7, c8, ctid FROM "S 1"."T 1" WHERE (("C 1" > 2000)) FOR UPDATE
+               Output: ft2.c1, ft2.c2, ft2.c4, ft2.c5, ft2.c6, ft2.c7, ft2.c8, ft2.tableoid, ft2.ctid
+               Remote SQL: SELECT "C 1", c2, c4, c5, c6, c7, c8, tableoid, ctid FROM "S 1"."T 1" WHERE (("C 1" > 2000)) FOR UPDATE
          ->  Foreign Scan
                Output: ft4.*, ft4.c1, ft4.c2, ft4.c3, ft5.*, ft5.c1, ft5.c2, ft5.c3
                Relations: (public.ft4) INNER JOIN (public.ft5)
@@ -5570,24 +5570,24 @@ DELETE FROM ft2
   USING ft4 INNER JOIN ft5 ON (ft4.c1 === ft5.c1)
   WHERE ft2.c1 > 2000 AND ft2.c2 = ft4.c1
   RETURNING ft2.c1, ft2.c2, ft2.c3;       -- can't be pushed down
-                                             QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+                                                   QUERY PLAN
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Delete on public.ft2
    Output: ft2.c1, ft2.c2, ft2.c3
-   Remote SQL: DELETE FROM "S 1"."T 1" WHERE ctid = $1 RETURNING "C 1", c2, c3
+   Remote SQL: DELETE FROM "S 1"."T 1" WHERE tableoid = $1 AND ctid = $2 RETURNING "C 1", c2, c3
    ->  Foreign Scan
-         Output: ft2.ctid, ft4.*, ft5.*
+         Output: ft2.tableoid, ft2.ctid, ft4.*, ft5.*
          Filter: (ft4.c1 === ft5.c1)
          Relations: ((public.ft2) INNER JOIN (public.ft4)) INNER JOIN (public.ft5)
-         Remote SQL: SELECT r1.ctid, CASE WHEN (r2.*)::text IS NOT NULL THEN ROW(r2.c1, r2.c2, r2.c3) END, CASE WHEN (r3.*)::text IS NOT NULL THEN ROW(r3.c1, r3.c2, r3.c3) END, r2.c1, r3.c1 FROM (("S 1"."T 1" r1 INNER JOIN "S 1"."T 3" r2 ON (((r1.c2 = r2.c1)) AND ((r1."C 1" > 2000)))) INNER JOIN "S 1"."T 4" r3 ON (TRUE)) FOR UPDATE OF r1
+         Remote SQL: SELECT r1.tableoid, r1.ctid, CASE WHEN (r2.*)::text IS NOT NULL THEN ROW(r2.c1, r2.c2, r2.c3) END, CASE WHEN (r3.*)::text IS NOT NULL THEN ROW(r3.c1, r3.c2, r3.c3) END, r2.c1, r3.c1 FROM (("S 1"."T 1" r1 INNER JOIN "S 1"."T 3" r2 ON (((r1.c2 = r2.c1)) AND ((r1."C 1" > 2000)))) INNER JOIN "S 1"."T 4" r3 ON (TRUE)) FOR UPDATE OF r1
          ->  Nested Loop
-               Output: ft2.ctid, ft4.*, ft5.*, ft4.c1, ft5.c1
+               Output: ft2.tableoid, ft2.ctid, ft4.*, ft5.*, ft4.c1, ft5.c1
                ->  Nested Loop
-                     Output: ft2.ctid, ft4.*, ft4.c1
+                     Output: ft2.tableoid, ft2.ctid, ft4.*, ft4.c1
                      Join Filter: (ft2.c2 = ft4.c1)
                      ->  Foreign Scan on public.ft2
-                           Output: ft2.ctid, ft2.c2
-                           Remote SQL: SELECT c2, ctid FROM "S 1"."T 1" WHERE (("C 1" > 2000)) FOR UPDATE
+                           Output: ft2.tableoid, ft2.ctid, ft2.c2
+                           Remote SQL: SELECT "C 1", c2, tableoid, ctid FROM "S 1"."T 1" WHERE (("C 1" > 2000)) FOR UPDATE
                      ->  Foreign Scan on public.ft4
                            Output: ft4.*, ft4.c1
                            Remote SQL: SELECT c1, c2, c3 FROM "S 1"."T 3"
@@ -6229,13 +6229,13 @@ SELECT * FROM foreign_tbl;
 
 EXPLAIN (VERBOSE, COSTS OFF)
 UPDATE rw_view SET b = b + 5;
-                                      QUERY PLAN                                       
----------------------------------------------------------------------------------------
+                                            QUERY PLAN
+--------------------------------------------------------------------------------------------------
  Update on public.foreign_tbl
-   Remote SQL: UPDATE public.base_tbl SET b = $2 WHERE ctid = $1 RETURNING a, b
+   Remote SQL: UPDATE public.base_tbl SET b = $3 WHERE tableoid = $1 AND ctid = $2 RETURNING a, b
    ->  Foreign Scan on public.foreign_tbl
-         Output: foreign_tbl.a, (foreign_tbl.b + 5), foreign_tbl.ctid
-         Remote SQL: SELECT a, b, ctid FROM public.base_tbl WHERE ((a < b)) FOR UPDATE
+         Output: foreign_tbl.a, (foreign_tbl.b + 5), foreign_tbl.tableoid, foreign_tbl.ctid
+         Remote SQL: SELECT a, b, tableoid, ctid FROM public.base_tbl WHERE ((a < b)) FOR UPDATE
 (5 rows)
 
 UPDATE rw_view SET b = b + 5; -- should fail
@@ -6243,13 +6243,13 @@ ERROR:  new row violates check option for view "rw_view"
 DETAIL:  Failing row contains (20, 20).
 EXPLAIN (VERBOSE, COSTS OFF)
 UPDATE rw_view SET b = b + 15;
-                                      QUERY PLAN                                       
----------------------------------------------------------------------------------------
+                                            QUERY PLAN                                            
+--------------------------------------------------------------------------------------------------
  Update on public.foreign_tbl
-   Remote SQL: UPDATE public.base_tbl SET b = $2 WHERE ctid = $1 RETURNING a, b
+   Remote SQL: UPDATE public.base_tbl SET b = $3 WHERE tableoid = $1 AND ctid = $2 RETURNING a, b
    ->  Foreign Scan on public.foreign_tbl
-         Output: foreign_tbl.a, (foreign_tbl.b + 15), foreign_tbl.ctid
-         Remote SQL: SELECT a, b, ctid FROM public.base_tbl WHERE ((a < b)) FOR UPDATE
+         Output: foreign_tbl.a, (foreign_tbl.b + 15), foreign_tbl.tableoid, foreign_tbl.ctid
+         Remote SQL: SELECT a, b, tableoid, ctid FROM public.base_tbl WHERE ((a < b)) FOR UPDATE
 (5 rows)
 
 UPDATE rw_view SET b = b + 15; -- ok
@@ -6316,14 +6316,14 @@ SELECT * FROM foreign_tbl;
 
 EXPLAIN (VERBOSE, COSTS OFF)
 UPDATE rw_view SET b = b + 5;
-                                       QUERY PLAN                                       
-----------------------------------------------------------------------------------------
+                                             QUERY PLAN                                              
+-----------------------------------------------------------------------------------------------------
  Update on public.parent_tbl
    Foreign Update on public.foreign_tbl
-     Remote SQL: UPDATE public.child_tbl SET b = $2 WHERE ctid = $1 RETURNING a, b
+     Remote SQL: UPDATE public.child_tbl SET b = $3 WHERE tableoid = $1 AND ctid = $2 RETURNING a, b
    ->  Foreign Scan on public.foreign_tbl
-         Output: foreign_tbl.a, (foreign_tbl.b + 5), foreign_tbl.ctid
-         Remote SQL: SELECT a, b, ctid FROM public.child_tbl WHERE ((a < b)) FOR UPDATE
+         Output: foreign_tbl.a, (foreign_tbl.b + 5), foreign_tbl.tableoid, foreign_tbl.ctid
+         Remote SQL: SELECT a, b, tableoid, ctid FROM public.child_tbl WHERE ((a < b)) FOR UPDATE
 (6 rows)
 
 UPDATE rw_view SET b = b + 5; -- should fail
@@ -6331,14 +6331,14 @@ ERROR:  new row violates check option for view "rw_view"
 DETAIL:  Failing row contains (20, 20).
 EXPLAIN (VERBOSE, COSTS OFF)
 UPDATE rw_view SET b = b + 15;
-                                       QUERY PLAN                                       
-----------------------------------------------------------------------------------------
+                                             QUERY PLAN                                              
+-----------------------------------------------------------------------------------------------------
  Update on public.parent_tbl
    Foreign Update on public.foreign_tbl
-     Remote SQL: UPDATE public.child_tbl SET b = $2 WHERE ctid = $1 RETURNING a, b
+     Remote SQL: UPDATE public.child_tbl SET b = $3 WHERE tableoid = $1 AND ctid = $2 RETURNING a, b
    ->  Foreign Scan on public.foreign_tbl
-         Output: foreign_tbl.a, (foreign_tbl.b + 15), foreign_tbl.ctid
-         Remote SQL: SELECT a, b, ctid FROM public.child_tbl WHERE ((a < b)) FOR UPDATE
+         Output: foreign_tbl.a, (foreign_tbl.b + 15), foreign_tbl.tableoid, foreign_tbl.ctid
+         Remote SQL: SELECT a, b, tableoid, ctid FROM public.child_tbl WHERE ((a < b)) FOR UPDATE
 (6 rows)
 
 UPDATE rw_view SET b = b + 15; -- ok
@@ -6808,13 +6808,13 @@ BEFORE UPDATE ON rem1
 FOR EACH ROW EXECUTE PROCEDURE trigger_data(23,'skidoo');
 EXPLAIN (verbose, costs off)
 UPDATE rem1 set f2 = '';          -- can't be pushed down
-                             QUERY PLAN                              
----------------------------------------------------------------------
+                                       QUERY PLAN                                       
+----------------------------------------------------------------------------------------
  Update on public.rem1
-   Remote SQL: UPDATE public.loc1 SET f2 = $2 WHERE ctid = $1
+   Remote SQL: UPDATE public.loc1 SET f2 = $3 WHERE tableoid = $1 AND ctid = $2
    ->  Foreign Scan on public.rem1
-         Output: f1, ''::text, ctid, rem1.*
-         Remote SQL: SELECT f1, f2, ctid FROM public.loc1 FOR UPDATE
+         Output: f1, ''::text, tableoid, ctid, rem1.*
+         Remote SQL: SELECT f1, tableoid, ctid, ROW(f1, f2) FROM public.loc1 FOR UPDATE
 (5 rows)
 
 EXPLAIN (verbose, costs off)
@@ -6832,13 +6832,13 @@ AFTER UPDATE ON rem1
 FOR EACH ROW EXECUTE PROCEDURE trigger_data(23,'skidoo');
 EXPLAIN (verbose, costs off)
 UPDATE rem1 set f2 = '';          -- can't be pushed down
-                                  QUERY PLAN                                   
--------------------------------------------------------------------------------
+                                           QUERY PLAN                                            
+-------------------------------------------------------------------------------------------------
  Update on public.rem1
-   Remote SQL: UPDATE public.loc1 SET f2 = $2 WHERE ctid = $1 RETURNING f1, f2
+   Remote SQL: UPDATE public.loc1 SET f2 = $3 WHERE tableoid = $1 AND ctid = $2 RETURNING f1, f2
    ->  Foreign Scan on public.rem1
-         Output: f1, ''::text, ctid, rem1.*
-         Remote SQL: SELECT f1, f2, ctid FROM public.loc1 FOR UPDATE
+         Output: f1, ''::text, tableoid, ctid, rem1.*
+         Remote SQL: SELECT f1, tableoid, ctid, ROW(f1, f2) FROM public.loc1 FOR UPDATE
 (5 rows)
 
 EXPLAIN (verbose, costs off)
@@ -6866,13 +6866,13 @@ UPDATE rem1 set f2 = '';          -- can be pushed down
 
 EXPLAIN (verbose, costs off)
 DELETE FROM rem1;                 -- can't be pushed down
-                             QUERY PLAN                              
----------------------------------------------------------------------
+                                     QUERY PLAN                                     
+------------------------------------------------------------------------------------
  Delete on public.rem1
-   Remote SQL: DELETE FROM public.loc1 WHERE ctid = $1
+   Remote SQL: DELETE FROM public.loc1 WHERE tableoid = $1 AND ctid = $2
    ->  Foreign Scan on public.rem1
-         Output: ctid, rem1.*
-         Remote SQL: SELECT f1, f2, ctid FROM public.loc1 FOR UPDATE
+         Output: tableoid, ctid, rem1.*
+         Remote SQL: SELECT tableoid, ctid, ROW(f1, f2) FROM public.loc1 FOR UPDATE
 (5 rows)
 
 DROP TRIGGER trig_row_before_delete ON rem1;
@@ -6890,13 +6890,13 @@ UPDATE rem1 set f2 = '';          -- can be pushed down
 
 EXPLAIN (verbose, costs off)
 DELETE FROM rem1;                 -- can't be pushed down
-                               QUERY PLAN                               
-------------------------------------------------------------------------
+                                        QUERY PLAN                                        
+------------------------------------------------------------------------------------------
  Delete on public.rem1
-   Remote SQL: DELETE FROM public.loc1 WHERE ctid = $1 RETURNING f1, f2
+   Remote SQL: DELETE FROM public.loc1 WHERE tableoid = $1 AND ctid = $2 RETURNING f1, f2
    ->  Foreign Scan on public.rem1
-         Output: ctid, rem1.*
-         Remote SQL: SELECT f1, f2, ctid FROM public.loc1 FOR UPDATE
+         Output: tableoid, ctid, rem1.*
+         Remote SQL: SELECT tableoid, ctid, ROW(f1, f2) FROM public.loc1 FOR UPDATE
 (5 rows)
 
 DROP TRIGGER trig_row_after_delete ON rem1;
@@ -7147,12 +7147,12 @@ select * from bar where f1 in (select f1 from foo) for share;
 -- Check UPDATE with inherited target and an inherited source table
 explain (verbose, costs off)
 update bar set f2 = f2 + 100 where f1 in (select f1 from foo);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                               QUERY PLAN                                               
+--------------------------------------------------------------------------------------------------------
  Update on public.bar
    Update on public.bar
    Foreign Update on public.bar2
-     Remote SQL: UPDATE public.loct2 SET f2 = $2 WHERE ctid = $1
+     Remote SQL: UPDATE public.loct2 SET f2 = $3 WHERE tableoid = $1 AND ctid = $2
    ->  Hash Join
          Output: bar.f1, (bar.f2 + 100), bar.ctid, foo.ctid, foo.*, foo.tableoid
          Inner Unique: true
@@ -7171,12 +7171,12 @@ update bar set f2 = f2 + 100 where f1 in (select f1 from foo);
                                  Output: foo2.ctid, foo2.*, foo2.tableoid, foo2.f1
                                  Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct1
    ->  Hash Join
-         Output: bar2.f1, (bar2.f2 + 100), bar2.f3, bar2.ctid, foo.ctid, foo.*, foo.tableoid
+         Output: bar2.f1, (bar2.f2 + 100), bar2.f3, bar2.ctid, bar2.ctid, foo.ctid, foo.*, foo.tableoid
          Inner Unique: true
          Hash Cond: (bar2.f1 = foo.f1)
          ->  Foreign Scan on public.bar2
-               Output: bar2.f1, bar2.f2, bar2.f3, bar2.ctid
-               Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct2 FOR UPDATE
+               Output: bar2.f1, bar2.f2, bar2.f3, bar2.ctid, bar2.ctid
+               Remote SQL: SELECT f1, f2, f3, tableoid, ctid FROM public.loct2 FOR UPDATE
          ->  Hash
                Output: foo.ctid, foo.*, foo.tableoid, foo.f1
                ->  HashAggregate
@@ -7208,12 +7208,12 @@ update bar set f2 = f2 + 100
 from
   ( select f1 from foo union all select f1+3 from foo ) ss
 where bar.f1 = ss.f1;
-                                      QUERY PLAN                                      
---------------------------------------------------------------------------------------
+                                           QUERY PLAN                                           
+------------------------------------------------------------------------------------------------
  Update on public.bar
    Update on public.bar
    Foreign Update on public.bar2
-     Remote SQL: UPDATE public.loct2 SET f2 = $2 WHERE ctid = $1
+     Remote SQL: UPDATE public.loct2 SET f2 = $3 WHERE tableoid = $1 AND ctid = $2
    ->  Hash Join
          Output: bar.f1, (bar.f2 + 100), bar.ctid, (ROW(foo.f1))
          Hash Cond: (foo.f1 = bar.f1)
@@ -7233,14 +7233,14 @@ where bar.f1 = ss.f1;
                ->  Seq Scan on public.bar
                      Output: bar.f1, bar.f2, bar.ctid
    ->  Merge Join
-         Output: bar2.f1, (bar2.f2 + 100), bar2.f3, bar2.ctid, (ROW(foo.f1))
+         Output: bar2.f1, (bar2.f2 + 100), bar2.f3, bar2.ctid, bar2.ctid, (ROW(foo.f1))
          Merge Cond: (bar2.f1 = foo.f1)
          ->  Sort
-               Output: bar2.f1, bar2.f2, bar2.f3, bar2.ctid
+               Output: bar2.f1, bar2.f2, bar2.f3, bar2.ctid, bar2.ctid
                Sort Key: bar2.f1
                ->  Foreign Scan on public.bar2
-                     Output: bar2.f1, bar2.f2, bar2.f3, bar2.ctid
-                     Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct2 FOR UPDATE
+                     Output: bar2.f1, bar2.f2, bar2.f3, bar2.ctid, bar2.ctid
+                     Remote SQL: SELECT f1, f2, f3, tableoid, ctid FROM public.loct2 FOR UPDATE
          ->  Sort
                Output: (ROW(foo.f1)), foo.f1
                Sort Key: foo.f1
@@ -7438,17 +7438,17 @@ AFTER UPDATE OR DELETE ON bar2
 FOR EACH ROW EXECUTE PROCEDURE trigger_data(23,'skidoo');
 explain (verbose, costs off)
 update bar set f2 = f2 + 100;
-                                      QUERY PLAN                                      
---------------------------------------------------------------------------------------
+                                               QUERY PLAN                                               
+--------------------------------------------------------------------------------------------------------
  Update on public.bar
    Update on public.bar
    Foreign Update on public.bar2
-     Remote SQL: UPDATE public.loct2 SET f2 = $2 WHERE ctid = $1 RETURNING f1, f2, f3
+     Remote SQL: UPDATE public.loct2 SET f2 = $3 WHERE tableoid = $1 AND ctid = $2 RETURNING f1, f2, f3
    ->  Seq Scan on public.bar
          Output: bar.f1, (bar.f2 + 100), bar.ctid
    ->  Foreign Scan on public.bar2
-         Output: bar2.f1, (bar2.f2 + 100), bar2.f3, bar2.ctid, bar2.*
-         Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct2 FOR UPDATE
+         Output: bar2.f1, (bar2.f2 + 100), bar2.f3, bar2.ctid, bar2.ctid, bar2.*
+         Remote SQL: SELECT f1, f2, f3, tableoid, ctid, ROW(f1, f2, f3) FROM public.loct2 FOR UPDATE
 (9 rows)
 
 update bar set f2 = f2 + 100;
@@ -7466,18 +7466,18 @@ NOTICE:  trig_row_after(23, skidoo) AFTER ROW UPDATE ON bar2
 NOTICE:  OLD: (7,277,77),NEW: (7,377,77)
 explain (verbose, costs off)
 delete from bar where f2 < 400;
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                   QUERY PLAN                                                   
+----------------------------------------------------------------------------------------------------------------
  Delete on public.bar
    Delete on public.bar
    Foreign Delete on public.bar2
-     Remote SQL: DELETE FROM public.loct2 WHERE ctid = $1 RETURNING f1, f2, f3
+     Remote SQL: DELETE FROM public.loct2 WHERE tableoid = $1 AND ctid = $2 RETURNING f1, f2, f3
    ->  Seq Scan on public.bar
          Output: bar.ctid
          Filter: (bar.f2 < 400)
    ->  Foreign Scan on public.bar2
-         Output: bar2.ctid, bar2.*
-         Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct2 WHERE ((f2 < 400)) FOR UPDATE
+         Output: bar2.ctid, bar2.ctid, bar2.*
+         Remote SQL: SELECT f2, tableoid, ctid, ROW(f1, f2, f3) FROM public.loct2 WHERE ((f2 < 400)) FOR UPDATE
 (10 rows)
 
 delete from bar where f2 < 400;
@@ -7591,14 +7591,14 @@ SELECT tableoid::int - (SELECT min(tableoid) FROM fp1)::int AS toiddiff, ctid, *
 -- random() causes non-direct foreign update
 EXPLAIN (VERBOSE, COSTS OFF)
      UPDATE fp1 SET b = b + 1 WHERE a = 0 and random() <= 1;
-                                   QUERY PLAN                                    
----------------------------------------------------------------------------------
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
  Update on public.fp1
-   Remote SQL: UPDATE public.p1 SET b = $2 WHERE ctid = $1
+   Remote SQL: UPDATE public.p1 SET b = $3 WHERE tableoid = $1 AND ctid = $2
    ->  Foreign Scan on public.fp1
-         Output: a, (b + 1), ctid
+         Output: a, (b + 1), tableoid, ctid
          Filter: (random() <= '1'::double precision)
-         Remote SQL: SELECT a, b, ctid FROM public.p1 WHERE ((a = 0)) FOR UPDATE
+         Remote SQL: SELECT a, b, tableoid, ctid FROM public.p1 WHERE ((a = 0)) FOR UPDATE
 (6 rows)
 
 UPDATE fp1 SET b = b + 1 WHERE a = 0 and random() <= 1;
-- 
2.16.3
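
The common thread in the expected-output changes above is the parameter layout of the remote statement: the remote tableoid becomes $1, the ctid becomes $2, and the SET parameters start at $3. A minimal sketch of that numbering follows. It assumes a hypothetical standalone helper, build_remote_update, that is not part of postgres_fdw and formats into a plain character buffer rather than a StringInfo.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of postgres_fdw: formats a remote UPDATE
 * statement with the parameter layout shown in the patched plans.
 * $1 = remote tableoid, $2 = remote ctid, SET parameters start at $3.
 * Returns 0 on success, -1 if the buffer is too small.
 */
int
build_remote_update(char *buf, size_t buflen, const char *relname,
                    const char *cols[], int ncols)
{
    size_t      used;
    int         pindex = 3;     /* tableoid and ctid always precede */
    int         n;
    int         i;

    assert(buf != NULL && relname != NULL && ncols > 0);

    n = snprintf(buf, buflen, "UPDATE %s SET ", relname);
    if (n < 0 || (size_t) n >= buflen)
        return -1;
    used = (size_t) n;

    for (i = 0; i < ncols; i++)
    {
        /* Separate assignments with ", " and number them from $3 upward. */
        n = snprintf(buf + used, buflen - used, "%s%s = $%d",
                     i > 0 ? ", " : "", cols[i], pindex++);
        if (n < 0 || (size_t) n >= buflen - used)
            return -1;
        used += (size_t) n;
    }

    /* The row is identified by both the remote tableoid and the ctid. */
    n = snprintf(buf + used, buflen - used,
                 " WHERE tableoid = $1 AND ctid = $2");
    if (n < 0 || (size_t) n >= buflen - used)
        return -1;
    return 0;
}
```

Called with the relation name "public.p1" and the single column "b", this produces the same statement text as the patched fp1 plan: UPDATE public.p1 SET b = $3 WHERE tableoid = $1 AND ctid = $2.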


Re: Problem while updating a foreign table pointing to a partitioned table on foreign server

From
Kyotaro HORIGUCHI
Date:
Sorry, I sent an older version, which is logically the same but contains
some whitespace problems. I am resending only 0003 with this mail.

At Fri, 24 Aug 2018 16:51:31 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote in
<20180824.165131.45788857.horiguchi.kyotaro@lab.ntt.co.jp>
> Hello.
> 
> At Tue, 21 Aug 2018 11:01:32 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote in <20180821.110132.261184472.horiguchi.kyotaro@lab.ntt.co.jp>
 
> > > You wrote:
> > > >    Several places seem to assume that fdw_scan_tlist may be
> > > >    used for a foreign scan on a simple relation, but I didn't
> > > >    find that actually happening.
> > > 
> > > Yeah, currently, postgres_fdw and file_fdw don't use that list for
> > > simple foreign table scans, but it could be used to improve the
> > > efficiency for those scans, as explained in fdwhandler.sgml:
> ...
> > I'll put more consideration on using fdw_scan_tlist in the
> > documented way.
> 
> Done. postgres_fdw now generates a full fdw_scan_tlist (as
> documented) for foreign relations with junk columns, but a small
> change in core was needed. However, it is far less invasive than
> the previous version, and I believe that it doesn't harm any
> possibly existing use of fdw_scan_tlist on non-join rels.
> 
> The previous patch didn't show "tableoid" in the Output list of the
> EXPLAIN output (it appeared as "<added_junk>"), but this one shows it
> correctly by referring to rte->eref->colnames. I believe no other FDW
> expands the foreign relation this way, even if it uses fdw_scan_tlist
> for a ForeignScan on a base relation, so it won't harm them.
> 
> Since this uses fdw_scan_tlist, it is theoretically
> back-patchable to 9.6. This patch applies on top of the
> current master.
> 
> Please find the attached four files.
> 
> 0001-Add-test-for-postgres_fdw-foreign-parition-update.patch
> 
>  This should fail for unpatched postgres_fdw. (Just for demonstration)
> 
> 0002-Core-side-modification-for-PgFDW-foreign-update-fix.patch
> 
>  Core side change which allows fdw_scan_tlist to have extra
>  columns that are not defined in the base relation.
> 
> 0003-Fix-of-foreign-update-bug-of-PgFDW.patch
> 
>  Fix of postgres_fdw for this problem.
> 
> 0004-Regtest-change-for-PgFDW-foreign-update-fix.patch
> 
>  Regression test change separated for readability.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
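
As a rough illustration of the problem the following patch addresses, the sketch below models remote rows as plain structs and compares matching on ctid alone with matching on (tableoid, ctid). Everything in it is an assumption made for illustration: the RemoteRow type, both function names, and the OID values are invented, and the toy model leaves the ctid unchanged after an update, which a real PostgreSQL UPDATE would not.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a remote row: the partition holding it, its ctid, its data. */
typedef struct RemoteRow
{
    unsigned    tableoid;       /* OID of the remote partition (made up) */
    unsigned    block;          /* ctid block number */
    unsigned    offset;         /* ctid offset number */
    int         a;
    int         b;
} RemoteRow;

/* Unpatched behaviour: rows are matched on ctid only. Returns rows updated. */
int
update_by_ctid(RemoteRow *rows, size_t nrows,
               unsigned block, unsigned offset, int new_b)
{
    int         updated = 0;

    for (size_t i = 0; i < nrows; i++)
    {
        if (rows[i].block == block && rows[i].offset == offset)
        {
            rows[i].b = new_b;
            updated++;
        }
    }
    return updated;
}

/* Patched behaviour: rows are matched on (tableoid, ctid). */
int
update_by_tableoid_ctid(RemoteRow *rows, size_t nrows, unsigned tableoid,
                        unsigned block, unsigned offset, int new_b)
{
    int         updated = 0;

    for (size_t i = 0; i < nrows; i++)
    {
        if (rows[i].tableoid == tableoid &&
            rows[i].block == block && rows[i].offset == offset)
        {
            rows[i].b = new_b;
            updated++;
        }
    }
    return updated;
}
```

With two rows that live in different partitions but share ctid (0,1), update_by_ctid changes both rows, while update_by_tableoid_ctid changes only the row in the named partition.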
From b95571ac7cf15101bfa045354a82befe074ecc55 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Fri, 24 Aug 2018 16:17:24 +0900
Subject: [PATCH 3/4] Fix of foreign update bug of PgFDW

postgres_fdw misbehaves when updating a foreign table that points to a
remote partitioned table if direct modify is not used. This is because
postgres_fdw overlooks that two different tuples with the same ctid can
be returned in that case. With this patch, it uses the remote tableoid
in addition to ctid to identify a remote tuple.
---
 contrib/postgres_fdw/deparse.c      | 149 +++++++++++-------
 contrib/postgres_fdw/postgres_fdw.c | 291 +++++++++++++++++++++++++++++++-----
 2 files changed, 344 insertions(+), 96 deletions(-)

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 6001f4d25e..c4cd6a7249 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -1037,6 +1037,15 @@ deparseSelectSql(List *tlist, bool is_subquery, List **retrieved_attrs,
          */
         deparseExplicitTargetList(tlist, false, retrieved_attrs, context);
     }
+    else if (tlist != NIL)
+    {
+        /*
+         * The given tlist is that of base relation's expanded with junk
+         * columns.
+         */
+        context->params_list = NULL;
+        deparseExplicitTargetList(tlist, false, retrieved_attrs, context);
+    }
     else
     {
         /*
@@ -1088,6 +1097,42 @@ deparseFromExpr(List *quals, deparse_expr_cxt *context)
     }
 }
 
+/*
+ * Adds one element in target/returning list if it is in attrs_used.
+ *
+ * If deparsestr is given, just use it. Otherwise resolves the name using rte.
+ */
+static inline void
+deparseAddTargetListItem(StringInfo buf,
+                         List **retrieved_attrs, Bitmapset *attrs_used,
+                         Index rtindex, AttrNumber attnum,
+                         char *deparsestr, RangeTblEntry *rte,
+                         bool is_returning, bool qualify_col,
+                         bool have_wholerow, bool *first)
+{
+    if (!have_wholerow &&
+        !bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber, attrs_used))
+        return;
+
+    if (!*first)
+        appendStringInfoString(buf, ", ");
+    else if (is_returning)
+        appendStringInfoString(buf, " RETURNING ");
+    *first = false;
+
+    if (deparsestr)
+    {
+        if (qualify_col)
+            ADD_REL_QUALIFIER(buf, rtindex);
+
+        appendStringInfoString(buf, deparsestr);
+    }
+    else
+        deparseColumnRef(buf, rtindex, attnum, rte, qualify_col);
+
+    *retrieved_attrs = lappend_int(*retrieved_attrs, attnum);
+}
+
 /*
  * Emit a target list that retrieves the columns specified in attrs_used.
  * This is used for both SELECT and RETURNING targetlists; the is_returning
@@ -1128,58 +1173,28 @@ deparseTargetList(StringInfo buf,
         if (attr->attisdropped)
             continue;
 
-        if (have_wholerow ||
-            bms_is_member(i - FirstLowInvalidHeapAttributeNumber,
-                          attrs_used))
-        {
-            if (!first)
-                appendStringInfoString(buf, ", ");
-            else if (is_returning)
-                appendStringInfoString(buf, " RETURNING ");
-            first = false;
-
-            deparseColumnRef(buf, rtindex, i, rte, qualify_col);
-
-            *retrieved_attrs = lappend_int(*retrieved_attrs, i);
-        }
+        deparseAddTargetListItem(buf, retrieved_attrs, attrs_used,
+                                 rtindex, i, NULL, rte,
+                                 is_returning, qualify_col, have_wholerow,
+                                 &first);
     }
 
     /*
-     * Add ctid and oid if needed.  We currently don't support retrieving any
-     * other system columns.
+     * Add ctid, oid and tableoid if needed. The attribute name and number are
+     * assigned in postgresAddForeignUpdateTargets. We currently don't support
+     * retrieving any other system columns.
      */
-    if (bms_is_member(SelfItemPointerAttributeNumber - FirstLowInvalidHeapAttributeNumber,
-                      attrs_used))
-    {
-        if (!first)
-            appendStringInfoString(buf, ", ");
-        else if (is_returning)
-            appendStringInfoString(buf, " RETURNING ");
-        first = false;
+    deparseAddTargetListItem(buf, retrieved_attrs, attrs_used,
+                             rtindex, tupdesc->natts + 1, "tableoid",
+                             NULL, is_returning, qualify_col, false, &first);
 
-        if (qualify_col)
-            ADD_REL_QUALIFIER(buf, rtindex);
-        appendStringInfoString(buf, "ctid");
+    deparseAddTargetListItem(buf, retrieved_attrs, attrs_used,
+                             rtindex, SelfItemPointerAttributeNumber, "ctid",
+                             NULL, is_returning, qualify_col, false, &first);
 
-        *retrieved_attrs = lappend_int(*retrieved_attrs,
-                                       SelfItemPointerAttributeNumber);
-    }
-    if (bms_is_member(ObjectIdAttributeNumber - FirstLowInvalidHeapAttributeNumber,
-                      attrs_used))
-    {
-        if (!first)
-            appendStringInfoString(buf, ", ");
-        else if (is_returning)
-            appendStringInfoString(buf, " RETURNING ");
-        first = false;
-
-        if (qualify_col)
-            ADD_REL_QUALIFIER(buf, rtindex);
-        appendStringInfoString(buf, "oid");
-
-        *retrieved_attrs = lappend_int(*retrieved_attrs,
-                                       ObjectIdAttributeNumber);
-    }
+    deparseAddTargetListItem(buf, retrieved_attrs, attrs_used,
+                             rtindex, ObjectIdAttributeNumber, "oid",
+                             NULL, is_returning, qualify_col, false, &first);
 
     /* Don't generate bad syntax if no undropped columns */
     if (first && !is_returning)
@@ -1728,7 +1743,7 @@ deparseUpdateSql(StringInfo buf, RangeTblEntry *rte,
     deparseRelation(buf, rel);
     appendStringInfoString(buf, " SET ");
 
-    pindex = 2;                    /* ctid is always the first param */
+    pindex = 3;                    /* tableoid and ctid always precede */
     first = true;
     foreach(lc, targetAttrs)
     {
@@ -1742,7 +1757,7 @@ deparseUpdateSql(StringInfo buf, RangeTblEntry *rte,
         appendStringInfo(buf, " = $%d", pindex);
         pindex++;
     }
-    appendStringInfoString(buf, " WHERE ctid = $1");
+    appendStringInfoString(buf, " WHERE tableoid = $1 AND ctid = $2");
 
     deparseReturningList(buf, rte, rtindex, rel,
                          rel->trigdesc && rel->trigdesc->trig_update_after_row,
@@ -1858,7 +1873,7 @@ deparseDeleteSql(StringInfo buf, RangeTblEntry *rte,
 {
     appendStringInfoString(buf, "DELETE FROM ");
     deparseRelation(buf, rel);
-    appendStringInfoString(buf, " WHERE ctid = $1");
+    appendStringInfoString(buf, " WHERE tableoid = $1 AND ctid = $2");
 
     deparseReturningList(buf, rte, rtindex, rel,
                          rel->trigdesc && rel->trigdesc->trig_delete_after_row,
@@ -2160,9 +2175,11 @@ deparseColumnRef(StringInfo buf, int varno, int varattno, RangeTblEntry *rte,
     }
     else
     {
-        char       *colname = NULL;
+        char *colname = NULL;
         List       *options;
         ListCell   *lc;
+        Relation rel;
+        int natts;
 
         /* varno must not be any of OUTER_VAR, INNER_VAR and INDEX_VAR. */
         Assert(!IS_SPECIAL_VARNO(varno));
@@ -2171,16 +2188,34 @@ deparseColumnRef(StringInfo buf, int varno, int varattno, RangeTblEntry *rte,
          * If it's a column of a foreign table, and it has the column_name FDW
          * option, use that value.
          */
-        options = GetForeignColumnOptions(rte->relid, varattno);
-        foreach(lc, options)
-        {
-            DefElem    *def = (DefElem *) lfirst(lc);
+        rel = heap_open(rte->relid, NoLock);
+        natts = RelationGetNumberOfAttributes(rel);
+        heap_close(rel, NoLock);
 
-            if (strcmp(def->defname, "column_name") == 0)
+        if (rte->relkind == RELKIND_FOREIGN_TABLE)
+        {
+            if (varattno > 0 && varattno <= natts)
             {
-                colname = defGetString(def);
-                break;
+                options = GetForeignColumnOptions(rte->relid, varattno);
+                foreach(lc, options)
+                {
+                    DefElem    *def = (DefElem *) lfirst(lc);
+
+                    if (strcmp(def->defname, "column_name") == 0)
+                    {
+                        colname = defGetString(def);
+                        break;
+                    }
+                }
             }
+            else if (varattno == natts + 1)
+            {
+                /* This should be an additional junk column */
+                colname = "tableoid";
+            }
+            else
+                elog(ERROR, "name resolution failed for attribute %d of relation %u",
+                     varattno, rte->relid);
         }
 
         /*
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 0803c30a48..babf5a49d4 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -179,6 +179,7 @@ typedef struct PgFdwModifyState
 
     /* info about parameters for prepared statement */
     AttrNumber    ctidAttno;        /* attnum of input resjunk ctid column */
+    AttrNumber    toidAttno;        /* attnum of input resjunk tableoid column */
     int            p_nums;            /* number of parameters to transmit */
     FmgrInfo   *p_flinfo;        /* output conversion functions for them */
 
@@ -283,6 +284,12 @@ static void postgresGetForeignRelSize(PlannerInfo *root,
 static void postgresGetForeignPaths(PlannerInfo *root,
                         RelOptInfo *baserel,
                         Oid foreigntableid);
+static List *generate_scan_tlist_for_relation(PlannerInfo *root,
+                                              RelOptInfo *foreignrel,
+                                              Oid foreigntableoid,
+                                              PgFdwRelationInfo *fpinfo,
+                                              List *tlist,
+                                              List *recheck_quals);
 static ForeignScan *postgresGetForeignPlan(PlannerInfo *root,
                        RelOptInfo *foreignrel,
                        Oid foreigntableid,
@@ -392,6 +399,7 @@ static PgFdwModifyState *create_foreign_modify(EState *estate,
                       List *retrieved_attrs);
 static void prepare_foreign_modify(PgFdwModifyState *fmstate);
 static const char **convert_prep_stmt_params(PgFdwModifyState *fmstate,
+                         Oid tableoid,
                          ItemPointer tupleid,
                          TupleTableSlot *slot);
 static void store_returning_result(PgFdwModifyState *fmstate,
@@ -1117,6 +1125,109 @@ postgresGetForeignPaths(PlannerInfo *root,
     }
 }
 
+/*
+ * generate_scan_tlist_for_relation :
+ *    Constructs fdw_scan_tlist from the following sources.
+ *
+ * We may have appended tableoid and ctid junk columns to the parse
+ * targetlist. In that case we need to give the planner an alternative scan
+ * tlist. This function returns a tlist consisting of the following
+ * attributes, in this order.
+ *
+ * 1. Relation attributes requested by user and needed for recheck
+ *        fpinfo->attrs_used, fdw_recheck_quals and given tlist.
+ * 2. Junk columns and others in root->processed_tlist which are not added by 1
+ *
+ * If no junk column exists, returns NIL.
+ */
+static List *
+generate_scan_tlist_for_relation(PlannerInfo *root,
+                                 RelOptInfo *foreignrel, Oid foreigntableoid,
+                                 PgFdwRelationInfo *fpinfo,
+                                 List *tlist, List *recheck_quals)
+{
+    Index        frelid = foreignrel->relid;
+    List       *fdw_scan_tlist = NIL;
+    Relation    frel;
+    int            base_nattrs;
+    ListCell   *lc;
+    Bitmapset *attrs = NULL;
+    int attnum;
+
+    /*
+     * RelOptInfo has an expanded number of attributes. Check it against the
+     * base relation's attribute count to determine whether an alternative
+     * scan target list is needed.
+     */
+    frel = heap_open(foreigntableoid, NoLock);
+    base_nattrs = RelationGetNumberOfAttributes(frel);
+    heap_close(frel, NoLock);
+
+    if (base_nattrs == foreignrel->max_attr)
+        return NIL;
+
+    /* We have junk columns. Construct alternative scan target list. */
+
+    /* collect needed relation attributes */
+    attrs = bms_copy(fpinfo->attrs_used);
+    pull_varattnos((Node *)recheck_quals, frelid, &attrs);
+    pull_varattnos((Node *)tlist, frelid, &attrs);
+
+    /* Add relation's attributes  */
+    while ((attnum = bms_first_member(attrs)) >= 0)
+    {
+        TargetEntry *tle;
+        Form_pg_attribute attr;
+        Var *var;
+        char *name = NULL;
+
+        attnum += FirstLowInvalidHeapAttributeNumber;
+        if (attnum < 1)
+            continue;
+        if (attnum > base_nattrs)
+            break;
+
+        attr = TupleDescAttr(frel->rd_att, attnum - 1);
+        if (attr->attisdropped)
+            var = (Var *) makeNullConst(INT4OID, -1, InvalidOid);
+        else
+        {
+            var = makeVar(frelid, attnum,
+                          attr->atttypid, attr->atttypmod,
+                          attr->attcollation, 0);
+            name = pstrdup(NameStr(attr->attname));
+        }
+
+        tle = makeTargetEntry((Expr *)var,
+                              list_length(fdw_scan_tlist) + 1,
+                              name,
+                              false);
+        fdw_scan_tlist = lappend(fdw_scan_tlist, tle);
+    }
+
+    /* Add junk attributes  */
+    foreach (lc, root->processed_tlist)
+    {
+        TargetEntry *tle = lfirst_node(TargetEntry, lc);
+        Var *var = (Var *) tle->expr;
+
+        /*
+         * We aren't interested in non Vars, vars of other rels and base
+         * attributes.
+         */
+        if (IsA(var, Var) && var->varno == frelid &&
+            (var->varattno > base_nattrs || var->varattno < 1))
+        {
+            Assert(tle->resjunk);
+            tle = copyObject(tle);
+            tle->resno = list_length(fdw_scan_tlist) + 1;
+            fdw_scan_tlist = lappend(fdw_scan_tlist, tle);
+        }
+    }
+
+    return fdw_scan_tlist;
+}
+
 /*
  * postgresGetForeignPlan
  *        Create ForeignScan plan node which implements selected best path
@@ -1140,10 +1251,11 @@ postgresGetForeignPlan(PlannerInfo *root,
     List       *fdw_recheck_quals = NIL;
     List       *retrieved_attrs;
     StringInfoData sql;
-    ListCell   *lc;
 
     if (IS_SIMPLE_REL(foreignrel))
     {
+        ListCell *lc;
+
         /*
          * For base relations, set scan_relid as the relid of the relation.
          */
@@ -1191,6 +1303,17 @@ postgresGetForeignPlan(PlannerInfo *root,
          * should recheck all the remote quals.
          */
         fdw_recheck_quals = remote_exprs;
+
+        /*
+         * We may have added tableoid and ctid as junk columns to the
+         * targetlist. Generate fdw_scan_tlist in that case.
+         */
+        fdw_scan_tlist = generate_scan_tlist_for_relation(root,
+                                                          foreignrel,
+                                                          foreigntableid,
+                                                          fpinfo,
+                                                          tlist,
+                                                          fdw_recheck_quals);
     }
     else
     {
@@ -1383,16 +1506,12 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
      * into local representation and error reporting during that process.
      */
     if (fsplan->scan.scanrelid > 0)
-    {
         fsstate->rel = node->ss.ss_currentRelation;
-        fsstate->tupdesc = RelationGetDescr(fsstate->rel);
-    }
     else
-    {
         fsstate->rel = NULL;
-        fsstate->tupdesc = node->ss.ss_ScanTupleSlot->tts_tupleDescriptor;
-    }
 
+    /* Always use the tuple descriptor provided by core */
+    fsstate->tupdesc = node->ss.ss_ScanTupleSlot->tts_tupleDescriptor;
     fsstate->attinmeta = TupleDescGetAttInMetadata(fsstate->tupdesc);
 
     /*
@@ -1543,22 +1662,30 @@ postgresAddForeignUpdateTargets(Query *parsetree,
     Var           *var;
     const char *attrname;
     TargetEntry *tle;
+    int            varattno = RelationGetNumberOfAttributes(target_relation) + 1;
 
     /*
-     * In postgres_fdw, what we need is the ctid, same as for a regular table.
+     * In postgres_fdw, what we need is the tableoid and ctid, same as for a
+     * regular table.
      */
 
-    /* Make a Var representing the desired value */
+    /*
+     * The table OID needs to be retrieved as a non-system junk column in the
+     * returning tuple. We add it as a column after all regular columns.
+     */
+    attrname = "tableoid";
     var = makeVar(parsetree->resultRelation,
-                  SelfItemPointerAttributeNumber,
-                  TIDOID,
+                  varattno++,
+                  OIDOID,
                   -1,
                   InvalidOid,
                   0);
 
-    /* Wrap it in a resjunk TLE with the right name ... */
-    attrname = "ctid";
-
+    /*
+     * Wrap it in a resjunk TLE with a name accessible later by FDW. Doesn't
+     * seem that we explicitly free this tle but give pstrdup'ed string here
+     * just in case.
+     */
     tle = makeTargetEntry((Expr *) var,
                           list_length(parsetree->targetList) + 1,
                           pstrdup(attrname),
@@ -1566,6 +1693,29 @@ postgresAddForeignUpdateTargets(Query *parsetree,
 
     /* ... and add it to the query's targetlist */
     parsetree->targetList = lappend(parsetree->targetList, tle);
+
+    /* ... also needs to have colname entry */
+    target_rte->eref->colnames =
+        lappend(target_rte->eref->colnames, makeString(pstrdup(attrname)));
+
+
+    /* Do the same for ctid */
+    attrname = "ctid";
+    var = makeVar(parsetree->resultRelation,
+                  SelfItemPointerAttributeNumber,
+                  TIDOID,
+                  -1,
+                  InvalidOid,
+                  0);
+
+    tle = makeTargetEntry((Expr *) var,
+                          list_length(parsetree->targetList) + 1,
+                          pstrdup(attrname),
+                          true);
+
+    parsetree->targetList = lappend(parsetree->targetList, tle);
+    target_rte->eref->colnames =
+        lappend(target_rte->eref->colnames, makeString(pstrdup(attrname)));
 }
 
 /*
@@ -1769,7 +1919,7 @@ postgresExecForeignInsert(EState *estate,
         prepare_foreign_modify(fmstate);
 
     /* Convert parameters needed by prepared statement to text form */
-    p_values = convert_prep_stmt_params(fmstate, NULL, slot);
+    p_values = convert_prep_stmt_params(fmstate, InvalidOid, NULL, slot);
 
     /*
      * Execute the prepared statement.
@@ -1824,7 +1974,7 @@ postgresExecForeignUpdate(EState *estate,
                           TupleTableSlot *planSlot)
 {
     PgFdwModifyState *fmstate = (PgFdwModifyState *) resultRelInfo->ri_FdwState;
-    Datum        datum;
+    Datum        toiddatum, ctiddatum;
     bool        isNull;
     const char **p_values;
     PGresult   *res;
@@ -1835,17 +1985,26 @@ postgresExecForeignUpdate(EState *estate,
         prepare_foreign_modify(fmstate);
 
     /* Get the ctid that was passed up as a resjunk column */
-    datum = ExecGetJunkAttribute(planSlot,
-                                 fmstate->ctidAttno,
-                                 &isNull);
+    toiddatum = ExecGetJunkAttribute(planSlot,
+                                     fmstate->toidAttno,
+                                     &isNull);
+    /* shouldn't ever get a null result... */
+    if (isNull)
+        elog(ERROR, "tableoid is NULL");
+
+    /* Get the ctid that was passed up as a resjunk column */
+    ctiddatum = ExecGetJunkAttribute(planSlot,
+                                     fmstate->ctidAttno,
+                                     &isNull);
     /* shouldn't ever get a null result... */
     if (isNull)
         elog(ERROR, "ctid is NULL");
 
     /* Convert parameters needed by prepared statement to text form */
     p_values = convert_prep_stmt_params(fmstate,
-                                        (ItemPointer) DatumGetPointer(datum),
-                                        slot);
+                                    DatumGetObjectId(toiddatum),
+                                    (ItemPointer) DatumGetPointer(ctiddatum),
+                                    slot);
 
     /*
      * Execute the prepared statement.
@@ -1900,7 +2059,7 @@ postgresExecForeignDelete(EState *estate,
                           TupleTableSlot *planSlot)
 {
     PgFdwModifyState *fmstate = (PgFdwModifyState *) resultRelInfo->ri_FdwState;
-    Datum        datum;
+    Datum        ctiddatum, toiddatum;
     bool        isNull;
     const char **p_values;
     PGresult   *res;
@@ -1911,17 +2070,26 @@ postgresExecForeignDelete(EState *estate,
         prepare_foreign_modify(fmstate);
 
     /* Get the ctid that was passed up as a resjunk column */
-    datum = ExecGetJunkAttribute(planSlot,
-                                 fmstate->ctidAttno,
-                                 &isNull);
+    toiddatum = ExecGetJunkAttribute(planSlot,
+                                     fmstate->toidAttno,
+                                     &isNull);
+    /* shouldn't ever get a null result... */
+    if (isNull)
+        elog(ERROR, "tableoid is NULL");
+
+    /* Get the ctid that was passed up as a resjunk column */
+    ctiddatum = ExecGetJunkAttribute(planSlot,
+                                     fmstate->ctidAttno,
+                                     &isNull);
     /* shouldn't ever get a null result... */
     if (isNull)
         elog(ERROR, "ctid is NULL");
 
     /* Convert parameters needed by prepared statement to text form */
     p_values = convert_prep_stmt_params(fmstate,
-                                        (ItemPointer) DatumGetPointer(datum),
-                                        NULL);
+                                    DatumGetObjectId(toiddatum),
+                                    (ItemPointer) DatumGetPointer(ctiddatum),
+                                    NULL);
 
     /*
      * Execute the prepared statement.
@@ -2303,6 +2471,28 @@ postgresPlanDirectModify(PlannerInfo *root,
                                                    returningList);
     }
 
+    /*
+     * The junk columns in the targetlist are no longer needed for FDW direct
+     * modify. Strip them so that the planner doesn't bother.
+     */
+    if (fscan->scan.scanrelid > 0 && fscan->fdw_scan_tlist != NIL)
+    {
+        List *newtlist = NIL;
+        ListCell *lc;
+
+        fscan->fdw_scan_tlist = NIL;
+        foreach (lc, subplan->targetlist)
+        {
+            TargetEntry *tle = lfirst_node(TargetEntry, lc);
+
+            /* once found junk, all the rest are also junk */
+            if (tle->resjunk)
+                continue;
+            newtlist = lappend(newtlist, tle);
+        }
+        subplan->targetlist = newtlist;
+    }
+
     /*
      * Construct the SQL command string.
      */
@@ -2349,7 +2539,7 @@ postgresPlanDirectModify(PlannerInfo *root,
     /*
      * Update the foreign-join-related fields.
      */
-    if (fscan->scan.scanrelid == 0)
+    if (fscan->fdw_scan_tlist != NIL || fscan->scan.scanrelid == 0)
     {
         /* No need for the outer subplan. */
         fscan->scan.plan.lefttree = NULL;
@@ -3345,7 +3535,7 @@ create_foreign_modify(EState *estate,
         fmstate->attinmeta = TupleDescGetAttInMetadata(tupdesc);
 
     /* Prepare for output conversion of parameters used in prepared stmt. */
-    n_params = list_length(fmstate->target_attrs) + 1;
+    n_params = list_length(fmstate->target_attrs) + 2;
     fmstate->p_flinfo = (FmgrInfo *) palloc0(sizeof(FmgrInfo) * n_params);
     fmstate->p_nums = 0;
 
@@ -3353,13 +3543,24 @@ create_foreign_modify(EState *estate,
     {
         Assert(subplan != NULL);
 
+        /* Find the remote tableoid resjunk column in the subplan's result */
+        fmstate->toidAttno = ExecFindJunkAttributeInTlist(subplan->targetlist,
+                                                          "tableoid");
+        if (!AttributeNumberIsValid(fmstate->toidAttno))
+            elog(ERROR, "could not find junk tableoid column");
+
+        /* First transmittable parameter will be table oid */
+        getTypeOutputInfo(OIDOID, &typefnoid, &isvarlena);
+        fmgr_info(typefnoid, &fmstate->p_flinfo[fmstate->p_nums]);
+        fmstate->p_nums++;
+
         /* Find the ctid resjunk column in the subplan's result */
         fmstate->ctidAttno = ExecFindJunkAttributeInTlist(subplan->targetlist,
                                                           "ctid");
         if (!AttributeNumberIsValid(fmstate->ctidAttno))
             elog(ERROR, "could not find junk ctid column");
 
-        /* First transmittable parameter will be ctid */
+        /* Second transmittable parameter will be ctid */
         getTypeOutputInfo(TIDOID, &typefnoid, &isvarlena);
         fmgr_info(typefnoid, &fmstate->p_flinfo[fmstate->p_nums]);
         fmstate->p_nums++;
@@ -3442,6 +3643,7 @@ prepare_foreign_modify(PgFdwModifyState *fmstate)
  */
 static const char **
 convert_prep_stmt_params(PgFdwModifyState *fmstate,
+                         Oid tableoid,
                          ItemPointer tupleid,
                          TupleTableSlot *slot)
 {
@@ -3453,10 +3655,15 @@ convert_prep_stmt_params(PgFdwModifyState *fmstate,
 
     p_values = (const char **) palloc(sizeof(char *) * fmstate->p_nums);
 
-    /* 1st parameter should be ctid, if it's in use */
-    if (tupleid != NULL)
+    /* First two parameters should be tableoid and ctid, if they're in use */
+    if (tableoid != InvalidOid)
     {
+        Assert (tupleid != NULL);
+
         /* don't need set_transmission_modes for TID output */
+        p_values[pindex] = OutputFunctionCall(&fmstate->p_flinfo[pindex],
+                                              ObjectIdGetDatum(tableoid));
+        pindex++;
         p_values[pindex] = OutputFunctionCall(&fmstate->p_flinfo[pindex],
                                               PointerGetDatum(tupleid));
         pindex++;
@@ -3685,8 +3892,8 @@ rebuild_fdw_scan_tlist(ForeignScan *fscan, List *tlist)
         new_tlist = lappend(new_tlist,
                             makeTargetEntry(tle->expr,
                                             list_length(new_tlist) + 1,
-                                            NULL,
-                                            false));
+                                            tle->resname,
+                                            tle->resjunk));
     }
     fscan->fdw_scan_tlist = new_tlist;
 }
@@ -5576,12 +5783,18 @@ make_tuple_from_result_row(PGresult *res,
      */
     oldcontext = MemoryContextSwitchTo(temp_context);
 
-    if (rel)
-        tupdesc = RelationGetDescr(rel);
+    /*
+     * If fdw_scan_tlist is provided for base relation, use the tuple
+     * descriptor given from planner.
+     */
+    if (!rel ||
+        (fsstate &&
+         castNode(ForeignScan, fsstate->ss.ps.plan)->fdw_scan_tlist != NULL))
+        tupdesc = fsstate->ss.ss_ScanTupleSlot->tts_tupleDescriptor;
     else
     {
-        Assert(fsstate);
-        tupdesc = fsstate->ss.ss_ScanTupleSlot->tts_tupleDescriptor;
+        Assert(rel);
+        tupdesc = RelationGetDescr(rel);
     }
 
     values = (Datum *) palloc0(tupdesc->natts * sizeof(Datum));
@@ -5623,7 +5836,7 @@ make_tuple_from_result_row(PGresult *res,
         errpos.cur_attno = i;
         if (i > 0)
         {
-            /* ordinary column */
+            /* ordinary column and tableoid */
             Assert(i <= tupdesc->natts);
             nulls[i - 1] = (valstr == NULL);
             /* Apply the input function even to nulls, to support domains */
-- 
2.16.3
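
For reference, the effect of the deparse.c hunks on parameter numbering can be sketched outside the backend. This is a minimal stand-alone model, not postgres_fdw code: build_remote_update is a hypothetical helper introduced only for illustration, and it merely mimics how the patched deparseUpdateSql qualifies the row by tableoid ($1) and ctid ($2) and starts numbering SET parameters at $3.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical helper for illustration only (not part of postgres_fdw).
 * Formats a remote UPDATE with the parameter layout used by the patch:
 * $1 = tableoid, $2 = ctid, SET columns numbered from $3.
 * Returns the length of the statement, or -1 if buf is too small.
 */
int
build_remote_update(char *buf, size_t buflen, const char *relname,
                    const char *const *cols, int ncols)
{
    size_t      used = 0;
    int         pindex = 3;     /* tableoid and ctid always precede */
    int         n;
    int         i;

    n = snprintf(buf, buflen, "UPDATE %s SET ", relname);
    if (n < 0 || (size_t) n >= buflen)
        return -1;
    used = (size_t) n;

    /* One "col = $N" item per target column, comma separated */
    for (i = 0; i < ncols; i++)
    {
        n = snprintf(buf + used, buflen - used, "%s%s = $%d",
                     i > 0 ? ", " : "", cols[i], pindex++);
        if (n < 0 || (size_t) n >= buflen - used)
            return -1;
        used += (size_t) n;
    }

    /* The row is identified by both the remote tableoid and the ctid */
    n = snprintf(buf + used, buflen - used,
                 " WHERE tableoid = $1 AND ctid = $2");
    if (n < 0 || (size_t) n >= buflen - used)
        return -1;
    used += (size_t) n;

    return (int) used;
}
```

For the fplt example at the top of the thread, calling it with relname "public.plt" and the single column "b" yields "UPDATE public.plt SET b = $3 WHERE tableoid = $1 AND ctid = $2", in place of the "SET b = $2 WHERE ctid = $1" form shown in the original EXPLAIN output.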


Re: Problem while updating a foreign table pointing to a partitioned table on foreign server

From
Etsuro Fujita
Date:
(2018/08/21 11:01), Kyotaro HORIGUCHI wrote:
> At Tue, 14 Aug 2018 20:49:02 +0900, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote in <5B72C1AE.8010408@lab.ntt.co.jp>
>> (2018/08/09 22:04), Etsuro Fujita wrote:
>>> (2018/08/08 17:30), Kyotaro HORIGUCHI wrote:

>> I spent more time looking at the patch.  ISTM that the patch well
>> suppresses the effect of the tuple-descriptor expansion by making
>> changes to code in the planner and executor (and ruleutils.c), but I'm
>> still not sure that the patch is the right direction to go in, because
>> ISTM that expanding the tuple descriptor on the fly might be a wart.

> The expansion should be safe if the expanded descriptor has the
> same definitions for base columns and all the extended columns are
> junk. The junk columns should be ignored by unrelated nodes and
> they are passed safely as long as ForeignModify passes tuples as
> is from underlying ForeignScan to ForeignUpdate/Delete.

I'm not sure that would be really safe.  Does that work well with
EvalPlanQual, for example?

>> You wrote:
>>>     Several places seem to assume that fdw_scan_tlist may be
>>>     used for a foreign scan on a simple relation, but I didn't find
>>>     that this actually happens.
>>
>> Yeah, currently, postgres_fdw and file_fdw don't use that list for
>> simple foreign table scans, but it could be used to improve the
>> efficiency for those scans, as explained in fdwhandler.sgml:
>>
>>       Another <structname>ForeignScan</structname> field that can be
>>       filled by FDWs is <structfield>fdw_scan_tlist</structfield>, which
>>       describes the tuples returned by the FDW for this plan node.  For
>>       simple foreign table scans this can be set to
>>       <literal>NIL</literal>, implying that the returned tuples have the
>>       row type declared for the foreign table.  A non-<symbol>NIL</symbol>
>>       value must be a target list (list of
>>       <structname>TargetEntry</structname>s) containing Vars and/or
>>       expressions representing the returned columns.  This might be used,
>>       for example, to show that the FDW has omitted some columns that it
>>       noticed won't be needed for the query.  Also, if the FDW can compute
>>       expressions used by the query more cheaply than can be done locally,
>>       it could add those expressions to
>>       <structfield>fdw_scan_tlist</structfield>.  Note that join plans
>>       (created from paths made by
>>       <function>GetForeignJoinPaths</function>) must always supply
>>       <structfield>fdw_scan_tlist</structfield> to describe the set of
>>       columns they will return.
>
> https://www.postgresql.org/docs/devel/static/fdw-planning.html
>
> Hmm. Thanks for the pointer; it seems to need a rewrite. However,
> it doesn't seem to work for non-join foreign scans, since the
> core ignores it and uses the local table definition.

Really?

>> You wrote:
>>> I'm not sure whether the following points are valid.
>>>
>>> - If fdw_scan_tlist is used for simple relation scans, this would
>>>     break the case. (ExecInitForeignScan,  set_foreignscan_references)
>>
>> Some FDWs might already use that list for the improved efficiency for
>> simple foreign table scans as explained above, so we should avoid
>> breaking that.
>
> I considered using fdw_scan_tlist in that way, but the core
> assumes that foreign scans with scanrelid > 0 use the relation
> descriptor.

Could you elaborate a bit more on this?

> Do you have any example for that?

I don't know of such an example, but in my understanding, the core allows
the FDW to do that.

>> If we take the Param-based approach suggested by Tom, I suspect there
>> would be no need to worry about at least those things, so I'll try to
>> update your patch as such, if there are no objections from you (or
>> anyone else).

> PARAM_EXEC is a single-storage side channel that works as long as
> it is set and read while each tuple is handled. In this case
> postgresExecForeignUpdate/Delete must be called before
> postgresIterateForeignScan returns the next tuple. An apparent
> failure case for this usage is the join-update case below.
>
> https://www.postgresql.org/message-id/20180605.191032.256535589.horiguchi.kyotaro@lab.ntt.co.jp

What I have in mind would be to 1) create a tlist that contains not only 
Vars/PHVs but Params, for each join rel involving the target rel so we 
ensure that the Params will propagate up through all join plan steps, 
and 2) convert a join rel's tlist Params into Vars referencing the same 
Params in the tlists for the outer/inner rels, by setrefs.c.  I think 
that would probably work well even for the case you mentioned above. 
Maybe I'm missing something, though.

Sorry for the delay.

Best regards,
Etsuro Fujita


Re: Problem while updating a foreign table pointing to a partitioned table on foreign server

From
Kyotaro HORIGUCHI
Date:
Hello.

At Fri, 24 Aug 2018 21:45:35 +0900, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote in
<5B7FFDEF.6020302@lab.ntt.co.jp>
> (2018/08/21 11:01), Kyotaro HORIGUCHI wrote:
> > At Tue, 14 Aug 2018 20:49:02 +0900, Etsuro
> > Fujita <fujita.etsuro@lab.ntt.co.jp> wrote
> > in <5B72C1AE.8010408@lab.ntt.co.jp>
> >> (2018/08/09 22:04), Etsuro Fujita wrote:
> >>> (2018/08/08 17:30), Kyotaro HORIGUCHI wrote:
> 
> >> I spent more time looking at the patch.  ISTM that the patch well
> >> suppresses the effect of the tuple-descriptor expansion by making
> >> changes to code in the planner and executor (and ruleutils.c), but I'm
> >> still not sure that the patch is the right direction to go in, because
> >> ISTM that expanding the tuple descriptor on the fly might be a wart.
> 
> > The expansion should be safe if the expanded descriptor has the
> > same definitions for base columns and all the extended columns are
> > junk. The junk columns should be ignored by unrelated nodes and
> > they are passed safely as long as ForeignModify passes tuples as
> > is from underlying ForeignScan to ForeignUpdate/Delete.
> 
> I'm not sure that would be really safe.  Does that work well with
> EvalPlanQual, for example?

Nothing goes wrong there. The reason is that the core simply doesn't
know about the extended portion, so the only problematic case was
ExecEvalWholeRowVar, where an explicit sanity check is performed.
But I think it is an ugly wart, as you said, so the latest patch
generates a full fdw_scan_tlist.
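
To illustrate the intended layout, here is a simplified stand-alone model. It assumes the ordering described in the comment of generate_scan_tlist_for_relation in the patch quoted earlier in the thread: referenced base attributes first, junk columns after them. ScanTlistEntry and build_scan_tlist are hypothetical names made up for this sketch and do not exist in PostgreSQL; the input attribute numbers are assumed to be in ascending order.

```c
#include <stddef.h>

/* Hypothetical stand-in for a TargetEntry in fdw_scan_tlist */
typedef struct ScanTlistEntry
{
    int         resno;      /* 1-based position in the scan tuple */
    int         attno;      /* attribute number on the foreign table */
    int         resjunk;    /* nonzero for junk columns */
} ScanTlistEntry;

/*
 * Build the model tlist into out[].  "used" holds the referenced attribute
 * numbers, "junk" holds the attribute numbers of the junk columns (in the
 * patch, tableoid is base_natts + 1 and ctid is a negative system attno).
 * Returns the number of entries, or 0 when there is no junk column (the
 * patch returns NIL in that case) or when out[] is too small.
 */
size_t
build_scan_tlist(const int *used, size_t nused,
                 const int *junk, size_t njunk,
                 int base_natts,
                 ScanTlistEntry *out, size_t cap)
{
    size_t      n = 0;
    size_t      i;

    if (njunk == 0)
        return 0;

    /* 1. Base relation attributes that the query references */
    for (i = 0; i < nused; i++)
    {
        if (used[i] < 1 || used[i] > base_natts)
            continue;           /* not a base attribute */
        if (n == cap)
            return 0;
        out[n] = (ScanTlistEntry) {(int) n + 1, used[i], 0};
        n++;
    }

    /* 2. Junk columns: anything outside 1..base_natts */
    for (i = 0; i < njunk; i++)
    {
        if (junk[i] >= 1 && junk[i] <= base_natts)
            continue;           /* base attribute, already handled above */
        if (n == cap)
            return 0;
        out[n] = (ScanTlistEntry) {(int) n + 1, junk[i], 1};
        n++;
    }

    return n;
}
```

For plt(a, b) with only column a referenced, used = {1} and junk = {3, -1} give three entries: a at resno 1, tableoid at resno 2, and ctid at resno 3, with the last two marked as junk.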

> > https://www.postgresql.org/docs/devel/static/fdw-planning.html
> >
> > Hmm. Thanks for the pointer; it seems to need a rewrite. However,
> > it doesn't seem to work for non-join foreign scans, since the
> > core ignores it and uses the local table definition.
> 
> Really?

No, I was wrong here. The core doesn't consider the case where
fdw_scan_tlist has attributes that are not part of the base relation,
but that doesn't affect the description.

> >> You wrote:
> >>> I'm not sure whether the following points are valid.
> >>>
> >>> - If fdw_scan_tlist is used for simple relation scans, this would
> >>>     break the case. (ExecInitForeignScan,  set_foreignscan_references)
> >>
> >> Some FDWs might already use that list for the improved efficiency for
> >> simple foreign table scans as explained above, so we should avoid
> >> breaking that.
> >
> > I considered using fdw_scan_tlist in that way, but the core
> > assumes that foreign scans with scanrelid > 0 use the relation
> > descriptor.
> 
> Could you elaborate a bit more on this?

After all, I found that the core uses fdw_scan_tlist if one is
provided, and the attached patch doesn't modify the "affected" part.
Sorry, it's still hot here :p

> > Do you have any example for that?
> 
> I don't know of such an example, but in my understanding, the core allows
> the FDW to do that.

As above, I agree. Sorry for the bogosity.

> >> If we take the Param-based approach suggested by Tom, I suspect there
> >> would be no need to worry about at least those things, so I'll try to
> >> update your patch as such, if there are no objections from you (or
> >> anyone else).
> 
> > PARAM_EXEC is a single-storage side channel that works as long as
> > it is set and read while each tuple is handled. In this case
> > postgresExecForeignUpdate/Delete must be called before
> > postgresIterateForeignScan returns the next tuple. An apparent
> > failure case for this usage is the join-update case below.
> >
> > https://www.postgresql.org/message-id/20180605.191032.256535589.horiguchi.kyotaro@lab.ntt.co.jp
> 
> What I have in mind would be to 1) create a tlist that contains not
> only Vars/PHVs but Params, for each join rel involving the target rel
> so we ensure that the Params will propagate up through all join plan
> steps, and 2) convert a join rel's tlist Params into Vars referencing
> the same Params in the tlists for the outer/inner rels, by setrefs.c.
> I think that would probably work well even for the case you mentioned
> above. Maybe I'm missing something, though.

As I wrote above, the problem was not param id propagation but
the per-query storage for a parameter held in econtext.

PARAM_EXEC is assumed to be used between the outer and inner
relations of a nestloop, or for retrieval from a sub-query, as
commented in primnodes.h.

>    PARAM_EXEC:  The parameter is an internal executor parameter, used
>        for passing values into and out of sub-queries or from
>        nestloop joins to their inner scans.
>        For historical reasons, such parameters are numbered from 0.
>        These numbers are independent of PARAM_EXTERN numbers.
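
To make the concern concrete, here is a toy model in plain C. It is not executor code, and every name in it (RemoteRow, param_slot, scan_next, count_stale_reads) is made up for illustration. It assumes the scan stores the remote tableoid of the row it has just fetched into a single slot, and that the modify step reads that slot when it handles a row. If an intermediate node drains the scan before handing rows up, as a join may do, the slot only holds the value for the last fetched row.

```c
#include <stddef.h>

/* Hypothetical row as fetched from the remote side */
typedef struct RemoteRow
{
    unsigned    tableoid;   /* OID of the remote partition */
    int         ctid_off;   /* simplified ctid */
} RemoteRow;

/* Stand-in for the single storage behind one PARAM_EXEC parameter */
static unsigned param_slot;

/* Fetch the next row and, as a side effect, overwrite the slot */
static const RemoteRow *
scan_next(const RemoteRow *rows, size_t n, size_t *pos)
{
    if (*pos >= n)
        return NULL;
    param_slot = rows[*pos].tableoid;
    return &rows[(*pos)++];
}

/*
 * Count the rows for which the modify step would read a tableoid that
 * belongs to a different row.  With buffer_all = 0 each row is modified
 * right after it is fetched; with buffer_all != 0 an intermediate node
 * consumes the whole scan before the first row is modified.
 */
size_t
count_stale_reads(const RemoteRow *rows, size_t n, int buffer_all)
{
    size_t      pos = 0;
    size_t      stale = 0;
    size_t      i;
    const RemoteRow *r;

    if (buffer_all)
    {
        /* Drain the scan first; the slot keeps only the last tableoid */
        while (scan_next(rows, n, &pos) != NULL)
            ;
        for (i = 0; i < n; i++)
            if (param_slot != rows[i].tableoid)
                stale++;
    }
    else
    {
        /* Pipelined: the slot is read before the next fetch overwrites it */
        while ((r = scan_next(rows, n, &pos)) != NULL)
            if (param_slot != r->tableoid)
                stale++;
    }

    return stale;
}
```

With two rows from different partitions, the pipelined case reports no stale reads, while the buffered case reports one: the first row would be modified using the second row's tableoid. Carrying the value in the tuple itself, as a junk column, does not have this dependency on execution order.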

Anyway the odds are high that I'm missing far more than you.

> Sorry for the delay.

Nope. Thank you for the comment and I'm waiting for the patch.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Problem while updating a foreign table pointing to a partitioned table on foreign server

From
Etsuro Fujita
Date:
(2018/08/30 20:37), Kyotaro HORIGUCHI wrote:
> At Fri, 24 Aug 2018 21:45:35 +0900, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote in <5B7FFDEF.6020302@lab.ntt.co.jp>
>> (2018/08/21 11:01), Kyotaro HORIGUCHI wrote:
>>> At Tue, 14 Aug 2018 20:49:02 +0900, Etsuro
>>> Fujita <fujita.etsuro@lab.ntt.co.jp> wrote
>>> in <5B72C1AE.8010408@lab.ntt.co.jp>
>>>> (2018/08/09 22:04), Etsuro Fujita wrote:
>>>>> (2018/08/08 17:30), Kyotaro HORIGUCHI wrote:
>>
>>>> I spent more time looking at the patch.  ISTM that the patch well
>>>> suppresses the effect of the tuple-descriptor expansion by making
>>>> changes to code in the planner and executor (and ruleutils.c), but I'm
>>>> still not sure that the patch is the right direction to go in, because
>>>> ISTM that expanding the tuple descriptor on the fly might be a wart.
>>
>>> The expansion should be safe if the expanded descriptor has the
>>> same definitions for base columns and all the extended columns are
>>> junk. The junk columns should be ignored by unrelated nodes and
>>> they are passed safely as long as ForeignModify passes tuples as
>>> is from underlying ForeignScan to ForeignUpdate/Delete.
>>
>> I'm not sure that would be really safe.  Does that work well when
>> EvalPlanQual, for example?
>
> Nothing. The reason was that core just doesn't know about the
> extended portion. So the only problematic case was
> ExprEvalWholeRowVar, where an explicit sanity check is
> performed. But I think it is an ugly wart, as you said. So the
> latest patch generates a full fdw_scan_tlist.

Will review.

>>>> If we take the Param-based approach suggested by Tom, I suspect there
>>>> would be no need to worry about at least those things, so I'll try to
>>>> update your patch as such, if there are no objections from you (or
>>>> anyone else).
>>
>>> PARAM_EXEC is single storage side channel that can work as far as
>>> it is set and read while each tuple is handled. In this case
>>> postgresExecForeignUpdate/Delete must be called before
>>> postgresIterateForeignScan returns the next tuple. An apparent
>>> failure case for this usage is the join-update case below.
>>>
>>> https://www.postgresql.org/message-id/20180605.191032.256535589.horiguchi.kyotaro@lab.ntt.co.jp
>>
>> What I have in mind would be to 1) create a tlist that contains not
>> only Vars/PHVs but Params, for each join rel involving the target rel
>> so we ensure that the Params will propagate up through all join plan
>> steps, and 2) convert a join rel's tlist Params into Vars referencing
>> the same Params in the tlists for the outer/inner rels, by setrefs.c.
>> I think that would probably work well even for the case you mentioned
>> above. Maybe I'm missing something, though.
>
> As I wrote above, the problem was not param id propagation but
> the per-query storage for a parameter held in econtext.
>
> PARAM_EXEC is assumed to be used between the outer and inner
> relations of a nestloop, or for retrieval from a sub-query, as
> commented in primnodes.h.
>
>>     PARAM_EXEC:  The parameter is an internal executor parameter, used
>>         for passing values into and out of sub-queries or from
>>         nestloop joins to their inner scans.
>>         For historical reasons, such parameters are numbered from 0.
>>         These numbers are independent of PARAM_EXTERN numbers.

Yeah, but IIUC, I think that #2 would allow us to propagate up the param 
values, not the param ids.
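
The distinction between one shared parameter slot and a value carried with each tuple can be sketched with a toy C model. This is not PostgreSQL code, and every name in it is hypothetical. It assumes a buffering node (such as a Sort) that reads all rows from the scan before any row is modified, and compares what the modify step sees through the shared slot with what it sees through the per-tuple copy.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only; none of these names exist in PostgreSQL. */
typedef struct ToyTuple
{
    int      a;           /* user column */
    unsigned remote_oid;  /* remote table OID carried with the tuple */
} ToyTuple;

/* Single per-query storage, loosely analogous to one PARAM_EXEC slot. */
static unsigned toy_param_slot;

/* "Scan": return row i and overwrite the shared slot as a side effect. */
static ToyTuple
toy_scan_next(const ToyTuple *rows, size_t i)
{
    toy_param_slot = rows[i].remote_oid;
    return rows[i];
}

/*
 * Buffer all rows first (as a Sort would), then "modify" each one.
 * Counts how many rows see the right OID via the shared slot and how
 * many see it via the value stored in the buffered tuple.
 */
static void
toy_buffered_modify(const ToyTuple *rows, size_t n,
                    size_t *ok_via_slot, size_t *ok_via_tuple)
{
    ToyTuple buf[16];
    size_t   i;

    assert(n <= 16);
    *ok_via_slot = 0;
    *ok_via_tuple = 0;

    /* The scan runs to completion before any row is modified. */
    for (i = 0; i < n; i++)
        buf[i] = toy_scan_next(rows, i);

    for (i = 0; i < n; i++)
    {
        if (toy_param_slot == rows[i].remote_oid)
            (*ok_via_slot)++;
        if (buf[i].remote_oid == rows[i].remote_oid)
            (*ok_via_tuple)++;
    }
}
```

With two rows coming from different remote tables, only the last row matches the shared slot, while both rows match their own per-tuple copy. That is the sense in which propagating the values up through the tlists differs from propagating only the param ids.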

> I'm waiting for the patch.

OK, but I will review your patch first.

Best regards,
Etsuro Fujita


Re: Problem while updating a foreign table pointing to a partitioned table on foreign server

From
Etsuro Fujita
Date:
(2018/08/30 21:58), Etsuro Fujita wrote:
> (2018/08/30 20:37), Kyotaro HORIGUCHI wrote:
>> At Fri, 24 Aug 2018 21:45:35 +0900, Etsuro
>> Fujita<fujita.etsuro@lab.ntt.co.jp> wrote
>> in<5B7FFDEF.6020302@lab.ntt.co.jp>
>>> (2018/08/21 11:01), Kyotaro HORIGUCHI wrote:
>>>> At Tue, 14 Aug 2018 20:49:02 +0900, Etsuro
>>>> Fujita<fujita.etsuro@lab.ntt.co.jp> wrote
>>>> in<5B72C1AE.8010408@lab.ntt.co.jp>
>>>>> (2018/08/09 22:04), Etsuro Fujita wrote:
>>>>>> (2018/08/08 17:30), Kyotaro HORIGUCHI wrote:
>>>
>>>>> I spent more time looking at the patch. ISTM that the patch well
>>>>> suppresses the effect of the tuple-descriptor expansion by making
>>>>> changes to code in the planner and executor (and ruleutils.c), but I'm
>>>>> still not sure that the patch is the right direction to go in, because
>>>>> ISTM that expanding the tuple descriptor on the fly might be a wart.
>>>
>>>> The expansion should be safe if the expanded descriptor has the
>>>> same definitions for base columns and all the extended columns are
>>>> junk. The junk columns should be ignored by unrelated nodes and
>>>> they are passed safely as far as ForeignModify passes tuples as
>>>> is from underlying ForeignScan to ForeignUpdate/Delete.
>>>
>>> I'm not sure that would be really safe. Does that work well when
>>> EvalPlanQual, for example?

I was wrong here; I assumed that we supported late locking for an
UPDATE or DELETE on a foreign table, and I was a bit concerned that the
approach you proposed might not work well with EvalPlanQual, but as
described in fdwhandler.sgml, the core doesn't support that:

      For an <command>UPDATE</command> or <command>DELETE</command> on a
      foreign table, it is recommended that the
      <literal>ForeignScan</literal> operation on the target table perform
      early locking on the rows that it fetches, perhaps via the equivalent
      of <command>SELECT FOR UPDATE</command>.  An FDW can detect whether
      a table is an <command>UPDATE</command>/<command>DELETE</command>
      target at plan time by comparing its relid to
      <literal>root->parse->resultRelation</literal>, or at execution time
      by using <function>ExecRelationIsTargetRelation()</function>.
      An alternative possibility is to perform late locking within the
      <function>ExecForeignUpdate</function> or
      <function>ExecForeignDelete</function> callback, but no special
      support is provided for this.

So, there would be no need to consider EvalPlanQual.  Sorry for
the noise.

Best regards,
Etsuro Fujita


Re: Problem while updating a foreign table pointing to a partitioned table on foreign server

From
Kyotaro HORIGUCHI
Date:
Hello.

At Wed, 05 Sep 2018 20:02:04 +0900, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote in
<5B8FB7AC.5020003@lab.ntt.co.jp>
> (2018/08/30 21:58), Etsuro Fujita wrote:
> > (2018/08/30 20:37), Kyotaro HORIGUCHI wrote:
> >> At Fri, 24 Aug 2018 21:45:35 +0900, Etsuro
> >> Fujita<fujita.etsuro@lab.ntt.co.jp> wrote
> >> in<5B7FFDEF.6020302@lab.ntt.co.jp>
> >>> (2018/08/21 11:01), Kyotaro HORIGUCHI wrote:
> >>>> At Tue, 14 Aug 2018 20:49:02 +0900, Etsuro
> >>>> Fujita<fujita.etsuro@lab.ntt.co.jp> wrote
> >>>> in<5B72C1AE.8010408@lab.ntt.co.jp>
> >>>>> (2018/08/09 22:04), Etsuro Fujita wrote:
> >>>>>> (2018/08/08 17:30), Kyotaro HORIGUCHI wrote:
> >>>
> >>>>> I spent more time looking at the patch. ISTM that the patch well
> >>>>> suppresses the effect of the tuple-descriptor expansion by making
> >>>>> changes to code in the planner and executor (and ruleutils.c), but I'm
> >>>>> still not sure that the patch is the right direction to go in, because
> >>>>> ISTM that expanding the tuple descriptor on the fly might be a wart.
> >>>
> >>>> The expansion should be safe if the expanded descriptor has the
> >>>> same definitions for base columns and all the extended columns are
> >>>> junk. The junk columns should be ignored by unrelated nodes and
> >>>> they are passed safely as far as ForeignModify passes tuples as
> >>>> is from underlying ForeignScan to ForeignUpdate/Delete.
> >>>
> >>> I'm not sure that would be really safe. Does that work well when
> >>> EvalPlanQual, for example?
> 
> I was wrong here; I assumed that we supported late locking for an
> UPDATE or DELETE on a foreign table, and I was a bit concerned that
> the approach you proposed might not work well with EvalPlanQual, but
> as described in fdwhandler.sgml, the core doesn't support that:
> 
>      For an <command>UPDATE</command> or <command>DELETE</command> on a
>      foreign table, it is recommended that the
>      <literal>ForeignScan</literal> operation on the target table perform
>      early locking on the rows that it fetches, perhaps via the equivalent
>      of <command>SELECT FOR UPDATE</command>.  An FDW can detect whether
>      a table is an <command>UPDATE</command>/<command>DELETE</command>
>      target at plan time by comparing its relid to
>      <literal>root->parse->resultRelation</literal>, or at execution time
>      by using <function>ExecRelationIsTargetRelation()</function>.
>      An alternative possibility is to perform late locking within the
>      <function>ExecForeignUpdate</function> or
>      <function>ExecForeignDelete</function> callback, but no special
>      support is provided for this.
> 
> So, there would be no need to consider EvalPlanQual.  Sorry for
> the noise.

I don't think it is noise at all. Thank you for the pointer.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Problem while updating a foreign table pointing to a partitioned table on foreign server

From
Etsuro Fujita
Date:
(2018/08/24 16:58), Kyotaro HORIGUCHI wrote:
> At Tue, 21 Aug 2018 11:01:32 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote in <20180821.110132.261184472.horiguchi.kyotaro@lab.ntt.co.jp>
>>> You wrote:
>>>>     Several places seem to assume that fdw_scan_tlist may be
>>>>     used for a foreign scan on a simple relation, but I didn't find that
>>>>     actually happening.
>>>
>>> Yeah, currently, postgres_fdw and file_fdw don't use that list for
>>> simple foreign table scans, but it could be used to improve the
>>> efficiency for those scans, as explained in fdwhandler.sgml:
> ...
>> I'll put more consideration on using fdw_scan_tlist in the
>> documented way.
>
> Done. postgres_fdw now generates a full fdw_scan_tlist (as
> documented) for foreign relations with junk columns, with a
> small change on the core side. However, it is far less invasive than
> the previous version and I believe that it doesn't harm
> maybe-existing use of fdw_scan_tlist on non-join rels (that is,
> in the case of a subset of relation columns).

Yeah, the changes to the core in the new version are really small, which is
great, but I'm not sure it's a good idea to modify the catalog info on
the target table on the fly:

@@ -126,8 +173,18 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                  errmsg("cannot access temporary or unlogged relations during recovery")));

+   max_attrnum = RelationGetNumberOfAttributes(relation);
+
+   /* Foreign table may have exanded this relation with junk columns */
+   if (root->simple_rte_array[varno]->relkind == RELKIND_FOREIGN_TABLE)
+   {
+       AttrNumber maxattno = max_varattno(root->parse->targetList, varno);
+       if (max_attrnum < maxattno)
+           max_attrnum = maxattno;
+   }
+
     rel->min_attr = FirstLowInvalidHeapAttributeNumber + 1;
-   rel->max_attr = RelationGetNumberOfAttributes(relation);
+   rel->max_attr = max_attrnum;
     rel->reltablespace = RelationGetForm(relation)->reltablespace;

This breaks the fundamental assumption that rel->max_attr is equal to 
RelationGetNumberOfAttributes of that table.  My concern is: this change 
would probably be a wart, so it would be bug-prone in future versions.

Another thing on the new version:

@@ -1575,6 +1632,19 @@ build_physical_tlist(PlannerInfo *root, RelOptInfo *rel)
             relation = heap_open(rte->relid, NoLock);

             numattrs = RelationGetNumberOfAttributes(relation);
+
+           /*
+            * Foreign tables may have expanded with some junk columns. Punt
+            * in the case.
+            */
+           if (numattrs < rel->max_attr)
+           {
+               Assert(root->simple_rte_array[rel->relid]->relkind ==
+                      RELKIND_FOREIGN_TABLE);
+               heap_close(relation, NoLock);
+               break;
+           }

I think this would disable the optimization on projection in foreign
scans, causing a performance regression.

> One arguable behavior change is about whole-row vars. Currently it
> refers to the local tuple with all columns, but it is explicitly
> fetched as ROW() after this patch is applied. This could be fixed,
> but not just now.
>
> Part of 0004-:
> -  Output: f1, ''::text, ctid, rem1.*
> -  Remote SQL: SELECT f1, f2, ctid FROM public.loc1 FOR UPDATE
> +  Output: f1, ''::text, tableoid, ctid, rem1.*
> +  Remote SQL: SELECT f1, tableoid, ctid, ROW(f1, f2) FROM public.loc1 FOR UPDATE

That would also be a performance regression.  If we go in this direction,
that should be fixed.

> Since this uses fdw_scan_tlist so it is theoretically
> back-patchable back to 9.6.

IIRC, the fdw_scan_tlist stuff was introduced in PG9.5 as part of join 
pushdown infrastructure, so I think your patch can be back-patched to 
PG9.5, but I don't think that's enough; IIRC, this issue was introduced 
in PG9.3, so a solution for this should be back-patch-able to PG9.3, I 
think.

> Please find the attached three files.

Thanks for the patches!

> 0001-Add-test-for-postgres_fdw-foreign-parition-update.patch
>
>   This should fail for unpatched postgres_fdw. (Just for demonstration)

+CREATE TABLE p1 (a int, b int);
+CREATE TABLE c1 (LIKE p1) INHERITS (p1);
+CREATE TABLE c2 (LIKE p1) INHERITS (p1);
+CREATE FOREIGN TABLE fp1 (a int, b int)
+ SERVER loopback OPTIONS (table_name 'p1');
+INSERT INTO c1 VALUES (0, 1);
+INSERT INTO c2 VALUES (1, 1);
+SELECT tableoid::int - (SELECT min(tableoid) FROM fp1)::int AS toiddiff, ctid, * FROM fp1;

Does it make sense to evaluate toiddiff?  I think that should always be 0.

> 0003-Fix-of-foreign-update-bug-of-PgFDW.patch
>
>   Fix of postgres_fdw for this problem.

Sorry, I have not looked at it closely yet, but before that I'd like to 
discuss the direction we go in.  I'm not convinced that your approach is 
the right direction, so as promised, I wrote a patch using the 
Param-based approach, and compared the two approaches.  Attached is a 
WIP patch for that, which includes the 0003 patch.  I don't think there 
would be any warts as discussed above in the Param-based approach for 
now.  (That approach modifies the planner so that the targetrel's tlist 
would contain Params as well as Vars/PHVs, so actually, it breaks the 
planner assumption that a rel's tlist would only include Vars/PHVs, but 
I don't find any issues on that at least for now.  Will look into that 
in more detail.)  And I don't think there would be any concern about 
performance regression, either.  Maybe I'm missing something, though.

What do you think about that?

Note about the attached: I tried to invent a utility for 
generate_new_param like SS_make_initplan_output_param as mentioned in 
[1], but since the FDW API doesn't pass PlannerInfo to the FDW, I think 
the FDW can't call the utility the same way.  Instead, I modified the 
planner so that 1) the FDW adds Params without setting PARAM_EXEC Param 
IDs using a new function, and then 2) the core fixes the IDs.
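
The two-step scheme described in the note can be sketched as a toy C model. It is not the code from the attached patch, and all names are hypothetical: the FDW side creates a Param with a placeholder ID, and the core later hands out real IDs from its own counter.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only; these names are not PostgreSQL's. */
#define TOY_PARAMID_UNSET (-1)

typedef struct ToyParam
{
    int paramid;     /* TOY_PARAMID_UNSET until the core assigns an ID */
    int is_foreign;  /* nonzero if the Param was added by the FDW */
} ToyParam;

/* Step 1 (FDW side): create a Param without choosing an ID. */
static ToyParam
toy_fdw_make_param(void)
{
    ToyParam p;

    p.paramid = TOY_PARAMID_UNSET;
    p.is_foreign = 1;
    return p;
}

/*
 * Step 2 (core side): assign IDs to FDW-added Params that still lack one.
 * Returns the next unused ID so the caller can keep its counter current.
 */
static int
toy_core_fix_paramids(ToyParam *params, size_t n, int next_id)
{
    size_t i;

    assert(next_id >= 0);
    for (i = 0; i < n; i++)
    {
        if (params[i].is_foreign && params[i].paramid == TOY_PARAMID_UNSET)
            params[i].paramid = next_id++;
    }
    return next_id;
}
```

The point of the split is that the FDW never needs access to the planner state that owns the ID counter; only the core function does.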

Sorry for the delay.

Best regards,
Etsuro Fujita

[1] https://www.postgresql.org/message-id/3919.1527775582%40sss.pgh.pa.us

Attachments

Re: Problem while updating a foreign table pointing to a partitioned table on foreign server

From
Kyotaro HORIGUCHI
Date:
Hello.

At Fri, 14 Sep 2018 22:01:39 +0900, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote in
<5B9BB133.1060107@lab.ntt.co.jp>
> (2018/08/24 16:58), Kyotaro HORIGUCHI wrote:
> > At Tue, 21 Aug 2018 11:01:32 +0900 (Tokyo Standard Time), Kyotaro
> > HORIGUCHI<horiguchi.kyotaro@lab.ntt.co.jp> wrote
> > in<20180821.110132.261184472.horiguchi.kyotaro@lab.ntt.co.jp>
> >>> You wrote:
> >>>>     Several places seem to assume that fdw_scan_tlist may be
> >>>>     used for a foreign scan on a simple relation, but I didn't find that
> >>>>     actually happening.
> >>>
> >>> Yeah, currently, postgres_fdw and file_fdw don't use that list for
> >>> simple foreign table scans, but it could be used to improve the
> >>> efficiency for those scans, as explained in fdwhandler.sgml:
> > ...
> >> I'll put more consideration on using fdw_scan_tlist in the
> >> documented way.
> >
> > Done. postgres_fdw now generates a full fdw_scan_tlist (as
> > documented) for foreign relations with junk columns, with a
> > small change on the core side. However, it is far less invasive than
> > the previous version and I believe that it doesn't harm
> > maybe-existing use of fdw_scan_tlist on non-join rels (that is,
> > in the case of a subset of relation columns).
> 
> Yeah, the changes to the core in the new version are really small, which is
> great, but I'm not sure it's a good idea to modify the catalog info on
> the target table on the fly:
> 
> @@ -126,8 +173,18 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
>                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>                  errmsg("cannot access temporary or unlogged relations during recovery")));
> 
> +   max_attrnum = RelationGetNumberOfAttributes(relation);
> +
> +   /* Foreign table may have exanded this relation with junk columns */
> +   if (root->simple_rte_array[varno]->relkind == RELKIND_FOREIGN_TABLE)
> +   {
> +       AttrNumber maxattno = max_varattno(root->parse->targetList, varno);
> +       if (max_attrnum < maxattno)
> +           max_attrnum = maxattno;
> +   }
> +
>     rel->min_attr = FirstLowInvalidHeapAttributeNumber + 1;
> -   rel->max_attr = RelationGetNumberOfAttributes(relation);
> +   rel->max_attr = max_attrnum;
>     rel->reltablespace = RelationGetForm(relation)->reltablespace;
> 
> This breaks the fundamental assumption that rel->max_attr is equal to
> RelationGetNumberOfAttributes of that table.  My concern is: this
> change would probably be a wart, so it would be bug-prone in future
> versions.

Hmm. I believe that once a RelOptInfo is created, all attributes
defined in it can be safely accessed. Is that a wrong assumption?
Actually, RelationGetNumberOfAttributes is used in only a few distinct
places while planning.

expand_targetlist uses it to scan the source relation's nonjunk
attributes. get_rel_data_width uses it to scan the widths of
attributes in statistics. It fails to add the junk columns' widths, but
that doesn't do much harm. build_physical_tlist is not used for
foreign relations. build_path_tlist creates a tlist without
proper resjunk flags, but create_modifytable_plan immediately
fixes that.

If we don't accept the expanded tupdesc for base relations, the
other ways I can find are transforming the foreign relation into
something else, like a subquery, or allowing expansion of the
attribute list of a base relation...

> Another thing on the new version:
> 
> @@ -1575,6 +1632,19 @@ build_physical_tlist(PlannerInfo *root, RelOptInfo *rel)
>             relation = heap_open(rte->relid, NoLock);
> 
>             numattrs = RelationGetNumberOfAttributes(relation);
> +
> +           /*
> +            * Foreign tables may have expanded with some junk columns. Punt
> +            * in the case.
...
> I think this would disable the optimization on projection in foreign
> scans, causing a performance regression.

Well, in update/delete cases, create_plan_recurse on foreign scan
is called with CP_EXACT_TLIST in create_modifytable_plan so the
code path is not actually used. Just replacing the if clause with
an Assert seems to change nothing. I'm not sure whether we will add
junk columns in other cases, but it's not likely.

> > One arguable behavior change is about whole-row vars. Currently it
> > refers to the local tuple with all columns, but it is explicitly
> > fetched as ROW() after this patch is applied. This could be fixed,
> > but not just now.
> >
> > Part of 0004-:
> > -  Output: f1, ''::text, ctid, rem1.*
> > -  Remote SQL: SELECT f1, f2, ctid FROM public.loc1 FOR UPDATE
> > +  Output: f1, ''::text, tableoid, ctid, rem1.*
> > +  Remote SQL: SELECT f1, tableoid, ctid, ROW(f1, f2) FROM public.loc1 FOR UPDATE
> 
> That would also be a performance regression.  If we go in this
> direction, that should be fixed.

Agreed. Will consider it soon.

> > Since this uses fdw_scan_tlist so it is theoretically
> > back-patchable back to 9.6.
> 
> IIRC, the fdw_scan_tlist stuff was introduced in PG9.5 as part of join
> pushdown infrastructure, so I think your patch can be back-patched to
> PG9.5, but I don't think that's enough; IIRC, this issue was
> introduced in PG9.3, so a solution for this should be back-patch-able
> to PG9.3, I think.

In the previous version, fdw_scan_tlist was used to hold only
additional (junk) columns. I think that we can get rid of the
variable by scanning the full tlist for junk columns. Apparently
that would be a different patch for such versions. I'm not sure how
invasive it would be for now, but I will consider it.

> > Please find the attached three files.
> 
> Thanks for the patches!
> 
> > 0001-Add-test-for-postgres_fdw-foreign-parition-update.patch
> >
> >   This should fail for unpatched postgres_fdw. (Just for demonstration)
> 
> +CREATE TABLE p1 (a int, b int);
> +CREATE TABLE c1 (LIKE p1) INHERITS (p1);
> +CREATE TABLE c2 (LIKE p1) INHERITS (p1);
> +CREATE FOREIGN TABLE fp1 (a int, b int)
> + SERVER loopback OPTIONS (table_name 'p1');
> +INSERT INTO c1 VALUES (0, 1);
> +INSERT INTO c2 VALUES (1, 1);
> +SELECT tableoid::int - (SELECT min(tableoid) FROM fp1)::int AS toiddiff, ctid, * FROM fp1;
> 
> Does it make sense to evaluate toiddiff?  I think that should always
> be 0.

Right. It is checking that the values are not those of remote
table oids. If it is always 0 and the problematic foreign update
succeeds, it is working correctly.

=======
> > 0003-Fix-of-foreign-update-bug-of-PgFDW.patch
> >
> >   Fix of postgres_fdw for this problem.
> 
> Sorry, I have not looked at it closely yet, but before that I'd like
> to discuss the direction we go in.  I'm not convinced that your
> approach is the right direction, so as promised, I wrote a patch using
> the Param-based approach, and compared the two approaches.  Attached
> is a WIP patch for that, which includes the 0003 patch.  I don't think
> there would be any warts as discussed above in the Param-based
> approach for now.  (That approach modifies the planner so that the
> targetrel's tlist would contain Params as well as Vars/PHVs, so
> actually, it breaks the planner assumption that a rel's tlist would
> only include Vars/PHVs, but I don't find any issues on that at least
> for now.  Will look into that in more detail.)  And I don't think
> there would be any concern about performance regression, either.
> Maybe I'm missing something, though.
> 
> What do you think about that?

Hmm. It is beyond my understanding. Great work (for me)!

I confirmed that a FOREIGN_PARAM_EXEC is evaluated and stored
into the parent node. For the mentioned Merge/Sort/ForeignScan
case, the Sort node takes the parameter value via projection. I
didn't know PARAM_EXEC works that way. I consulted nodeNestLoop
but did not fully understand it.

So I think it works.  I still don't think expanded tupledesc is
not a wart but this is smarter than that.  In addition to that, it
seems back-patchable. I must admit that yours is better.

> Note about the attached: I tried to invent a utility for
> generate_new_param like SS_make_initplan_output_param as mentioned in
> [1], but since the FDW API doesn't pass PlannerInfo to the FDW, I
> think the FDW can't call the utility the same way.  Instead, I
> modified the planner so that 1) the FDW adds Params without setting
> PARAM_EXEC Param IDs using a new function, and then 2) the core fixes
> the IDs.

Agreed on not having PlannerInfo. I'll re-study this.  Some
comments on this for now are as follows.

It seems to reserve the name remotetableoid, which is unlikely to
be used by users, but not never.

Maybe the paramid space of FOREIGN_PARAM does not necessarily have to
be the same as that of ordinary params that need a signalling aid.


> Sorry for the delay.
> 
> Best regards,
> Etsuro Fujita
> 
> [1]
> https://www.postgresql.org/message-id/3919.1527775582%40sss.pgh.pa.us

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Problem while updating a foreign table pointing to a partitioned table on foreign server

From
Etsuro Fujita
Date:
(2018/09/18 21:14), Kyotaro HORIGUCHI wrote:
> At Fri, 14 Sep 2018 22:01:39 +0900, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote in <5B9BB133.1060107@lab.ntt.co.jp>
>> @@ -126,8 +173,18 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
>>                  (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>>                   errmsg("cannot access temporary or unlogged relations during recovery")));
>>
>> +   max_attrnum = RelationGetNumberOfAttributes(relation);
>> +
>> +   /* Foreign table may have exanded this relation with junk columns */
>> +   if (root->simple_rte_array[varno]->relkind == RELKIND_FOREIGN_TABLE)
>> +   {
>> +       AttrNumber maxattno = max_varattno(root->parse->targetList, varno);
>> +       if (max_attrnum < maxattno)
>> +           max_attrnum = maxattno;
>> +   }
>> +
>>      rel->min_attr = FirstLowInvalidHeapAttributeNumber + 1;
>> -   rel->max_attr = RelationGetNumberOfAttributes(relation);
>> +   rel->max_attr = max_attrnum;
>>      rel->reltablespace = RelationGetForm(relation)->reltablespace;
>>
>> This breaks the fundamental assumption that rel->max_attr is equal to
>> RelationGetNumberOfAttributes of that table.  My concern is: this
>> change would probably be a wart, so it would be bug-prone in future
>> versions.
>
> Hmm. I believe that once a RelOptInfo is created, all attributes
> defined in it can be safely accessed. Is that a wrong assumption?

The patch you proposed seems to fix the issue well for the current
version of PG, but I'm a bit scared to have such an assumption (ie, to
include columns in a rel's tlist that are not defined anywhere in the
system catalogs).  In future we might add, eg, an lsyscache.c routine for
some planning use that is given the attribute number of a column as an
argument, like get_attavgwidth, and if so, it is easily conceivable
that the routine would error out for such an undefined column.
(get_attavgwidth would return 0 rather than erroring out, though.)
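
The kind of hazard meant here can be sketched with a toy C model. It is not PostgreSQL code, and the names are hypothetical: a catalog-style lookup that only knows about the columns the catalog defines, called with an attribute number taken from an expanded max_attr.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only; these names are not PostgreSQL's. */
typedef struct ToyCatalogRel
{
    int        natts;       /* number of columns defined in the catalog */
    const int *avg_widths;  /* per-column statistic, indexed by attno - 1 */
} ToyCatalogRel;

/*
 * Hypothetical lsyscache-style lookup.  Returns the stored width, or -1
 * as a stand-in for "error out" when attno names a column the catalog
 * does not define.
 */
static int
toy_get_attwidth(const ToyCatalogRel *rel, int attno)
{
    assert(rel != NULL);
    if (attno < 1 || attno > rel->natts)
        return -1;  /* undefined column: a real routine might raise an error */
    return rel->avg_widths[attno - 1];
}
```

A caller that loops from 1 to an expanded max_attr of 3 over a two-column table would hit the undefined-column branch on the last iteration, even though every catalog-defined column resolves normally.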

> Actually, RelationGetNumberOfAttributes is used in only a few distinct
> places while planning.

> build_physical_tlist is not used for
> foreign relations.

For UPDATE/DELETE, that function would not be called for a foreign
target in the postgres_fdw case, as CTID is requested (see
use_physical_tlist), but otherwise that function may be called if
possible.  No?

> If we don't accept the expanded tupdesc for base relations, the
> other ways I can find are transforming the foreign relation into
> something else, like a subquery, or allowing expansion of the
> attribute list of a base relation...

Sorry, I don't understand this fully, but there seems to be the same 
concern as mentioned above.

>> Another thing on the new version:
>>
>> @@ -1575,6 +1632,19 @@ build_physical_tlist(PlannerInfo *root, RelOptInfo *rel)
>>              relation = heap_open(rte->relid, NoLock);
>>
>>              numattrs = RelationGetNumberOfAttributes(relation);
>> +
>> +           /*
>> +            * Foreign tables may have expanded with some junk columns. Punt
>> +            * in the case.
> ...
>> I think this would disable the optimization on projection in foreign
>> scans, causing performance regression.
>
> Well, in update/delete cases, create_plan_recurse on foreign scan
> is called with CP_EXACT_TLIST in create_modifytable_plan

That's not necessarily true; consider UPDATE/DELETE on a local join; in 
that case the topmost plan node for a subplan of a ModifyTable would be 
a join, and if that's a NestLoop, create_plan_recurse would call 
create_nestloop_plan, which would recursively call create_plan_recurse 
for its inner/outer subplans with flag=0, not CP_EXACT_TLIST.
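
The flag propagation described above can be sketched with a toy C model of the recursion. It is not the planner's code, and all names are hypothetical; it only mirrors the behaviour stated in the paragraph, namely that the top subplan is created with the exact-tlist flag while a join's children are created with flags = 0.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only; these names are not PostgreSQL's. */
#define TOY_CP_EXACT_TLIST 0x0001

typedef enum { TOY_SCAN, TOY_NESTLOOP } ToyPathType;

typedef struct ToyPath
{
    ToyPathType     type;
    struct ToyPath *outer;       /* used only by TOY_NESTLOOP */
    struct ToyPath *inner;       /* used only by TOY_NESTLOOP */
    int             flags_seen;  /* flags this node was created with */
} ToyPath;

/* Record the flags each node receives; join children always get 0. */
static void
toy_create_plan_recurse(ToyPath *path, int flags)
{
    assert(path != NULL);
    path->flags_seen = flags;
    if (path->type == TOY_NESTLOOP)
    {
        toy_create_plan_recurse(path->outer, 0);
        toy_create_plan_recurse(path->inner, 0);
    }
}

/* The subplan of a ModifyTable is requested with an exact tlist. */
static void
toy_plan_modifytable_subplan(ToyPath *top)
{
    toy_create_plan_recurse(top, TOY_CP_EXACT_TLIST);
}
```

In this model a foreign scan placed directly under the modify node sees TOY_CP_EXACT_TLIST, but the same scan placed under a nestloop sees 0, so the exact-tlist guarantee cannot be assumed for the join case.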

> so the
> code path is not actually used.

I think this is true for the postgres_fdw case; because 
use_physical_tlist would decide not to do build_physical_tlist for the 
reason mentioned above.  BUT my question here is: why do we need the 
change to build_physical_tlist?

>>> Since this uses fdw_scan_tlist so it is theoretically
>>> back-patchable back to 9.6.
>>
>> IIRC, the fdw_scan_tlist stuff was introduced in PG9.5 as part of join
>> pushdown infrastructure, so I think your patch can be back-patched to
>> PG9.5, but I don't think that's enough; IIRC, this issue was
>> introduced in PG9.3, so a solution for this should be back-patch-able
>> to PG9.3, I think.
>
> In the previous version, fdw_scan_tlist was used to hold only
> additional (junk) columns. I think that we can get rid of the
> variable by scanning the full tlist for junk columns. Apparently
> that would be a different patch for such versions. I'm not sure how
> invasive it would be for now, but I will consider it.

Sorry, I don't fully understand this.  Could you elaborate a bit more on 
this?

>>> 0001-Add-test-for-postgres_fdw-foreign-parition-update.patch
>>>
>>>    This should fail for unpatched postgres_fdw. (Just for demonstration)
>>
>> +CREATE TABLE p1 (a int, b int);
>> +CREATE TABLE c1 (LIKE p1) INHERITS (p1);
>> +CREATE TABLE c2 (LIKE p1) INHERITS (p1);
>> +CREATE FOREIGN TABLE fp1 (a int, b int)
>> + SERVER loopback OPTIONS (table_name 'p1');
>> +INSERT INTO c1 VALUES (0, 1);
>> +INSERT INTO c2 VALUES (1, 1);
>> +SELECT tableoid::int - (SELECT min(tableoid) FROM fp1)::int AS toiddiff, ctid, * FROM fp1;
>>
>> Does it make sense to evaluate toiddiff?  I think that should always
>> be 0.
>
> Right. It is checking that the values are not those of remote
> table oids.

Sorry, my explanation was not enough, but that seems to me more 
complicated than necessary.  How about just evaluating 
tableoid::regclass, instead of toiddiff?

> =======
>>> 0003-Fix-of-foreign-update-bug-of-PgFDW.patch
>>>
>>>    Fix of postgres_fdw for this problem.
>>
>> Sorry, I have not looked at it closely yet, but before that I'd like
>> to discuss the direction we go in.  I'm not convinced that your
>> approach is the right direction, so as promised, I wrote a patch using
>> the Param-based approach, and compared the two approaches.  Attached
>> is a WIP patch for that, which includes the 0003 patch.  I don't think
>> there would be any warts as discussed above in the Param-based
>> approach for now.  (That approach modifies the planner so that the
>> targetrel's tlist would contain Params as well as Vars/PHVs, so
>> actually, it breaks the planner assumption that a rel's tlist would
>> only include Vars/PHVs, but I don't find any issues on that at least
>> for now.  Will look into that in more detail.)  And I don't think
>> there would be any concern about performance regression, either.
>> Maybe I'm missing something, though.
>>
>> What do you think about that?
>
> Hmm. It is beyond my understanding. Great work (for me)!

I just implemented Tom's idea.  I hope I did that correctly.

> I confirmed that a FOREIGN_PARAM_EXEC is evaluated and stored
> into the parent node. For the mentioned Merge/Sort/ForeignScan
> case, the Sort node takes the parameter value via projection. I
> didn't know PARAM_EXEC works that way. I consulted nodeNestLoop
> but did not fully understand it.
>
> So I think it works.  I still don't think expanded tupledesc is
> not a wart but this is smarter than that.  In addition to that, it
> seems back-patchable. I must admit that yours is better.

As mentioned above, I'm a bit scared of the idea that we include columns 
not defined anywhere in the system catalogs in a rel's tlist.  For the 
reason mentioned above, I think we should avoid such a thing, IMO.

>> Note about the attached: I tried to invent a utility for
>> generate_new_param like SS_make_initplan_output_param as mentioned in
>> [1], but since the FDW API doesn't pass PlannerInfo to the FDW, I
>> think the FDW can't call the utility the same way.  Instead, I
>> modified the planner so that 1) the FDW adds Params without setting
>> PARAM_EXEC Param IDs using a new function, and then 2) the core fixes
>> the IDs.
>
> Agreed on not having PlannerInfo. I'll re-study this.  Some
> comments on this for now are as follows.

Thanks for the comments!

> It seems to reserve the name remotetableoid, which is unlikely to
> be used by users, but not never.

This has also been suggested by Tom [2].

> Maybe the paramid space of FOREIGN_PARAM does not necessarily have to
> be the same as that of ordinary params that need a signalling aid.

Yeah, but I modified the planner so that it can distinguish one from the 
other; because I think it's better to avoid unneeded SS_finalize_plan 
processing when only generating foreign Params, and/or minimize the cost 
in set_plan_references by only converting foreign Params into simple 
Vars using search_indexed_tlist_for_non_var, which are both expensive.

One thing I noticed is: in any approach, I think use_physical_tlist 
needs to be modified so that it disables doing build_physical_tlist for 
a foreign scan in the case where the FDW added resjunk columns for 
UPDATE/DELETE that are different from user/system columns of the foreign 
table; else such columns would not be emitted from the foreign scan.
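
The condition described above can be modelled outside the planner. What follows is a toy Python sketch under stated assumptions, not PostgreSQL code: `TargetEntry`, `has_fdw_only_junk_columns`, and the column lists are hypothetical names used only for illustration, and the real use_physical_tlist() tests many other conditions that this sketch ignores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TargetEntry:
    """Toy stand-in for a planner target-list entry (hypothetical)."""
    name: str
    resjunk: bool = False


def has_fdw_only_junk_columns(user_columns, system_columns, tlist):
    """Return True if tlist has a resjunk entry that is not a column of the table.

    A physical tlist is built from the table's own columns, so a junk
    column outside that set would not be emitted by the scan.
    """
    known = set(user_columns) | set(system_columns)
    return any(tle.resjunk and tle.name not in known for tle in tlist)


user_cols = ["a", "b"]
system_cols = ["ctid", "tableoid"]

# ctid is a system column of the table, so the new condition does not fire.
plain = [TargetEntry("a"), TargetEntry("ctid", resjunk=True)]

# The FDW also added a junk column that is not a column of the foreign table.
extra = plain + [TargetEntry("remotetableoid", resjunk=True)]

print(has_fdw_only_junk_columns(user_cols, system_cols, plain))
print(has_fdw_only_junk_columns(user_cols, system_cols, extra))
```

In this model, a True result is the case in which build_physical_tlist would have to be skipped for the foreign scan.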

Best regards,
Etsuro Fujita

[2] https://www.postgresql.org/message-id/8627.1526591849%40sss.pgh.pa.us


Re: Problem while updating a foreign table pointing to a partitionedtable on foreign server

From
Etsuro Fujita
Date:
(2018/09/21 20:03), Etsuro Fujita wrote:
> (2018/09/18 21:14), Kyotaro HORIGUCHI wrote:
>> At Fri, 14 Sep 2018 22:01:39 +0900, Etsuro Fujita
>> <fujita.etsuro@lab.ntt.co.jp> wrote
>> in <5B9BB133.1060107@lab.ntt.co.jp>
>>> I wrote a patch using
>>> the Param-based approach, and compared the two approaches.

>>> I don't think
>>> there would be any warts as discussed above in the Param-based
>>> approach for now. (That approach modifies the planner so that the
>>> targetrel's tlist would contain Params as well as Vars/PHVs, so
>>> actually, it breaks the planner assumption that a rel's tlist would
>>> only include Vars/PHVs, but I don't find any issues on that at least
>>> for now. Will look into that in more detail.)

I spent quite a bit of time looking into that, but I couldn't find any 
issues, including ones discussed in [1]:

* In contrib/postgres_fdw, the patch does the special handling of the 
Param representing the remote table OID in deparsing a remote SELECT 
query and building fdw_scan_tlist, but it wouldn't need the 
pull_var_clause change as proposed in [1].  And ISTM that that handling 
would be sufficient to avoid errors like 'variable not found in subplan 
target lists' as in [1].

* Params as extra target expressions can never be used as Pathkeys or 
something like that, so it seems unlikely that that approach would cause 
'could not find pathkey item to sort' errors in 
prepare_sort_from_pathkeys() as in [1].

* I checked other parts of the planner such as subselect.c and 
setrefs.c, but I couldn't find any issues.

>>> What do you think about that?

>> I confirmed that a FOREIGN_PARAM_EXEC is evaluated and stored
>> into the parent node. For the mentioned Merge/Sort/ForeignScan
>> case, Sort node takes the parameter value via projection. I
>> didn't know PARAM_EXEC works that way. I consulted nodeNestLoop
>> but not fully understood.
>>
>> So I think it works. I still don't think expanded tupledesc is
>> not a wart, but this is smarter than that. In addition to that, it
>> seems back-patchable. I must admit that yours is better.

I also think that approach would be back-patchable to PG9.3, where 
contrib/postgres_fdw landed with the writable functionality, so I'm 
inclined to vote for the Param-based approach.  Attached is an updated 
version of the patch.  Changes:

* Added this to use_physical_tlist():

> One thing I noticed is: in any approach, I think use_physical_tlist
> needs to be modified so that it disables doing build_physical_tlist for
> a foreign scan in the case where the FDW added resjunk columns for
> UPDATE/DELETE that are different from user/system columns of the foreign
> table; else such columns would not be emitted from the foreign scan.

* Fixed a bug in conversion_error_callback() in contrib/postgres_fdw.c

* Simplified your contrib/postgres_fdw.c tests as discussed

* Revised code/comments a bit

* Added docs to fdwhandler.sgml

* Rebased the patch against the latest HEAD

Best regards,
Etsuro Fujita

[1] 
https://www.postgresql.org/message-id/flat/CAKcux6ktu-8tefLWtQuuZBYFaZA83vUzuRd7c1YHC-yEWyYFpg@mail.gmail.com

Attachments

Re: Problem while updating a foreign table pointing to a partitionedtable on foreign server

From
Etsuro Fujita
Date:
(2018/10/02 21:16), Etsuro Fujita wrote:
> Attached is an updated
> version of the patch. Changes:

That patch conflicts with the recent executor changes, so I'm attaching a
rebased patch, in which I also added a fast path to 
add_params_to_result_rel and did some comment editing for consistency.

I'll add this to the next CF so that it does not get lost.

Best regards,
Etsuro Fujita

Attachments

Re: Problem while updating a foreign table pointing to a partitioned table on foreign server

From
Tom Lane
Date:
Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> writes:
> [ fix-foreign-modify-efujita-2.patch ]

Um ... wow, I do not like anything about this.  Adding a "tableoid = X"
constraint to every remote update query seems awfully expensive,
considering that (a) it's useless for non-partitioned tables, and
(b) the remote planner will have exactly no intelligence about handling
it.  We could improve (b) probably, but that'd be another big chunk of
work, and it wouldn't help when talking to older servers.

(Admittedly, I'm not sure I have a better idea.  If we knew which
remote tables were partitioned, we could avoid sending unnecessary
tableoid constraints; but we don't have any good way to track that.)
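
To make the trade-off concrete, here is a toy Python model of the remote side. It is only a sketch under stated assumptions: `remote_update` and `make_plt` are hypothetical helpers, rows are updated in place (a real UPDATE creates a new tuple version with a new ctid), and partition names stand in for table OIDs.

```python
def remote_update(partitions, new_b, ctid, tableoid=None):
    """Toy UPDATE plt SET b = new_b WHERE ctid = $1 [AND tableoid = $2].

    partitions maps a partition name to {ctid: row}. Returns rows updated.
    """
    updated = 0
    for name, rows in partitions.items():
        if tableoid is not None and name != tableoid:
            continue  # the extra qual filters out the other partitions
        if ctid in rows:
            rows[ctid]["b"] = new_b
            updated += 1
    return updated


def make_plt():
    # Each partition has its own heap, so both rows sit at ctid (0,1).
    return {
        "plt_p1": {(0, 1): {"a": 1, "b": 1}},
        "plt_p2": {(0, 1): {"a": 2, "b": 2}},
    }


by_ctid_only = remote_update(make_plt(), 10, (0, 1))
by_ctid_and_tableoid = remote_update(make_plt(), 10, (0, 1), tableoid="plt_p1")
print(by_ctid_only, by_ctid_and_tableoid)  # 2 1
```

With a single-partition dictionary the two calls return the same count, which is point (a) above: for a non-partitioned remote table the extra qual changes nothing.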

I think the proposed hacks on the planner's Param handling are a
mess as well.  You can't go and change the contents of a Param node
sometime after creating it --- that will for example break equal()
comparisons that might be done in between.  (No, I don't buy that
you know exactly what will be done in between.)  The cost of what
you've added to join tlist creation and setrefs processing seems
unduly high, too.
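
The equal() hazard can be shown with a toy Python model. This is a sketch only: the `Param` class below is a hypothetical stand-in whose field-by-field `==` plays the role of equal(), and -1 is an assumed placeholder for "ID not assigned yet".

```python
from dataclasses import dataclass


@dataclass
class Param:
    """Toy node; dataclass equality compares fields, much like equal()."""
    paramkind: str
    paramid: int  # -1 stands for "not assigned yet" in this sketch


# Step 1: a Param is added to a target list without a real ID.
tlist = [Param("PARAM_EXEC", -1)]

# Some code in between builds a matching node and looks it up by equality.
probe = Param("PARAM_EXEC", -1)
found_before = probe in tlist

# Step 2: the ID is fixed up later by mutating the node in place.
tlist[0].paramid = 3

# The earlier copy no longer compares equal to the node in the list.
found_after = probe in tlist

print(found_before, found_after)  # True False
```

Any copy or lookup key made between the two steps silently stops matching, which is why changing a node after creation is fragile.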

I wonder whether we'd be better off thinking of a way to let FDWs
invent additional system column IDs for their tables, so that
something like a remote table OID could be represented in the
natural way as a Var with negative varattno.  This'd potentially
also be a win for FDWs whose underlying storage has a row identifier,
but it's not of type "tid".  Instead of trying to shoehorn their
row ID into SelfItemPointerAttributeNumber, they could define a
new system column that has a more appropriate data type.  Admittedly
there'd be some infrastructure work to do to make this happen, maybe
a lot of it; but it's a bullet we really need to bite at some point.
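
As a rough illustration of that idea, the following Python sketch models a foreign table whose FDW registers its own system columns under negative attribute numbers. Everything here is hypothetical: `ToyForeignTable`, `define_system_column`, the column names, and the numbers -101 and -102 are invented for this sketch, and no such FDW API exists in the thread or in PostgreSQL as discussed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """Toy column reference: varattno > 0 is a user column, < 0 a system column."""
    varattno: int


class ToyForeignTable:
    """Hypothetical foreign table whose FDW defines extra system columns."""

    def __init__(self, user_columns):
        # user_columns: list of (name, type) pairs; attribute numbers start at 1
        self.user_columns = list(user_columns)
        self.system_columns = {}  # negative attno -> (name, type)

    def define_system_column(self, attno, name, typename):
        if attno >= 0:
            raise ValueError("system columns must use a negative attribute number")
        if attno in self.system_columns:
            raise ValueError("attribute number already in use")
        self.system_columns[attno] = (name, typename)

    def resolve(self, var):
        """Return the (name, type) pair that a Var refers to."""
        if var.varattno == 0:
            raise ValueError("whole-row references are not modelled")
        if var.varattno > 0:
            return self.user_columns[var.varattno - 1]
        return self.system_columns[var.varattno]


fplt = ToyForeignTable([("a", "int4"), ("b", "int4")])
# Attribute numbers below are arbitrary, chosen only for this sketch.
fplt.define_system_column(-101, "remote_tableoid", "oid")
fplt.define_system_column(-102, "remote_rowid", "text")

print(fplt.resolve(Var(2)))
print(fplt.resolve(Var(-101)))
print(fplt.resolve(Var(-102)))
```

The second registered column shows the other benefit mentioned above: a row identifier can carry a type other than "tid".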

            regards, tom lane


Re: Problem while updating a foreign table pointing to a partitionedtable on foreign server

From
Michael Paquier
Date:
On Fri, Nov 16, 2018 at 01:35:15PM -0500, Tom Lane wrote:
> I wonder whether we'd be better off thinking of a way to let FDWs
> invent additional system column IDs for their tables, so that
> something like a remote table OID could be represented in the
> natural way as a Var with negative varattno.  This'd potentially
> also be a win for FDWs whose underlying storage has a row identifier,
> but it's not of type "tid".  Instead of trying to shoehorn their
> row ID into SelfItemPointerAttributeNumber, they could define a
> new system column that has a more appropriate data type.  Admittedly
> there'd be some infrastructure work to do to make this happen, maybe
> a lot of it; but it's a bullet we really need to bite at some point.

This patch has had no input for the last couple of months.  As it is
classified as a bug fix, I have moved it to the next CF, waiting on author.
Fujita-san, are you planning to look at it?
--
Michael

Attachments

Re: Problem while updating a foreign table pointing to a partitionedtable on foreign server

From
Etsuro Fujita
Date:
(2019/02/02 10:21), Michael Paquier wrote:
> On Fri, Nov 16, 2018 at 01:35:15PM -0500, Tom Lane wrote:
>> I wonder whether we'd be better off thinking of a way to let FDWs
>> invent additional system column IDs for their tables, so that
>> something like a remote table OID could be represented in the
>> natural way as a Var with negative varattno.  This'd potentially
>> also be a win for FDWs whose underlying storage has a row identifier,
>> but it's not of type "tid".  Instead of trying to shoehorn their
>> row ID into SelfItemPointerAttributeNumber, they could define a
>> new system column that has a more appropriate data type.  Admittedly
>> there'd be some infrastructure work to do to make this happen, maybe
>> a lot of it; but it's a bullet we really need to bite at some point.
>
> This patch has had no input for the last couple of months.  As it is
> classified as a bug fix, I have moved it to the next CF, waiting on author.
> Fujita-san, are you planning to look at it?

I 100% agree with Tom, and actually, I tried to address his comments, 
but I haven't come up with a clear solution for that yet.  I really want 
to address this, but I won't have much time to work on that at least 
until after this development cycle, so what I'm thinking is to mark this 
as Returned with feedback, or if possible, to move this to the 2019-07 CF.

My apologies for the late reply.

Best regards,
Etsuro Fujita



Re: Problem while updating a foreign table pointing to a partitionedtable on foreign server

From
Michael Paquier
Date:
On Thu, Feb 07, 2019 at 09:55:18PM +0900, Etsuro Fujita wrote:
> I 100% agree with Tom, and actually, I tried to address his comments, but I
> haven't come up with a clear solution for that yet.  I really want to
> address this, but I won't have much time to work on that at least until
> after this development cycle, so what I'm thinking is to mark this as
> Returned with feedback, or if possible, to move this to the 2019-07 CF.

Simply marking it as returned with feedback does not seem appropriate to
me as we may lose track of it.  Moving it to a future CF would make
more sense in my opinion.
--
Michael

Attachments

Re: Problem while updating a foreign table pointing to a partitionedtable on foreign server

From
Etsuro Fujita
Date:
(2019/02/08 10:09), Michael Paquier wrote:
> On Thu, Feb 07, 2019 at 09:55:18PM +0900, Etsuro Fujita wrote:
>> I 100% agree with Tom, and actually, I tried to address his comments, but I
>> haven't come up with a clear solution for that yet.  I really want to
>> address this, but I won't have much time to work on that at least until
>> after this development cycle, so what I'm thinking is to mark this as
>> Returned with feedback, or if possible, to move this to the 2019-07 CF.
>
> Simply marking it as returned with feedback does not seem appropriate to
> me as we may lose track of it.  Moving it to a future CF would make
> more sense in my opinion.

OK, I have moved this to the 2019-07 CF, keeping Waiting on Author.

Best regards,
Etsuro Fujita



Re: Problem while updating a foreign table pointing to a partitionedtable on foreign server

From
Alvaro Herrera
Date:
On 2018-Nov-16, Tom Lane wrote:

> Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> writes:
> > [ fix-foreign-modify-efujita-2.patch ]
> 
> Um ... wow, I do not like anything about this.  Adding a "tableoid = X"
> constraint to every remote update query seems awfully expensive,
> considering that (a) it's useless for non-partitioned tables, and
> (b) the remote planner will have exactly no intelligence about handling
> it.  We could improve (b) probably, but that'd be another big chunk of
> work, and it wouldn't help when talking to older servers.

So do we have an updated patch for this?  It's been a while since this
patch saw any movement ...

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Problem while updating a foreign table pointing to a partitionedtable on foreign server

From
Michael Paquier
Date:
On Mon, Aug 12, 2019 at 05:32:08PM -0400, Alvaro Herrera wrote:
> So do we have an updated patch for this?  It's been a while since this
> patch saw any movement ...

Please note that this involves a couple of people in Japan, and this
week is the Obon vacation season for a lot of people.  So there could
be delays in replies.
--
Michael

Attachments

Re: Problem while updating a foreign table pointing to a partitionedtable on foreign server

From
Alvaro Herrera
Date:
On 2019-Aug-13, Michael Paquier wrote:

> On Mon, Aug 12, 2019 at 05:32:08PM -0400, Alvaro Herrera wrote:
> > So do we have an updated patch for this?  It's been a while since this
> > patch saw any movement ...
> 
> Please note that this involves a couple of people in Japan, and this
> week is the Obon vacation season for a lot of people.  So there could
> be delays in replies.

Understood, thanks for the info.  We still have two weeks to the start
of commitfest anyway.  And since it's been sleeping since November 2018,
I guess we can wait a little bit yet.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Problem while updating a foreign table pointing to a partitionedtable on foreign server

From
Etsuro Fujita
Date:
Hi Alvaro and Michael,

On Tue, Aug 13, 2019 at 11:04 PM Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> On 2019-Aug-13, Michael Paquier wrote:
> > On Mon, Aug 12, 2019 at 05:32:08PM -0400, Alvaro Herrera wrote:
> > > So do we have an updated patch for this?  It's been a while since this
> > > patch saw any movement ...

Thanks for reminding me about this, Alvaro!

> > Please note that this involves a couple of people in Japan, and this
> > week is the Obon vacation season for a lot of people.  So there could
> > be delays in replies.

Yeah, I was on that vacation.  Thanks, Michael!

> Understood, thanks for the info.  We still have two weeks to the start
> of commitfest anyway.  And since it's been sleeping since November 2018,
> I guess we can wait a little bit yet.

This is my TODO item for PG13, but I'll give priority to other things
in the next commitfest.  If anyone wants to work on it, feel free;
else I'll move this to the November commitfest when it opens.

Best regards,
Etsuro Fujita



Re: Problem while updating a foreign table pointing to a partitionedtable on foreign server

From
Etsuro Fujita
Date:
On Wed, Aug 14, 2019 at 11:51 AM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
> This is my TODO item for PG13, but I'll give priority to other things
> in the next commitfest.  If anyone wants to work on it, feel free;
> else I'll move this to the November commitfest when it opens.

Moved.

Best regards,
Etsuro Fujita



Re: Problem while updating a foreign table pointing to a partitionedtable on foreign server

From
Michael Paquier
Date:
On Tue, Sep 03, 2019 at 12:37:52AM +0900, Etsuro Fujita wrote:
> On Wed, Aug 14, 2019 at 11:51 AM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
>> This is my TODO item for PG13, but I'll give priority to other things
>> in the next commitfest.  If anyone wants to work on it, feel free;
>> else I'll move this to the November commitfest when it opens.
>
> Moved.

This has been waiting on author for two commitfests now, and the
situation has not changed.  Fujita-san, Horiguchi-san, do you have an
update to provide?  There is no point in keeping the current stale
situation for more CFs.
--
Michael

Attachments

Re: Problem while updating a foreign table pointing to a partitionedtable on foreign server

From
Etsuro Fujita
Date:
Hi Michael-san,

On Mon, Nov 25, 2019 at 4:13 PM Michael Paquier <michael@paquier.xyz> wrote:
> On Tue, Sep 03, 2019 at 12:37:52AM +0900, Etsuro Fujita wrote:
> > On Wed, Aug 14, 2019 at 11:51 AM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
> >> This is my TODO item for PG13, but I'll give priority to other things
> >> in the next commitfest.  If anyone wants to work on it, feel free;
> >> else I'll move this to the November commitfest when it opens.
> >
> > Moved.
>
> This has been waiting on author for two commitfests now, and the
> situation has not changed.  Fujita-san, Horiguchi-san, do you have an
> update to provide?  There is no point in keeping the current stale
> situation for more CFs.

I was planning to work on this in this commitfest, but sorry, I didn't
have time due to other priorities.  Probably, I won't have time for
this in the development cycle for v13.  So I'll mark this as RWF,
unless anyone wants to work on it.

Best regards,
Etsuro Fujita



Re: Problem while updating a foreign table pointing to a partitionedtable on foreign server

From
Etsuro Fujita
Date:
On Tue, Nov 26, 2019 at 12:37 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
> I was planning to work on this in this commitfest, but sorry, I didn't
> have time due to other priorities.  Probably, I won't have time for
> this in the development cycle for v13.  So I'll mark this as RWF,
> unless anyone wants to work on it.

Done.

Best regards,
Etsuro Fujita