Обсуждение: Odd system-column handling in postgres_fdw join pushdown patch

Поиск
Список
Период
Сортировка

Odd system-column handling in postgres_fdw join pushdown patch

От
Etsuro Fujita
Дата:
Hi,

I noticed that the postgres_fdw join pushdown patch retrieves system
columns other than ctid (and oid) from the remote server as shown in the
example:

postgres=# explain verbose select foo.tableoid, foo.xmin, foo.cmin,
foo.xmax, foo.cmax, foo.* from foo, bar where foo.a = bar.a;

   QUERY PLAN


--------------------------------------------------------------------------------------------------------------------------------------------------------
--------
 Foreign Scan  (cost=100.00..102.09 rows=2 width=28)
   Output: foo.tableoid, foo.xmin, foo.cmin, foo.xmax, foo.cmax, foo.a,
foo.b
   Relations: (public.foo) INNER JOIN (public.bar)
   Remote SQL: SELECT r1.tableoid, r1.xmin, r1.cmin, r1.xmax, r1.cmax,
r1.a, r1.b FROM (public.foo r1 INNER JOIN public.bar r2 ON (TRUE)) WHERE
((r1.a =
 r2.a))
(4 rows)

BUT: we don't make any effort to ensure that local and remote values
match, so system columns other than ctid and oid should not be retrieved
from the remote server.  So, I'd like to propose: (1) when tableoids are
requested from the remote server, postgres_fdw sets valid values for
them locally, instead (core should support that?) and (2) when any of
xmins, xmaxs, cmins, and cmaxs are requested, postgres_fdw gives up
pushing down foreign joins.  (We might be able to set appropriate values
for them locally the same way as for tableoids, but I'm not sure it's
worth complicating the code.)  I think that would be probably OK,
because users wouldn't retrieve any such columns in practice.

Attached is a proposed patch for that.

Best regards,
Etsuro Fujita

Вложения

Re: Odd system-column handling in postgres_fdw join pushdown patch

От
Ashutosh Bapat
Дата:


On Thu, Mar 17, 2016 at 4:30 PM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote:
Hi,

I noticed that the postgres_fdw join pushdown patch retrieves system
columns other than ctid (and oid) from the remote server as shown in the
example:

postgres=# explain verbose select foo.tableoid, foo.xmin, foo.cmin,
foo.xmax, foo.cmax, foo.* from foo, bar where foo.a = bar.a;

   QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------
--------
 Foreign Scan  (cost=100.00..102.09 rows=2 width=28)
   Output: foo.tableoid, foo.xmin, foo.cmin, foo.xmax, foo.cmax, foo.a,
foo.b
   Relations: (public.foo) INNER JOIN (public.bar)
   Remote SQL: SELECT r1.tableoid, r1.xmin, r1.cmin, r1.xmax, r1.cmax,
r1.a, r1.b FROM (public.foo r1 INNER JOIN public.bar r2 ON (TRUE)) WHERE
((r1.a =
 r2.a))
(4 rows)

Thanks for catching the bug and producing a patch.
 

BUT: we don't make any effort to ensure that local and remote values
match, so system columns other than ctid and oid should not be retrieved
from the remote server.  So, I'd like to propose: (1) when tableoids are
requested from the remote server, postgres_fdw sets valid values for
them locally, instead (core should support that?) and

If we are disabling join pushdown when the targetlist has other system columns, shouldn't we treat tableoid in the same fashion. We should disable join pushdown when tableoid is requested?

I agree that we might want to do this in core instead of FDW specific core. That way we avoid each FDW implementing its own solution. Ultimately, all that needs to be done to push OID of the foreign table in place of tableoid column. The core code can do that. It already does that for the base tables.
 
(2) when any of
xmins, xmaxs, cmins, and cmaxs are requested, postgres_fdw gives up
pushing down foreign joins.  (We might be able to set appropriate values
for them locally the same way as for tableoids, but I'm not sure it's
worth complicating the code.)  I think that would be probably OK,
because users wouldn't retrieve any such columns in practice.

In that patch you have set pushdown_safe to false for the base relation fetching system columns. But pushdown_safe = false means that that relation is not safe to push down. A base foreign relation is always safe to push down, so should have pushdown_safe = true always. Instead, I suggest having a separate boolean has_unshippable_syscols (or something with similar name) in PgFdwRelationInfo, which is set to true in such case. In foreign_join_ok, we return false (thus not pushing down the join), if any of the joining relation has that attribute set. By default this member is false.

Even for a base table those values are rather random, although they are not fetched from the foreign server. Instead of not pushing the join down, we should push the join down without fetching those attributes. While constructing the query, don't include these system attributes in SELECT clause and don't include corresponding positions in retrieved_attributes list. That means those attributes won't be set while fetching the row from the foreign server and will have garbage values in corresponding places. I guess that would work.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: Odd system-column handling in postgres_fdw join pushdown patch

От
Etsuro Fujita
Дата:
On 2016/03/17 22:15, Ashutosh Bapat wrote:
> On Thu, Mar 17, 2016 at 4:30 PM, Etsuro Fujita
> <fujita.etsuro@lab.ntt.co.jp <mailto:fujita.etsuro@lab.ntt.co.jp>> wrote:

>     BUT: we don't make any effort to ensure that local and remote values
>     match, so system columns other than ctid and oid should not be retrieved
>     from the remote server.  So, I'd like to propose: (1) when tableoids are
>     requested from the remote server, postgres_fdw sets valid values for
>     them locally, instead (core should support that?) and

> If we are disabling join pushdown when the targetlist has other system
> columns, shouldn't we treat tableoid in the same fashion. We should
> disable join pushdown when tableoid is requested?

That seems a bit too restrictive to me.

> I agree that we might want to do this in core instead of FDW specific
> core. That way we avoid each FDW implementing its own solution.
> Ultimately, all that needs to be done to push OID of the foreign table
> in place of tableoid column. The core code can do that. It already does
> that for the base tables.

OK, I'll try to modify the patch so that the core does that work.

>     (2) when any of
>     xmins, xmaxs, cmins, and cmaxs are requested, postgres_fdw gives up
>     pushing down foreign joins.  (We might be able to set appropriate values
>     for them locally the same way as for tableoids, but I'm not sure it's
>     worth complicating the code.)  I think that would be probably OK,
>     because users wouldn't retrieve any such columns in practice.

> In that patch you have set pushdown_safe to false for the base relation
> fetching system columns. But pushdown_safe = false means that that
> relation is not safe to push down. A base foreign relation is always
> safe to push down, so should have pushdown_safe = true always. Instead,
> I suggest having a separate boolean has_unshippable_syscols (or
> something with similar name) in PgFdwRelationInfo, which is set to true
> in such case. In foreign_join_ok, we return false (thus not pushing down
> the join), if any of the joining relation has that attribute set. By
> default this member is false.

Maybe I'm missing something, but why do you consider that base foreign 
tables need always be safe to push down?  IIUC, the pushdown_safe flag 
is used only for foreign_join_ok, so I think that a base foreign table 
needs not necessarily be safe to push down.

> Even for a base table those values are rather random, although they are
> not fetched from the foreign server. Instead of not pushing the join
> down, we should push the join down without fetching those attributes.
> While constructing the query, don't include these system attributes in
> SELECT clause and don't include corresponding positions in
> retrieved_attributes list. That means those attributes won't be set
> while fetching the row from the foreign server and will have garbage
> values in corresponding places. I guess that would work.

That might be an idea, but as I said above, users wouldn't specify any 
system columns other than tableoid and ctid (and oid) in their queries, 
in practice, so I'm not sure it's worth complicating the code.

Best regards,
Etsuro Fujita





Re: Odd system-column handling in postgres_fdw join pushdown patch

От
Ashutosh Bapat
Дата:



    (2) when any of
    xmins, xmaxs, cmins, and cmaxs are requested, postgres_fdw gives up
    pushing down foreign joins.  (We might be able to set appropriate values
    for them locally the same way as for tableoids, but I'm not sure it's
    worth complicating the code.)  I think that would be probably OK,
    because users wouldn't retrieve any such columns in practice.

In that patch you have set pushdown_safe to false for the base relation
fetching system columns. But pushdown_safe = false means that that
relation is not safe to push down. A base foreign relation is always
safe to push down, so should have pushdown_safe = true always. Instead,
I suggest having a separate boolean has_unshippable_syscols (or
something with similar name) in PgFdwRelationInfo, which is set to true
in such case. In foreign_join_ok, we return false (thus not pushing down
the join), if any of the joining relation has that attribute set. By
default this member is false.

Maybe I'm missing something, but why do you consider that base foreign tables need always be safe to push down?  IIUC, the pushdown_safe flag is used only for foreign_join_ok, so I think that a base foreign table needs not necessarily be safe to push down.


A base foreign table "always" fetches data from the foreign server, so it "has to be" always safe to push down. pushdown_safe flag is designated to tell whether the relation corresponding to PgFdwRelationInfo where this flag is set is safe to push down.Right now it's only used for joins but in future it would be used for any push down of higher operations. It seems very much possible after the upper pathification changes. We can not have a query sent to the foreign server for a relation, when pushdown_safe is false for that. Your patch does that for foreign base relation which try to fetch system columns.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: Odd system-column handling in postgres_fdw join pushdown patch

От
Robert Haas
Дата:
On Thu, Mar 17, 2016 at 7:00 AM, Etsuro Fujita
<fujita.etsuro@lab.ntt.co.jp> wrote:
> BUT: we don't make any effort to ensure that local and remote values
> match, so system columns other than ctid and oid should not be retrieved
> from the remote server.

I agree.

> So, I'd like to propose: (1) when tableoids are
> requested from the remote server, postgres_fdw sets valid values for
> them locally, instead (core should support that?)

Sure.

> and (2) when any of
> xmins, xmaxs, cmins, and cmaxs are requested, postgres_fdw gives up
> pushing down foreign joins.  (We might be able to set appropriate values
> for them locally the same way as for tableoids, but I'm not sure it's
> worth complicating the code.)  I think that would be probably OK,
> because users wouldn't retrieve any such columns in practice.

Now that seems like the wrong reaction.  I mean, aren't these just
going to be 0 or something?  Refusing to push the join down seems
strange.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Odd system-column handling in postgres_fdw join pushdown patch

От
Etsuro Fujita
Дата:
On 2016/03/19 4:51, Robert Haas wrote:
> On Thu, Mar 17, 2016 at 7:00 AM, Etsuro Fujita
> <fujita.etsuro@lab.ntt.co.jp> wrote:
>> So, I'd like to propose: (1) when tableoids are
>> requested from the remote server, postgres_fdw sets valid values for
>> them locally, instead (core should support that?)

> Sure.

>> and (2) when any of
>> xmins, xmaxs, cmins, and cmaxs are requested, postgres_fdw gives up
>> pushing down foreign joins.  (We might be able to set appropriate values
>> for them locally the same way as for tableoids, but I'm not sure it's
>> worth complicating the code.)  I think that would be probably OK,
>> because users wouldn't retrieve any such columns in practice.

> Now that seems like the wrong reaction.  I mean, aren't these just
> going to be 0 or something?  Refusing to push the join down seems
> strange.

OK, I'll modify the patch so that the join is pushed down even if any of 
xmins, xmaxs, cmins, and cmaxs are requested.  Do you think which one 
should set values for these as well as tableoids, postgres_fdw or core?

Best regards,
Etsuro Fujita





Re: Odd system-column handling in postgres_fdw join pushdown patch

От
Ashutosh Bapat
Дата:


On Tue, Mar 22, 2016 at 8:03 AM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote:
On 2016/03/19 4:51, Robert Haas wrote:
On Thu, Mar 17, 2016 at 7:00 AM, Etsuro Fujita
<fujita.etsuro@lab.ntt.co.jp> wrote:
So, I'd like to propose: (1) when tableoids are
requested from the remote server, postgres_fdw sets valid values for
them locally, instead (core should support that?)

Sure.

and (2) when any of
xmins, xmaxs, cmins, and cmaxs are requested, postgres_fdw gives up
pushing down foreign joins.  (We might be able to set appropriate values
for them locally the same way as for tableoids, but I'm not sure it's
worth complicating the code.)  I think that would be probably OK,
because users wouldn't retrieve any such columns in practice.

Now that seems like the wrong reaction.  I mean, aren't these just
going to be 0 or something?  Refusing to push the join down seems
strange.

OK, I'll modify the patch so that the join is pushed down even if any of xmins, xmaxs, cmins, and cmaxs are requested.  Do you think which one should set values for these as well as tableoids, postgres_fdw or core?
 
Earlier in this mail chain, I suggested that the core should take care of storing the values for these columns. But instead, I think, core should provide functions which can be used by FDWs, if they want, to return values for those columns. Something like Datum get_syscol_value(RelOptInfo/Relation, attno). The function will return Datum 0 for most of the columns and table's OID for tableoid. My 0.02.
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: Odd system-column handling in postgres_fdw join pushdown patch

От
Etsuro Fujita
Дата:
On 2016/03/22 14:54, Ashutosh Bapat wrote:
> On Tue, Mar 22, 2016 at 8:03 AM, Etsuro Fujita
> <fujita.etsuro@lab.ntt.co.jp <mailto:fujita.etsuro@lab.ntt.co.jp>> wrote:
>     OK, I'll modify the patch so that the join is pushed down even if
>     any of xmins, xmaxs, cmins, and cmaxs are requested.  Do you think
>     which one should set values for these as well as tableoids,
>     postgres_fdw or core?

> Earlier in this mail chain, I suggested that the core should take care
> of storing the values for these columns. But instead, I think, core
> should provide functions which can be used by FDWs, if they want, to
> return values for those columns. Something like Datum
> get_syscol_value(RelOptInfo/Relation, attno). The function will return
> Datum 0 for most of the columns and table's OID for tableoid. My 0.02.

What I had in mind was (1) create_foreignscan_plan would create Lists 
from the ForeignScan's fdw_scan_tlist: (a) indexes/OID values of 
tableoids in fdw_scan_tlist, and (b) indexes of xids and cids in 
fdw_scan_tlist, and then (2) ForeignNext would set the OID values for 
the tableoid columns in the scan tuple, using the Lists (a), and 
appropriate values (0 or something) for the xid and cid columns in the 
scan tuple, using the List (b).

Best regards,
Etsuro Fujita





Re: Odd system-column handling in postgres_fdw join pushdown patch

От
Ashutosh Bapat
Дата:


On Tue, Mar 22, 2016 at 5:05 PM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote:
On 2016/03/22 14:54, Ashutosh Bapat wrote:
On Tue, Mar 22, 2016 at 8:03 AM, Etsuro Fujita
<fujita.etsuro@lab.ntt.co.jp <mailto:fujita.etsuro@lab.ntt.co.jp>> wrote:
    OK, I'll modify the patch so that the join is pushed down even if
    any of xmins, xmaxs, cmins, and cmaxs are requested.  Do you think
    which one should set values for these as well as tableoids,
    postgres_fdw or core?

Earlier in this mail chain, I suggested that the core should take care
of storing the values for these columns. But instead, I think, core
should provide functions which can be used by FDWs, if they want, to
return values for those columns. Something like Datum
get_syscol_value(RelOptInfo/Relation, attno). The function will return
Datum 0 for most of the columns and table's OID for tableoid. My 0.02.

What I had in mind was (1) create_foreignscan_plan would create Lists from the ForeignScan's fdw_scan_tlist: (a) indexes/OID values of tableoids in fdw_scan_tlist, and (b) indexes of xids and cids in fdw_scan_tlist, and then (2) ForeignNext would set the OID values for the tableoid columns in the scan tuple, using the Lists (a), and appropriate values (0 or something) for the xid and cid columns in the scan tuple, using the List (b).

Looks Ok to me, although, that way an FDW looses an ability to use its own values for those columns, in case it wants to. For example, while using postgres_fdw for sharding, it might help saving xmax, xmin, cmax, cmin from the foreign server and use them while communicating with the foreign server.
 

Best regards,
Etsuro Fujita





--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: Odd system-column handling in postgres_fdw join pushdown patch

От
Etsuro Fujita
Дата:
On 2016/03/22 21:10, Ashutosh Bapat wrote:
> On Tue, Mar 22, 2016 at 5:05 PM, Etsuro Fujita
> <fujita.etsuro@lab.ntt.co.jp <mailto:fujita.etsuro@lab.ntt.co.jp>> wrote:
>     On 2016/03/22 14:54, Ashutosh Bapat wrote:
>         Earlier in this mail chain, I suggested that the core should
>         take care
>         of storing the values for these columns. But instead, I think, core
>         should provide functions which can be used by FDWs, if they want, to
>         return values for those columns. Something like Datum
>         get_syscol_value(RelOptInfo/Relation, attno). The function will
>         return
>         Datum 0 for most of the columns and table's OID for tableoid. My
>         0.02.

>     What I had in mind was (1) create_foreignscan_plan would create
>     Lists from the ForeignScan's fdw_scan_tlist: (a) indexes/OID values
>     of tableoids in fdw_scan_tlist, and (b) indexes of xids and cids in
>     fdw_scan_tlist, and then (2) ForeignNext would set the OID values
>     for the tableoid columns in the scan tuple, using the Lists (a), and
>     appropriate values (0 or something) for the xid and cid columns in
>     the scan tuple, using the List (b).

> Looks Ok to me, although, that way an FDW looses an ability to use its
> own values for those columns, in case it wants to. For example, while
> using postgres_fdw for sharding, it might help saving xmax, xmin, cmax,
> cmin from the foreign server and use them while communicating with the
> foreign server.

Yeah, it might be the case.

On second thoughts, I changed my mind; I think it'd be better for the 
FDW's to set values for tableoids, xids, and cids in the scan tuple. 
The reason other than your suggestion is because expressions in 
fdw_scan_tlist that contain such columns are not necessarily simple Vars 
and because such expressions might be evaluated more efficiently by the 
FDW than core.  We assume in postgres_fdw that expressions in 
fdw_scan_tlist are always simple Vars, though.

I'm not sure it's worth providing functions you suggested, because we 
can't assume that columns in the scan tuple are always simple Var 
columns, as I said above.

Best regards,
Etsuro Fujita





Re: Odd system-column handling in postgres_fdw join pushdown patch

От
Ashutosh Bapat
Дата:


On Wed, Mar 23, 2016 at 8:20 AM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote:
On 2016/03/22 21:10, Ashutosh Bapat wrote:
On Tue, Mar 22, 2016 at 5:05 PM, Etsuro Fujita
<fujita.etsuro@lab.ntt.co.jp <mailto:fujita.etsuro@lab.ntt.co.jp>> wrote:
    On 2016/03/22 14:54, Ashutosh Bapat wrote:
        Earlier in this mail chain, I suggested that the core should
        take care
        of storing the values for these columns. But instead, I think, core
        should provide functions which can be used by FDWs, if they want, to
        return values for those columns. Something like Datum
        get_syscol_value(RelOptInfo/Relation, attno). The function will
        return
        Datum 0 for most of the columns and table's OID for tableoid. My
        0.02.

    What I had in mind was (1) create_foreignscan_plan would create
    Lists from the ForeignScan's fdw_scan_tlist: (a) indexes/OID values
    of tableoids in fdw_scan_tlist, and (b) indexes of xids and cids in
    fdw_scan_tlist, and then (2) ForeignNext would set the OID values
    for the tableoid columns in the scan tuple, using the Lists (a), and
    appropriate values (0 or something) for the xid and cid columns in
    the scan tuple, using the List (b).

Looks Ok to me, although, that way an FDW looses an ability to use its
own values for those columns, in case it wants to. For example, while
using postgres_fdw for sharding, it might help saving xmax, xmin, cmax,
cmin from the foreign server and use them while communicating with the
foreign server.

Yeah, it might be the case.

On second thoughts, I changed my mind; I think it'd be better for the FDW's to set values for tableoids, xids, and cids in the scan tuple. The reason other than your suggestion is because expressions in fdw_scan_tlist that contain such columns are not necessarily simple Vars and because such expressions might be evaluated more efficiently by the FDW than core.  We assume in postgres_fdw that expressions in fdw_scan_tlist are always simple Vars, though.

I'm not sure it's worth providing functions you suggested, because we can't assume that columns in the scan tuple are always simple Var columns, as I said above.

 
An FDW can choose not to use those functions, so I don't see a connection between scan list having simple Vars and existence of those functions (actually a single one). But having those function would minimize the code that each FDW has to write, in case they want those functions. E.g. we have to translate Var::varno to tableoid in case that's requested by pulling RTE and then getting oid out from there. If that functionality is available in the core, 1. the code is not duplicated 2. every FDW will get the same tableoid. Similarly for the other columns.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: Odd system-column handling in postgres_fdw join pushdown patch

От
Etsuro Fujita
Дата:
On 2016/03/23 13:44, Ashutosh Bapat wrote:
> An FDW can choose not to use those functions, so I don't see a
> connection between scan list having simple Vars and existence of those
> functions (actually a single one). But having those function would
> minimize the code that each FDW has to write, in case they want those
> functions. E.g. we have to translate Var::varno to tableoid in case
> that's requested by pulling RTE and then getting oid out from there. If
> that functionality is available in the core, 1. the code is not
> duplicated 2. every FDW will get the same tableoid. Similarly for the
> other columns.

OK.  Then, I'd like to propose a function that would create interger 
Lists of indexes of tableoids, xids and cids plus an OID List of these 
tableoids, in a given fdw_scan_tlist, on the assumption that each 
expression in the fdw_scan_tlist is a simple Var.  I'd also like to 
propose another function that would fill system columns using these 
Lists when creating a scan tuple.

Best regards,
Etsuro Fujita





Re: Odd system-column handling in postgres_fdw join pushdown patch

От
Ashutosh Bapat
Дата:


On Thu, Mar 24, 2016 at 9:31 AM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote:
On 2016/03/23 13:44, Ashutosh Bapat wrote:
An FDW can choose not to use those functions, so I don't see a
connection between scan list having simple Vars and existence of those
functions (actually a single one). But having those function would
minimize the code that each FDW has to write, in case they want those
functions. E.g. we have to translate Var::varno to tableoid in case
that's requested by pulling RTE and then getting oid out from there. If
that functionality is available in the core, 1. the code is not
duplicated 2. every FDW will get the same tableoid. Similarly for the
other columns.

OK.  Then, I'd like to propose a function that would create interger Lists of indexes of tableoids, xids and cids plus

Ok,
 
an OID List of these tableoids,

I didn't get this.
 
in a given fdw_scan_tlist, on the assumption that each expression in the fdw_scan_tlist is a simple Var. 

I guess this is Ok. In fact, at least for now an expression involving any of those columns is not pushable to the foreign server, as the expression can not be evaluated there. So, if we come across such a case in further pushdowns, we will need to have a different solution for pushing down such target lists.
 
I'd also like to propose another function that would fill system columns using these Lists when creating a scan tuple.

Ok.

I had imagined that the code to extract the above lists and filling the values in scan tuple will be in FDW. We only provide a function to supply those values. But what you propose might actually be much practical.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: Odd system-column handling in postgres_fdw join pushdown patch

От
Ashutosh Bapat
Дата:
A much simpler solution, that will work with postgres_fdw, might be to just deparse these columns with whatever random values (except for tableoid) they are expected to have in those places. Often these values can simply be NULL or 0. For tableoid deparse it to 'oid value'::oid. Thus for a user query

select t1.taleoid, t2.xmax, t1.c1, t2.c2 from t1 join t2 on (...) ... -- where t1 and t2 are foreign tables with same names on the foreign server.

the query sent to the foreign server would look like

select '15623'::oid, NULL, t1.c1, t2.c2 from t1 join t2 on (...) ... -- where '15623' is oid of t1 on local server.

This does spend more bandwidth than necessary and affect performance, here is why the approach might be better,
1. It's not very common to request these system columns in a "join" query involving foreign tables. Usually they will have user columns or ctid (DMLs) but very rarely other system columns.

2. This allows expressions involving these system columns to be pushed down, whenever we will start pushing them down in the targetlist.

3. The changes to the code are rather small. deparseColumnRef() will need to produce the strings above instead of actual column names.

4. The approach will work with slight change, if and when, we need the actual system column values from the foreign server. That time the above function needs to deparse the column names instead of constant values.

Having to hardcode tableoid at the time of planning should be fine since change in tableoid between planning and execution will trigger plan cache invalidation. I haven't tried this though.

Sorry for bringing this solution late to the table.

On Thu, Mar 24, 2016 at 3:04 PM, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote:


On Thu, Mar 24, 2016 at 9:31 AM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote:
On 2016/03/23 13:44, Ashutosh Bapat wrote:
An FDW can choose not to use those functions, so I don't see a
connection between scan list having simple Vars and existence of those
functions (actually a single one). But having those function would
minimize the code that each FDW has to write, in case they want those
functions. E.g. we have to translate Var::varno to tableoid in case
that's requested by pulling RTE and then getting oid out from there. If
that functionality is available in the core, 1. the code is not
duplicated 2. every FDW will get the same tableoid. Similarly for the
other columns.

OK.  Then, I'd like to propose a function that would create interger Lists of indexes of tableoids, xids and cids plus

Ok,
 
an OID List of these tableoids,

I didn't get this.
 
in a given fdw_scan_tlist, on the assumption that each expression in the fdw_scan_tlist is a simple Var. 

I guess this is Ok. In fact, at least for now an expression involving any of those columns is not pushable to the foreign server, as the expression can not be evaluated there. So, if we come across such a case in further pushdowns, we will need to have a different solution for pushing down such target lists.
 
I'd also like to propose another function that would fill system columns using these Lists when creating a scan tuple.

Ok.

I had imagined that the code to extract the above lists and filling the values in scan tuple will be in FDW. We only provide a function to supply those values. But what you propose might actually be much practical.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company



--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: Odd system-column handling in postgres_fdw join pushdown patch

От
Etsuro Fujita
Дата:
On 2016/03/25 13:37, Ashutosh Bapat wrote:
> A much simpler solution, that will work with postgres_fdw, might be to
> just deparse these columns with whatever random values (except for
> tableoid) they are expected to have in those places. Often these values
> can simply be NULL or 0. For tableoid deparse it to 'oid value'::oid.
> Thus for a user query
>
> select t1.taleoid, t2.xmax, t1.c1, t2.c2 from t1 join t2 on (...) ... --
> where t1 and t2 are foreign tables with same names on the foreign server.
>
> the query sent to the foreign server would look like
>
> select '15623'::oid, NULL, t1.c1, t2.c2 from t1 join t2 on (...) ... --
> where '15623' is oid of t1 on local server.
>
> This does spend more bandwidth than necessary and affect performance,
> here is why the approach might be better,
> 1. It's not very common to request these system columns in a "join"
> query involving foreign tables. Usually they will have user columns or
> ctid (DMLs) but very rarely other system columns.

That may be true for now, but once we implement pair-wise join for two 
distributedly-partitioned tables in which we can push down pair-wise 
foreign joins, tableoid would be used in many cases, to identify child 
tables for rows to come from.

> 2. This allows expressions involving these system columns to be pushed
> down, whenever we will start pushing them down in the targetlist.
>
> 3. The changes to the code are rather small. deparseColumnRef() will
> need to produce the strings above instead of actual column names.
>
> 4. The approach will work with slight change, if and when, we need the
> actual system column values from the foreign server. That time the above
> function needs to deparse the column names instead of constant values.

As you pointed out, spending more bandwidth than necessary seems a bit 
inefficient.

The approach that we discussed would minimize the code for the FDW 
author to write, by providing the support functions you proposed.  I'll 
post a patch for that early next week.  (It would also minimize the 
patch to push down UPDATE/DELETE on a foreign join, proposed in [1], 
which has the same issue as for handling system columns in a RETURNING 
clause in such pushed-down UPDATE/DELETE.  So I'd like to propose that 
approach as a common functionality.)

> Sorry for bringing this solution late to the table.

No problem.

Best regards,
Etsuro Fujita

[1] http://www.postgresql.org/message-id/56D57C4A.9000500@lab.ntt.co.jp





Re: Odd system-column handling in postgres_fdw join pushdown patch

От
Etsuro Fujita
Дата:
On 2016/03/25 17:16, Etsuro Fujita wrote:
> The approach that we discussed would minimize the code for the FDW
> author to write, by providing the support functions you proposed.  I'll
> post a patch for that early next week.

I added two helper functions: GetFdwScanTupleExtraData and
FillFdwScanTupleSysAttrs.  The FDW author could use the former to get
info about system attributes other than ctids and oids in fdw_scan_tlist
during BeginForeignScan, and the latter to set values for these system
attributes during IterateForeignScan (InvalidTransactionId for
xmins/xmaxs, InvalidCommandId for cmins/cmaxs, and valid values for
tableoids).  Attached is a proposed patch for that.  I also slightly
simplified the changes to make_tuple_from_result_row and
conversion_error_callback made by the postgres_fdw join pushdown patch.
  What do you think about that?

Best regards,
Etsuro Fujita

Вложения

Re: Odd system-column handling in postgres_fdw join pushdown patch

От
Etsuro Fujita
Дата:
On 2016/03/29 15:37, Etsuro Fujita wrote:
> I added two helper functions: GetFdwScanTupleExtraData and
> FillFdwScanTupleSysAttrs.  The FDW author could use the former to get
> info about system attributes other than ctids and oids in fdw_scan_tlist
> during BeginForeignScan, and the latter to set values for these system
> attributes during IterateForeignScan (InvalidTransactionId for
> xmins/xmaxs, InvalidCommandId for cmins/cmaxs, and valid values for
> tableoids).  Attached is a proposed patch for that.  I also slightly
> simplified the changes to make_tuple_from_result_row and
> conversion_error_callback made by the postgres_fdw join pushdown patch.
>   What do you think about that?

I revised comments a little bit.  Attached is an updated version of the
patch.  I think this issue should be fixed in advance of the PostgreSQL
9.6beta1 release.

Best regards,
Etsuro Fujita

Вложения

Re: Odd system-column handling in postgres_fdw join pushdown patch

От
Ashutosh Bapat
Дата:
With this patch, all instances of tableoid, cmin, cmax etc. will get a non-NULL value irrespective of whether they appear on nullable side of the join or not.

e.g. select t1.c1, t1.tableoid, t2.c1, t2.tableoid from ft4 t1 left join ft5 t2 on (t1.c1 = t2.c1); run in contrib_regression gives output
 c1  | tableoid | c1 | tableoid
-----+----------+----+----------
   2 |    54282 |    |    54285
   4 |    54282 |    |    54285
   6 |    54282 |  6 |    54285
   8 |    54282 |    |    54285
  10 |    54282 |    |    54285
  12 |    54282 | 12 |    54285

but the same query run on local tables select t1.c1, t1.tableoid, t2.c1, t2.tableoid from "S 1"."T 3" t1 left join "S 1"."T 4" t2 on (t1.c1 = t2.c1); gives output
 c1  | tableoid | c1 | tableoid
-----+----------+----+----------
   2 |    54258 |    |        
   4 |    54258 |    |        
   6 |    54258 |  6 |    54266
   8 |    54258 |    |        
  10 |    54258 |    |        
  12 |    54258 | 12 |    54266

BTW, why do we want to set the column values with invalid values, and not null? Wouldn't setting them NULL will be a better way?

On Tue, Apr 5, 2016 at 12:11 PM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote:
On 2016/03/29 15:37, Etsuro Fujita wrote:
I added two helper functions: GetFdwScanTupleExtraData and
FillFdwScanTupleSysAttrs.  The FDW author could use the former to get
info about system attributes other than ctids and oids in fdw_scan_tlist
during BeginForeignScan, and the latter to set values for these system
attributes during IterateForeignScan (InvalidTransactionId for
xmins/xmaxs, InvalidCommandId for cmins/cmaxs, and valid values for
tableoids).  Attached is a proposed patch for that.  I also slightly
simplified the changes to make_tuple_from_result_row and
conversion_error_callback made by the postgres_fdw join pushdown patch.
  What do you think about that?

I revised comments a little bit.  Attached is an updated version of the patch.  I think this issue should be fixed in advance of the PostgreSQL 9.6beta1 release.

Best regards,
Etsuro Fujita



--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: Odd system-column handling in postgres_fdw join pushdown patch

От
Noah Misch
Дата:
On Tue, Apr 05, 2016 at 03:41:00PM +0900, Etsuro Fujita wrote:
> On 2016/03/29 15:37, Etsuro Fujita wrote:
> >I added two helper functions: GetFdwScanTupleExtraData and
> >FillFdwScanTupleSysAttrs.  The FDW author could use the former to get
> >info about system attributes other than ctids and oids in fdw_scan_tlist
> >during BeginForeignScan, and the latter to set values for these system
> >attributes during IterateForeignScan (InvalidTransactionId for
> >xmins/xmaxs, InvalidCommandId for cmins/cmaxs, and valid values for
> >tableoids).  Attached is a proposed patch for that.  I also slightly
> >simplified the changes to make_tuple_from_result_row and
> >conversion_error_callback made by the postgres_fdw join pushdown patch.
> >  What do you think about that?
> 
> I revised comments a little bit.  Attached is an updated version of the
> patch.  I think this issue should be fixed in advance of the PostgreSQL
> 9.6beta1 release.

Of the foreign table columns affected here, I bet only tableoid sees
non-negligible use.  Even tableoid is not very prominent, so I would not have
thought of this as a beta1 blocker.  What about this bug makes pre-beta1
resolution especially valuable?



Re: Odd system-column handling in postgres_fdw join pushdown patch

От
Noah Misch
Дата:
On Wed, Apr 06, 2016 at 01:14:34AM -0400, Noah Misch wrote:
> On Tue, Apr 05, 2016 at 03:41:00PM +0900, Etsuro Fujita wrote:
> > On 2016/03/29 15:37, Etsuro Fujita wrote:
> > >I added two helper functions: GetFdwScanTupleExtraData and
> > >FillFdwScanTupleSysAttrs.  The FDW author could use the former to get
> > >info about system attributes other than ctids and oids in fdw_scan_tlist
> > >during BeginForeignScan, and the latter to set values for these system
> > >attributes during IterateForeignScan (InvalidTransactionId for
> > >xmins/xmaxs, InvalidCommandId for cmins/cmaxs, and valid values for
> > >tableoids).  Attached is a proposed patch for that.  I also slightly
> > >simplified the changes to make_tuple_from_result_row and
> > >conversion_error_callback made by the postgres_fdw join pushdown patch.
> > >  What do you think about that?
> > 
> > I revised comments a little bit.  Attached is an updated version of the
> > patch.  I think this issue should be fixed in advance of the PostgreSQL
> > 9.6beta1 release.
> 
> Of the foreign table columns affected here, I bet only tableoid sees
> non-negligible use.  Even tableoid is not very prominent, so I would not have
> thought of this as a beta1 blocker.  What about this bug makes pre-beta1
> resolution especially valuable?

This will probably get resolved earlier if it enters the process now as a non
beta blocker, compared to entering the process later as a beta blocker.  I'm
taking the liberty of doing that:

[This is a generic notification.]

The above-described topic is currently a PostgreSQL 9.6 open item.  Robert,
since you committed the patch believed to have created it, you own this open
item.  If that responsibility lies elsewhere, please let us know whose
responsibility it is to fix this.  Since new open items may be discovered at
any time and I want to plan to have them all fixed well in advance of the ship
date, I will appreciate your efforts toward speedy resolution.  Please
present, within 72 hours, a plan to fix the defect within seven days of this
message.  Thanks.



Re: Odd system-column handling in postgres_fdw join pushdown patch

От
Robert Haas
Дата:
On Tue, Apr 5, 2016 at 4:54 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> With this patch, all instances of tableoid, cmin, cmax etc. will get a
> non-NULL value irrespective of whether they appear on nullable side of the
> join or not.
>
> e.g. select t1.c1, t1.tableoid, t2.c1, t2.tableoid from ft4 t1 left join ft5
> t2 on (t1.c1 = t2.c1); run in contrib_regression gives output
>  c1  | tableoid | c1 | tableoid
> -----+----------+----+----------
>    2 |    54282 |    |    54285
>    4 |    54282 |    |    54285
>    6 |    54282 |  6 |    54285
>    8 |    54282 |    |    54285
>   10 |    54282 |    |    54285
>   12 |    54282 | 12 |    54285
>
> but the same query run on local tables select t1.c1, t1.tableoid, t2.c1,
> t2.tableoid from "S 1"."T 3" t1 left join "S 1"."T 4" t2 on (t1.c1 = t2.c1);
> gives output
>  c1  | tableoid | c1 | tableoid
> -----+----------+----+----------
>    2 |    54258 |    |
>    4 |    54258 |    |
>    6 |    54258 |  6 |    54266
>    8 |    54258 |    |
>   10 |    54258 |    |
>   12 |    54258 | 12 |    54266
>
> BTW, why do we want to set the column values with invalid values, and not
> null? Wouldn't setting them NULL will be a better way?

I tend to favor zeroes rather than NULLs, because that's what we
typically use to represent an invalid value of those types, and I'm
not aware of any current case where those values are NULL.

Ashutosh's comment that "With this patch, all instances of tableoid,
cmin, cmax etc. will get a non-NULL value irrespective of whether they
appear on nullable side of the join or not." seems like something that
must be addressed before we can proceed here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Odd system-column handling in postgres_fdw join pushdown patch

От
Robert Haas
Дата:
On Wed, Apr 13, 2016 at 1:36 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> I tend to favor zeroes rather than NULLs, because that's what we
> typically use to represent an invalid value of those types, and I'm
> not aware of any current case where those values are NULL.

Actually, come to think of it, what we *really* need to do here is
make sure that the behavior in the join-pushdown case matches the
behavior in the join-not-pushed-down case.

CREATE EXTENSION postgres_fdw;
CREATE SERVER s1 FOREIGN DATA WRAPPER postgres_fdw;
CREATE USER MAPPING FOR public SERVER s1;
CREATE TABLE t1 (a integer, b text);
CREATE FOREIGN TABLE ft1 (a integer, b text) SERVER s1 OPTIONS
(table_name 't1');
INSERT INTO t1 VALUES (1, 'foo'), (2, 'bar'), (3, 'baz'), (4, 'quux');

Without join pushdown - this is what gets selected by default, sadly,
so the costing isn't working as hoped in this case:

rhaas=# select ft1.xmax, ft2.xmax, ft1.* from ft1, ft1 ft2 where ft1.a = ft2.a;   xmax    |    xmax    | a |  b
------------+------------+---+------4294967295 | 4294967295 | 1 | foo4294967295 | 4294967295 | 2 | bar4294967295 |
4294967295| 3 | baz4294967295 | 4294967295 | 4 | quux
 
(4 rows)

With join pushdown, after disabling merge and hash joins:

rhaas=# select ft1.xmax, ft2.xmax, ft1.* from ft1, ft1 ft2 where ft1.a
= ft2.a;xmax | xmax | a |  b
------+------+---+------   0 |    0 | 1 | foo   0 |    0 | 2 | bar   0 |    0 | 3 | baz   0 |    0 | 4 | quux
(4 rows)

So, clearly that's not good.  It should at least be consistent.  But
more than that, the fact that postgres_fdw sets the xmax to 0xffffffff
is also pretty wacky.  We might use such a value as a sentinel for
some data type, but for transaction IDs that's just some random normal
transaction ID, and it's NOT coming from t1.  I haven't tracked down
where it *is* coming from yet, but can't imagine it's any place very
principled.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Odd system-column handling in postgres_fdw join pushdown patch

От
Robert Haas
Дата:
On Wed, Apr 13, 2016 at 1:36 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> I tend to favor zeroes rather than NULLs, because that's what we
> typically use to represent an invalid value of those types, and I'm
> not aware of any current case where those values are NULL.

In fact, see heap_attisnull.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Odd system-column handling in postgres_fdw join pushdown patch

От
Tom Lane
Дата:
Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Apr 13, 2016 at 1:36 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> I tend to favor zeroes rather than NULLs, because that's what we
>> typically use to represent an invalid value of those types, and I'm
>> not aware of any current case where those values are NULL.

> In fact, see heap_attisnull.

Right, a table's system columns cannot be null at the table-scan level.
(But they could go to null above an outer join.)
        regards, tom lane



Re: Odd system-column handling in postgres_fdw join pushdown patch

От
Robert Haas
Дата:
On Wed, Apr 13, 2016 at 2:11 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> So, clearly that's not good.  It should at least be consistent.  But
> more than that, the fact that postgres_fdw sets the xmax to 0xffffffff
> is also pretty wacky.  We might use such a value as a sentinel for
> some data type, but for transaction IDs that's just some random normal
> transaction ID, and it's NOT coming from t1.  I haven't tracked down
> where it *is* coming from yet, but can't imagine it's any place very
> principled.

And, yeah, it's not very principled.

rhaas=# select ft1.xmin, ft1.xmax, ft1.cmin from ft1;xmin |    xmax    | cmin
------+------------+-------  96 | 4294967295 | 16392  96 | 4294967295 | 16392  96 | 4294967295 | 16392  96 | 4294967295
|16392
 
(4 rows)

What's happening here is that heap_getattr() is being applied to a
HeapTupleHeaderData which contains DatumTupleFields.  So 96 is
datum_len_, 4294967295 is the -1 recorded in datum_typmod, and 16392
is the compose type OID recorded in datum_typeid, which happens in
this case to be the OID of ft1.  Isn't that special?

It's hard for me to view this as anything other than a bug in
postgres_fdw - which of course means that this open item boils down to
the complaint that the way system columns are handled by join pushdown
isn't bug-compatible with the existing behavior....

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Odd system-column handling in postgres_fdw join pushdown patch

От
Robert Haas
Дата:
On Wed, Apr 13, 2016 at 2:36 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Apr 13, 2016 at 2:11 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> So, clearly that's not good.  It should at least be consistent.  But
>> more than that, the fact that postgres_fdw sets the xmax to 0xffffffff
>> is also pretty wacky.  We might use such a value as a sentinel for
>> some data type, but for transaction IDs that's just some random normal
>> transaction ID, and it's NOT coming from t1.  I haven't tracked down
>> where it *is* coming from yet, but can't imagine it's any place very
>> principled.
>
> And, yeah, it's not very principled.
>
> rhaas=# select ft1.xmin, ft1.xmax, ft1.cmin from ft1;
>  xmin |    xmax    | cmin
> ------+------------+-------
>    96 | 4294967295 | 16392
>    96 | 4294967295 | 16392
>    96 | 4294967295 | 16392
>    96 | 4294967295 | 16392
> (4 rows)
>
> What's happening here is that heap_getattr() is being applied to a
> HeapTupleHeaderData which contains DatumTupleFields.  So 96 is
> datum_len_, 4294967295 is the -1 recorded in datum_typmod, and 16392
> is the compose type OID recorded in datum_typeid, which happens in
> this case to be the OID of ft1.  Isn't that special?
>
> It's hard for me to view this as anything other than a bug in
> postgres_fdw - which of course means that this open item boils down to
> the complaint that the way system columns are handled by join pushdown
> isn't bug-compatible with the existing behavior....

OK, here's a patch.  What I did is:

1. For a regular FDW scan, zero the xmin, xmax, and cid of the tuple
before returning it from postgres_fdw, so that we don't expose the
datum-tuple fields.   I can't see any reason this isn't safe, but I
might be missing something.

2. When a join is pushed down, deparse system columns using something
like "CASE WHEN r1.* IS NOT NULL THEN 0 END", except for the table OID
column, which gets deparsed with the table OID in place of 0.  This
delivers the correct behavior in the presence of outer joins.

Review appreciated.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Вложения

Re: Odd system-column handling in postgres_fdw join pushdown patch

От
Tom Lane
Дата:
Robert Haas <robertmhaas@gmail.com> writes:
> 2. When a join is pushed down, deparse system columns using something
> like "CASE WHEN r1.* IS NOT NULL THEN 0 END", except for the table OID
> column, which gets deparsed with the table OID in place of 0.  This
> delivers the correct behavior in the presence of outer joins.

Um, why would that be necessary?  Surely the correct things will happen
on the far end without that, if it's implementing the same join semantics
as the local query would have.
        regards, tom lane



Re: Odd system-column handling in postgres_fdw join pushdown patch

От
Robert Haas
Дата:
On Wed, Apr 13, 2016 at 5:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> 2. When a join is pushed down, deparse system columns using something
>> like "CASE WHEN r1.* IS NOT NULL THEN 0 END", except for the table OID
>> column, which gets deparsed with the table OID in place of 0.  This
>> delivers the correct behavior in the presence of outer joins.
>
> Um, why would that be necessary?  Surely the correct things will happen
> on the far end without that, if it's implementing the same join semantics
> as the local query would have.

I don't know exactly what you mean by "without that".  Currently, the
situation is:

1. When postgres_fdw scans a baserel, the xmin, xmax, and cmin/cmax
fields reflect the reinterpreted contents of the datumtuple header
fields.  Blech.

2. When postgres_fdw scans a joinrel (the join is pushed down), any
references to xmin/xmax/cmin/cmax reflect the values of those fields
on the remote sides.

#1 is obviously stupid, although maybe not that stupid since nobody
cared enough to fix it before now.  #2 is arguably correct, but I
figure it's not a good idea to display the remote values of those
fields on the local system - local transaction IDs and remote
transaction IDs exist in two different XID spaces, and I think
shouldn't be conflated.  So what I'm proposing to do is standardize on
this:

When postgres_fdw scans a baserel *or* a joinrel, any references to
xmin/xmax/cmin/cmax are read as 0.

There are alternatives.  We could decide that the joinrel case (which,
BTW, is what the OP complained about) is right and the baserel case is
wrong, and change the baserel case to work like the joinrel case.  I'm
not enamored of that, but it's not totally nuts.  The one thing I'm
sure about is that the current baserel behavior is stupid.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Odd system-column handling in postgres_fdw join pushdown patch

От
Etsuro Fujita
Дата:
On 2016/04/14 4:57, Robert Haas wrote:
> On Wed, Apr 13, 2016 at 2:36 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Wed, Apr 13, 2016 at 2:11 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> So, clearly that's not good.  It should at least be consistent.  But
>>> more than that, the fact that postgres_fdw sets the xmax to 0xffffffff
>>> is also pretty wacky.  We might use such a value as a sentinel for
>>> some data type, but for transaction IDs that's just some random normal
>>> transaction ID, and it's NOT coming from t1.  I haven't tracked down
>>> where it *is* coming from yet, but can't imagine it's any place very
>>> principled.
>>
>> And, yeah, it's not very principled.
>>
>> rhaas=# select ft1.xmin, ft1.xmax, ft1.cmin from ft1;
>>   xmin |    xmax    | cmin
>> ------+------------+-------
>>     96 | 4294967295 | 16392
>>     96 | 4294967295 | 16392
>>     96 | 4294967295 | 16392
>>     96 | 4294967295 | 16392
>> (4 rows)
>>
>> What's happening here is that heap_getattr() is being applied to a
>> HeapTupleHeaderData which contains DatumTupleFields.  So 96 is
>> datum_len_, 4294967295 is the -1 recorded in datum_typmod, and 16392
>> is the compose type OID recorded in datum_typeid, which happens in
>> this case to be the OID of ft1.  Isn't that special?
>>
>> It's hard for me to view this as anything other than a bug in
>> postgres_fdw - which of course means that this open item boils down to
>> the complaint that the way system columns are handled by join pushdown
>> isn't bug-compatible with the existing behavior....

> OK, here's a patch.  What I did is:

Thank you for taking care of this.

> 1. For a regular FDW scan, zero the xmin, xmax, and cid of the tuple
> before returning it from postgres_fdw, so that we don't expose the
> datum-tuple fields.   I can't see any reason this isn't safe, but I
> might be missing something.

I'm not sure that is really safe.

> 2. When a join is pushed down, deparse system columns using something
> like "CASE WHEN r1.* IS NOT NULL THEN 0 END", except for the table OID
> column, which gets deparsed with the table OID in place of 0.  This
> delivers the correct behavior in the presence of outer joins.

I think that that would cause useless data transfer for such culumns. 
Why not set values locally for such columns?

Best regards,
Etsuro Fujita





Re: Odd system-column handling in postgres_fdw join pushdown patch

От
Etsuro Fujita
Дата:
On 2016/04/14 12:04, Etsuro Fujita wrote:
> On 2016/04/14 4:57, Robert Haas wrote:
>> 2. When a join is pushed down, deparse system columns using something
>> like "CASE WHEN r1.* IS NOT NULL THEN 0 END", except for the table OID
>> column, which gets deparsed with the table OID in place of 0.  This
>> delivers the correct behavior in the presence of outer joins.

> I think that that would cause useless data transfer for such culumns.
> Why not set values locally for such columns?

At least for the table OID.

Best regards,
Etsuro Fujita





Re: Odd system-column handling in postgres_fdw join pushdown patch

От
Robert Haas
Дата:
On Wed, Apr 13, 2016 at 11:21 PM, Etsuro Fujita
<fujita.etsuro@lab.ntt.co.jp> wrote:
>>> 2. When a join is pushed down, deparse system columns using something
>>> like "CASE WHEN r1.* IS NOT NULL THEN 0 END", except for the table OID
>>> column, which gets deparsed with the table OID in place of 0.  This
>>> delivers the correct behavior in the presence of outer joins.
>> I think that that would cause useless data transfer for such culumns.
>> Why not set values locally for such columns?

Because that doesn't work properly when there are outer joins
involved.  I see no other way of doing that correctly that's anywhere
near as simple as this.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Odd system-column handling in postgres_fdw join pushdown patch

От
Ashutosh Bapat
Дата:
Thanks Robert for the patch.



OK, here's a patch.  What I did is:

1. For a regular FDW scan, zero the xmin, xmax, and cid of the tuple
before returning it from postgres_fdw, so that we don't expose the
datum-tuple fields.   I can't see any reason this isn't safe, but I
might be missing something.

2. When a join is pushed down, deparse system columns using something
like "CASE WHEN r1.* IS NOT NULL THEN 0 END", except for the table OID
column, which gets deparsed with the table OID in place of 0.  This
delivers the correct behavior in the presence of outer joins.

Review appreciated.

It looks good to me. It might be good to explain why we don't add CASE .. END statement when qualify_col is false, i.e. "qualify_col being false indicates that there is only one relation involved thus no join."

BTW, I noticed that we are deparsing whole-row reference as ROW(list of columns from local definition of foreign table), which has the same problem with outer joins. It won't be NULL when the rest of the row from that relation is NULL in an outer join. It too needs to be encapsulated in CASE WHEN .. END expression. PFA patch with that fix included and also some testcases for system columns as well as whole-row references.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
Вложения

Re: Odd system-column handling in postgres_fdw join pushdown patch

От
Etsuro Fujita
Дата:
On 2016/04/14 13:04, Robert Haas wrote:
> On Wed, Apr 13, 2016 at 11:21 PM, Etsuro Fujita
> <fujita.etsuro@lab.ntt.co.jp> wrote:
>>>> 2. When a join is pushed down, deparse system columns using something
>>>> like "CASE WHEN r1.* IS NOT NULL THEN 0 END", except for the table OID
>>>> column, which gets deparsed with the table OID in place of 0.  This
>>>> delivers the correct behavior in the presence of outer joins.
>>> I think that that would cause useless data transfer for such culumns.
>>> Why not set values locally for such columns?

> Because that doesn't work properly when there are outer joins
> involved.

Understood.  It looks like I overlooked Ashutosh's mail about that. 
Thanks, Robert and Ashutosh!

Best regards,
Etsuro Fujita





Re: Odd system-column handling in postgres_fdw join pushdown patch

От
Robert Haas
Дата:
On Thu, Apr 14, 2016 at 7:49 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> BTW, I noticed that we are deparsing whole-row reference as ROW(list of
> columns from local definition of foreign table), which has the same problem
> with outer joins. It won't be NULL when the rest of the row from that
> relation is NULL in an outer join. It too needs to be encapsulated in CASE
> WHEN .. END expression. PFA patch with that fix included and also some
> testcases for system columns as well as whole-row references.

Good catch.  But your test cases are no good because then we have OIDs
hardcoded in the expected output.  That means 'make installcheck' will
fail, or if for any other reason the OID varies it will also fail.
Committed your version with those test cases.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Odd system-column handling in postgres_fdw join pushdown patch

От
Ashutosh Bapat
Дата:
<div dir="ltr"><br /><div class="gmail_extra"><br /><div class="gmail_quote">On Fri, Apr 15, 2016 at 9:39 PM, Robert
Haas<span dir="ltr"><<a href="mailto:robertmhaas@gmail.com" target="_blank">robertmhaas@gmail.com</a>></span>
wrote:<br/><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span
class="">OnThu, Apr 14, 2016 at 7:49 AM, Ashutosh Bapat<br /> <<a
href="mailto:ashutosh.bapat@enterprisedb.com">ashutosh.bapat@enterprisedb.com</a>>wrote:<br /> > BTW, I noticed
thatwe are deparsing whole-row reference as ROW(list of<br /> > columns from local definition of foreign table),
whichhas the same problem<br /> > with outer joins. It won't be NULL when the rest of the row from that<br /> >
relationis NULL in an outer join. It too needs to be encapsulated in CASE<br /> > WHEN .. END expression. PFA patch
withthat fix included and also some<br /> > testcases for system columns as well as whole-row references.<br /><br
/></span>Goodcatch.  But your test cases are no good because then we have OIDs<br /> hardcoded in the expected output. 
Thatmeans 'make installcheck' will<br /> fail, or if for any other reason the OID varies it will also fail.<br />
Committedyour version with those test cases.<br /><div class="HOEnZb"><div class="h5"><br
/></div></div></blockquote></div><br/></div><div class="gmail_extra">The testcases had tableoid::regclass which outputs
theforeign table's local name, which won't change across runs. Isn't that so?<br clear="all" /><br /></div><div
class="gmail_extra">--<br /><div class="gmail_signature"><div dir="ltr">Best Wishes,<br />Ashutosh Bapat<br
/>EnterpriseDBCorporation<br />The Postgres Database Company<br /></div></div></div></div> 

Re: Odd system-column handling in postgres_fdw join pushdown patch

От
Robert Haas
Дата:
On Fri, Apr 15, 2016 at 12:16 PM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> The testcases had tableoid::regclass which outputs the foreign table's local
> name, which won't change across runs. Isn't that so?

[rhaas pgsql]$ grep 16444 ~/Downloads/postgres-fdw-syscol-zap-ab.patch
+         Remote SQL: SELECT r1.c1, CASE WHEN r1.* IS NOT NULL THEN 0
END, CASE WHEN r1.* IS NOT NULL THEN 0 END, CASE WHEN r1.* IS NOT NULL
THEN 0 END, CASE WHEN r1.* IS NOT NULL THEN 16444 END, CASE WHEN r2.*
IS NOT NULL THEN 0 END, CASE WHEN r2.* IS NOT NULL THEN 16447 END,
r2.c1 FROM ("S 1"."T 3" r1 INNER JOIN "S 1"."T 4" r2 ON (TRUE)) WHERE
((r1.c1 = r2.c1)) ORDER BY r1.c1 ASC NULLS LAST
+         Remote SQL: SELECT r1.c1, CASE WHEN r1.* IS NOT NULL THEN
16444 END, CASE WHEN r2.* IS NOT NULL THEN 16447 END, r2.c1 FROM ("S
1"."T 3" r1 LEFT JOIN "S 1"."T 4" r2 ON (((r1.c1 = r2.c1)))) ORDER BY
r1.c1 ASC NULLS LAST, r2.c1 ASC NULLS LAST
+         Remote SQL: SELECT r1.c1, CASE WHEN r1.* IS NOT NULL THEN
16444 END, CASE WHEN r2.* IS NOT NULL THEN 16447 END, r2.c1 FROM ("S
1"."T 3" r1 FULL JOIN "S 1"."T 4" r2 ON (((r1.c1 = r2.c1)))) ORDER BY
r1.c1 ASC NULLS LAST, r2.c1 ASC NULLS LAST

Where do you think 16444 and 16447 are coming from here?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Odd system-column handling in postgres_fdw join pushdown patch

От
Ashutosh Bapat
Дата:


On Fri, Apr 15, 2016 at 10:17 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Fri, Apr 15, 2016 at 12:16 PM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> The testcases had tableoid::regclass which outputs the foreign table's local
> name, which won't change across runs. Isn't that so?

[rhaas pgsql]$ grep 16444 ~/Downloads/postgres-fdw-syscol-zap-ab.patch
+         Remote SQL: SELECT r1.c1, CASE WHEN r1.* IS NOT NULL THEN 0
END, CASE WHEN r1.* IS NOT NULL THEN 0 END, CASE WHEN r1.* IS NOT NULL
THEN 0 END, CASE WHEN r1.* IS NOT NULL THEN 16444 END, CASE WHEN r2.*
IS NOT NULL THEN 0 END, CASE WHEN r2.* IS NOT NULL THEN 16447 END,
r2.c1 FROM ("S 1"."T 3" r1 INNER JOIN "S 1"."T 4" r2 ON (TRUE)) WHERE
((r1.c1 = r2.c1)) ORDER BY r1.c1 ASC NULLS LAST
+         Remote SQL: SELECT r1.c1, CASE WHEN r1.* IS NOT NULL THEN
16444 END, CASE WHEN r2.* IS NOT NULL THEN 16447 END, r2.c1 FROM ("S
1"."T 3" r1 LEFT JOIN "S 1"."T 4" r2 ON (((r1.c1 = r2.c1)))) ORDER BY
r1.c1 ASC NULLS LAST, r2.c1 ASC NULLS LAST
+         Remote SQL: SELECT r1.c1, CASE WHEN r1.* IS NOT NULL THEN
16444 END, CASE WHEN r2.* IS NOT NULL THEN 16447 END, r2.c1 FROM ("S
1"."T 3" r1 FULL JOIN "S 1"."T 4" r2 ON (((r1.c1 = r2.c1)))) ORDER BY
r1.c1 ASC NULLS LAST, r2.c1 ASC NULLS LAST

Where do you think 16444 and 16447 are coming from here?

Ah! Sorry, it's the explain output.
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: Odd system-column handling in postgres_fdw join pushdown patch

От
Etsuro Fujita
Дата:
On 2016/04/14 4:57, Robert Haas wrote:
> 1. For a regular FDW scan, zero the xmin, xmax, and cid of the tuple
> before returning it from postgres_fdw, so that we don't expose the
> datum-tuple fields.   I can't see any reason this isn't safe, but I
> might be missing something.

I noticed odd behavior of this in EvalPlanQual.  Consider:

-- session 1
postgres=# create extension postgres_fdw;
CREATE EXTENSION
postgres=# create server fs foreign data wrapper postgres_fdw options
(dbname 'postgres');
CREATE SERVER
postgres=# create user mapping for public server fs;
CREATE USER MAPPING
postgres=# create table t1 (a int, b int);
CREATE TABLE
postgres=# create table t2 (a int, b int);
CREATE TABLE
postgres=# insert into t1 values (1, 1);
INSERT 0 1
postgres=# insert into t2 values (1, 1);
INSERT 0 1
postgres=# create foreign table ft1 (a int, b int) server fs options
(table_name 't1');
CREATE FOREIGN TABLE
postgres=# select xmin, xmax, cmin, * from ft1;
   xmin | xmax | cmin | a | b
------+------+------+---+---
      0 |    0 |    0 | 1 | 1
(1 row)

-- session 2
postgres=# begin;
BEGIN
postgres=# update t2 set a = a;
UPDATE 1

-- session 1
postgres=# select ft1.xmin, ft1.xmax, ft1.cmin, ft1.* from ft1, t2 for
update;

-- session 2
postgres=# commit;
COMMIT

-- session 1 (will show the following)
   xmin |    xmax    | cmin  | a | b
------+------------+-------+---+---
    128 | 4294967295 | 16398 | 1 | 1
(1 row)

The values of xmin, xmax, and cmin are not 0!  The reason for that is
that we don't zero these values in a test tuple stored by
EvalPlanQualFetchRowMarks for EvalPlanQual re-evaluations.

That cleanup applies to the file_fdw foreign table case as well, so I
think xmin, xmax, and cid in tuples from such tables should be set to 0,
too.  And ISTM that it's better that the core (ie, ForeignNext) supports
doing that, rather than each FDW does that work.  That would also
minimize the overhead because ForeignNext does that if needed.  Please
find attached a patch.

Best regards,
Etsuro Fujita

Вложения

Re: Odd system-column handling in postgres_fdw join pushdown patch

От
Robert Haas
Дата:
On Fri, Jul 1, 2016 at 3:10 AM, Etsuro Fujita
<fujita.etsuro@lab.ntt.co.jp> wrote:
> On 2016/04/14 4:57, Robert Haas wrote:
>>
>> 1. For a regular FDW scan, zero the xmin, xmax, and cid of the tuple
>> before returning it from postgres_fdw, so that we don't expose the
>> datum-tuple fields.   I can't see any reason this isn't safe, but I
>> might be missing something.
>
> I noticed odd behavior of this in EvalPlanQual.  Consider:
>
> -- session 1
> postgres=# create extension postgres_fdw;
> CREATE EXTENSION
> postgres=# create server fs foreign data wrapper postgres_fdw options
> (dbname 'postgres');
> CREATE SERVER
> postgres=# create user mapping for public server fs;
> CREATE USER MAPPING
> postgres=# create table t1 (a int, b int);
> CREATE TABLE
> postgres=# create table t2 (a int, b int);
> CREATE TABLE
> postgres=# insert into t1 values (1, 1);
> INSERT 0 1
> postgres=# insert into t2 values (1, 1);
> INSERT 0 1
> postgres=# create foreign table ft1 (a int, b int) server fs options
> (table_name 't1');
> CREATE FOREIGN TABLE
> postgres=# select xmin, xmax, cmin, * from ft1;
>   xmin | xmax | cmin | a | b
> ------+------+------+---+---
>      0 |    0 |    0 | 1 | 1
> (1 row)
>
> -- session 2
> postgres=# begin;
> BEGIN
> postgres=# update t2 set a = a;
> UPDATE 1
>
> -- session 1
> postgres=# select ft1.xmin, ft1.xmax, ft1.cmin, ft1.* from ft1, t2 for
> update;
>
> -- session 2
> postgres=# commit;
> COMMIT
>
> -- session 1 (will show the following)
>   xmin |    xmax    | cmin  | a | b
> ------+------------+-------+---+---
>    128 | 4294967295 | 16398 | 1 | 1
> (1 row)
>
> The values of xmin, xmax, and cmin are not 0!  The reason for that is that
> we don't zero these values in a test tuple stored by
> EvalPlanQualFetchRowMarks for EvalPlanQual re-evaluations.
>
> That cleanup applies to the file_fdw foreign table case as well, so I think
> xmin, xmax, and cid in tuples from such tables should be set to 0, too.  And
> ISTM that it's better that the core (ie, ForeignNext) supports doing that,
> rather than each FDW does that work.  That would also minimize the overhead
> because ForeignNext does that if needed.  Please find attached a patch.

If you want this to be considered, you'll need to rebase it and submit
it to the next CommitFest.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Odd system-column handling in postgres_fdw join pushdown patch

От
Etsuro Fujita
Дата:
On 2016/09/21 0:40, Robert Haas wrote:
> On Fri, Jul 1, 2016 at 3:10 AM, Etsuro Fujita
> <fujita.etsuro@lab.ntt.co.jp> wrote:
>> On 2016/04/14 4:57, Robert Haas wrote:

>>> 1. For a regular FDW scan, zero the xmin, xmax, and cid of the tuple
>>> before returning it from postgres_fdw, so that we don't expose the
>>> datum-tuple fields.   I can't see any reason this isn't safe, but I
>>> might be missing something.

>> I noticed odd behavior of this in EvalPlanQual.  Consider:
>>
>> -- session 1
>> postgres=# create extension postgres_fdw;
>> CREATE EXTENSION
>> postgres=# create server fs foreign data wrapper postgres_fdw options
>> (dbname 'postgres');
>> CREATE SERVER
>> postgres=# create user mapping for public server fs;
>> CREATE USER MAPPING
>> postgres=# create table t1 (a int, b int);
>> CREATE TABLE
>> postgres=# create table t2 (a int, b int);
>> CREATE TABLE
>> postgres=# insert into t1 values (1, 1);
>> INSERT 0 1
>> postgres=# insert into t2 values (1, 1);
>> INSERT 0 1
>> postgres=# create foreign table ft1 (a int, b int) server fs options
>> (table_name 't1');
>> CREATE FOREIGN TABLE
>> postgres=# select xmin, xmax, cmin, * from ft1;
>>   xmin | xmax | cmin | a | b
>> ------+------+------+---+---
>>      0 |    0 |    0 | 1 | 1
>> (1 row)
>>
>> -- session 2
>> postgres=# begin;
>> BEGIN
>> postgres=# update t2 set a = a;
>> UPDATE 1
>>
>> -- session 1
>> postgres=# select ft1.xmin, ft1.xmax, ft1.cmin, ft1.* from ft1, t2 for
>> update;
>>
>> -- session 2
>> postgres=# commit;
>> COMMIT
>>
>> -- session 1 (will show the following)
>>   xmin |    xmax    | cmin  | a | b
>> ------+------------+-------+---+---
>>    128 | 4294967295 | 16398 | 1 | 1
>> (1 row)
>>
>> The values of xmin, xmax, and cmin are not 0!  The reason for that is that
>> we don't zero these values in a test tuple stored by
>> EvalPlanQualFetchRowMarks for EvalPlanQual re-evaluations.
>>
>> That cleanup applies to the file_fdw foreign table case as well, so I think
>> xmin, xmax, and cid in tuples from such tables should be set to 0, too.  And
>> ISTM that it's better that the core (ie, ForeignNext) supports doing that,
>> rather than each FDW does that work.  That would also minimize the overhead
>> because ForeignNext does that if needed.  Please find attached a patch.

> If you want this to be considered, you'll need to rebase it and submit
> it to the next CommitFest.

Will do.

Best regards,
Etsuro Fujita