Обсуждение: Push down time-related SQLValue functions to foreign server
Hi. The attached patches allow pushing down current_timestamp/localtimestamp/current_time/localtime and now() to remote PostgreSQL server as locally computed parameters. The idea is based on oracle_fdw behavior. Examples. \d test Foreign table "public.test" Column | Type | Collation | Nullable | Default | FDW options --------+--------------------------+-----------+----------+---------+------------------- i | integer | | | | (column_name 'i') t | timestamp with time zone | | | | (column_name 't') Server: loopback FDW options: (schema_name 'data', table_name 'test') Prior the patch: explain verbose select * from test where t=current_timestamp; QUERY PLAN --------------------------------------------------------------------- Foreign Scan on public.test (cost=100.00..188.12 rows=11 width=12) Output: i, t Filter: (test.t = CURRENT_TIMESTAMP) Remote SQL: SELECT i, t FROM data.test explain verbose update test set t=current_timestamp where t<now(); QUERY PLAN ---------------------------------------------------------------------------- Update on public.test (cost=100.00..154.47 rows=0 width=0) Remote SQL: UPDATE data.test SET t = $2 WHERE ctid = $1 -> Foreign Scan on public.test (cost=100.00..154.47 rows=414 width=50) Output: CURRENT_TIMESTAMP, ctid, test.* Filter: (test.t < now()) Remote SQL: SELECT i, t, ctid FROM data.test FOR UPDATE After patch: explain verbose select * from test where t=current_timestamp; QUERY PLAN ------------------------------------------------------------------------------------- Foreign Scan on public.test (cost=100.00..144.35 rows=11 width=12) Output: i, t Remote SQL: SELECT i, t FROM data.test WHERE ((t = $1::timestamp with time zone)) explain verbose update test set t=current_timestamp where t<now(); QUERY PLAN ---------------------------------------------------------------------------------------------------------------------- Update on public.test (cost=100.00..137.93 rows=0 width=0) -> Foreign Update on public.test (cost=100.00..137.93 rows=414 width=50) Remote SQL: UPDATE data.test SET t = $1::timestamp with time zone WHERE ((t < $1::timestamp with time zone)) -- Best regards, Alexander Pyhalov, Postgres Professional
Вложения
On Thu, Aug 19, 2021 at 2:52 AM Alexander Pyhalov <a.pyhalov@postgrespro.ru> wrote:
Hi.
The attached patches allow pushing down
current_timestamp/localtimestamp/current_time/localtime and now() to
remote PostgreSQL server as locally computed parameters.
The idea is based on oracle_fdw behavior.
Examples.
\d test
Foreign table "public.test"
Column | Type | Collation | Nullable | Default |
FDW options
--------+--------------------------+-----------+----------+---------+-------------------
i | integer | | | |
(column_name 'i')
t | timestamp with time zone | | | |
(column_name 't')
Server: loopback
FDW options: (schema_name 'data', table_name 'test')
Prior the patch:
explain verbose select * from test where t=current_timestamp;
QUERY PLAN
---------------------------------------------------------------------
Foreign Scan on public.test (cost=100.00..188.12 rows=11 width=12)
Output: i, t
Filter: (test.t = CURRENT_TIMESTAMP)
Remote SQL: SELECT i, t FROM data.test
explain verbose update test set t=current_timestamp where t<now();
QUERY PLAN
----------------------------------------------------------------------------
Update on public.test (cost=100.00..154.47 rows=0 width=0)
Remote SQL: UPDATE data.test SET t = $2 WHERE ctid = $1
-> Foreign Scan on public.test (cost=100.00..154.47 rows=414
width=50)
Output: CURRENT_TIMESTAMP, ctid, test.*
Filter: (test.t < now())
Remote SQL: SELECT i, t, ctid FROM data.test FOR UPDATE
After patch:
explain verbose select * from test where t=current_timestamp;
QUERY PLAN
-------------------------------------------------------------------------------------
Foreign Scan on public.test (cost=100.00..144.35 rows=11 width=12)
Output: i, t
Remote SQL: SELECT i, t FROM data.test WHERE ((t = $1::timestamp with
time zone))
explain verbose update test set t=current_timestamp where t<now();
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
Update on public.test (cost=100.00..137.93 rows=0 width=0)
-> Foreign Update on public.test (cost=100.00..137.93 rows=414
width=50)
Remote SQL: UPDATE data.test SET t = $1::timestamp with time
zone WHERE ((t < $1::timestamp with time zone))
--
Best regards,
Alexander Pyhalov,
Postgres Professional
Hi,
For 0002 patch:
+ return !(func_volatile(func_id) == PROVOLATILE_IMMUTABLE || func_id == F_NOW);
Did you mean to say 'now() is unstable' ?
Zhihong Yu писал 2021-08-19 13:22: > Hi, > For 0002 patch: > > + /* now() is stable, but we can ship it as it's replaced by > parameter */ > + return !(func_volatile(func_id) == PROVOLATILE_IMMUTABLE || > func_id == F_NOW); > > Did you mean to say 'now() is unstable' ? No, it's stable, not immutable, so we need additional check. -- Best regards, Alexander Pyhalov, Postgres Professional
On Thu, Aug 19, 2021 at 2:52 AM Alexander Pyhalov <a.pyhalov@postgrespro.ru> wrote:
Hi.
The attached patches allow pushing down
current_timestamp/localtimestamp/current_time/localtime and now() to
remote PostgreSQL server as locally computed parameters.
The idea is based on oracle_fdw behavior.
Examples.
\d test
Foreign table "public.test"
Column | Type | Collation | Nullable | Default |
FDW options
--------+--------------------------+-----------+----------+---------+-------------------
i | integer | | | |
(column_name 'i')
t | timestamp with time zone | | | |
(column_name 't')
Server: loopback
FDW options: (schema_name 'data', table_name 'test')
Prior the patch:
explain verbose select * from test where t=current_timestamp;
QUERY PLAN
---------------------------------------------------------------------
Foreign Scan on public.test (cost=100.00..188.12 rows=11 width=12)
Output: i, t
Filter: (test.t = CURRENT_TIMESTAMP)
Remote SQL: SELECT i, t FROM data.test
explain verbose update test set t=current_timestamp where t<now();
QUERY PLAN
----------------------------------------------------------------------------
Update on public.test (cost=100.00..154.47 rows=0 width=0)
Remote SQL: UPDATE data.test SET t = $2 WHERE ctid = $1
-> Foreign Scan on public.test (cost=100.00..154.47 rows=414
width=50)
Output: CURRENT_TIMESTAMP, ctid, test.*
Filter: (test.t < now())
Remote SQL: SELECT i, t, ctid FROM data.test FOR UPDATE
After patch:
explain verbose select * from test where t=current_timestamp;
QUERY PLAN
-------------------------------------------------------------------------------------
Foreign Scan on public.test (cost=100.00..144.35 rows=11 width=12)
Output: i, t
Remote SQL: SELECT i, t FROM data.test WHERE ((t = $1::timestamp with
time zone))
explain verbose update test set t=current_timestamp where t<now();
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
Update on public.test (cost=100.00..137.93 rows=0 width=0)
-> Foreign Update on public.test (cost=100.00..137.93 rows=414
width=50)
Remote SQL: UPDATE data.test SET t = $1::timestamp with time
zone WHERE ((t < $1::timestamp with time zone))
--
Best regards,
Alexander Pyhalov,
Postgres Professional
Hi,
For 0001 patch:
+ (s->op != SVFOP_CURRENT_TIMESTAMP_N) &&
+ (s->op != SVFOP_CURRENT_TIME) &&
...
The above check appears more than once. If extracted into a helper method, it would help reduce duplicate and make the code more readable.
Cheers
Em qui., 19 de ago. de 2021 às 07:50, Zhihong Yu <zyu@yugabyte.com> escreveu:
On Thu, Aug 19, 2021 at 2:52 AM Alexander Pyhalov <a.pyhalov@postgrespro.ru> wrote:Hi.
The attached patches allow pushing down
current_timestamp/localtimestamp/current_time/localtime and now() to
remote PostgreSQL server as locally computed parameters.
The idea is based on oracle_fdw behavior.
Examples.
\d test
Foreign table "public.test"
Column | Type | Collation | Nullable | Default |
FDW options
--------+--------------------------+-----------+----------+---------+-------------------
i | integer | | | |
(column_name 'i')
t | timestamp with time zone | | | |
(column_name 't')
Server: loopback
FDW options: (schema_name 'data', table_name 'test')
Prior the patch:
explain verbose select * from test where t=current_timestamp;
QUERY PLAN
---------------------------------------------------------------------
Foreign Scan on public.test (cost=100.00..188.12 rows=11 width=12)
Output: i, t
Filter: (test.t = CURRENT_TIMESTAMP)
Remote SQL: SELECT i, t FROM data.test
explain verbose update test set t=current_timestamp where t<now();
QUERY PLAN
----------------------------------------------------------------------------
Update on public.test (cost=100.00..154.47 rows=0 width=0)
Remote SQL: UPDATE data.test SET t = $2 WHERE ctid = $1
-> Foreign Scan on public.test (cost=100.00..154.47 rows=414
width=50)
Output: CURRENT_TIMESTAMP, ctid, test.*
Filter: (test.t < now())
Remote SQL: SELECT i, t, ctid FROM data.test FOR UPDATE
After patch:
explain verbose select * from test where t=current_timestamp;
QUERY PLAN
-------------------------------------------------------------------------------------
Foreign Scan on public.test (cost=100.00..144.35 rows=11 width=12)
Output: i, t
Remote SQL: SELECT i, t FROM data.test WHERE ((t = $1::timestamp with
time zone))
explain verbose update test set t=current_timestamp where t<now();
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
Update on public.test (cost=100.00..137.93 rows=0 width=0)
-> Foreign Update on public.test (cost=100.00..137.93 rows=414
width=50)
Remote SQL: UPDATE data.test SET t = $1::timestamp with time
zone WHERE ((t < $1::timestamp with time zone))
--
Best regards,
Alexander Pyhalov,
Postgres ProfessionalHi,For 0001 patch:+ if ((s->op != SVFOP_CURRENT_TIMESTAMP) &&
+ (s->op != SVFOP_CURRENT_TIMESTAMP_N) &&+ (s->op != SVFOP_CURRENT_TIME) &&...The above check appears more than once. If extracted into a helper method, it would help reduce duplicate and make the code more readable.
Perhaps in a MACRO?
regards,
Ranier Vilela
I spent some time looking at this patch. Generally it looks like a good idea. These stable functions will be evaluated at the execution time and replaced with constants. I am not sure whether the nodes saved in the param_list may not get the same treatment. Have you verified that? Also the new node types being added to the param list is something other than Param. So it conflicts with the comment below in prepare_query_params()? /* * Prepare remote-parameter expressions for evaluation. (Note: in * practice, we expect that all these expressions will be just Params, so * we could possibly do something more efficient than using the full * expression-eval machinery for this. But probably there would be little * benefit, and it'd require postgres_fdw to know more than is desirable * about Param evaluation.) */ If we are already adding non-params to this list, then the comment is outdated? On Thu, Aug 19, 2021 at 3:22 PM Alexander Pyhalov <a.pyhalov@postgrespro.ru> wrote: > > Hi. > > The attached patches allow pushing down > current_timestamp/localtimestamp/current_time/localtime and now() to > remote PostgreSQL server as locally computed parameters. > The idea is based on oracle_fdw behavior. > > Examples. > > \d test > Foreign table "public.test" > Column | Type | Collation | Nullable | Default | > FDW options > --------+--------------------------+-----------+----------+---------+------------------- > i | integer | | | | > (column_name 'i') > t | timestamp with time zone | | | | > (column_name 't') > Server: loopback > FDW options: (schema_name 'data', table_name 'test') > > Prior the patch: > > explain verbose select * from test where t=current_timestamp; > QUERY PLAN > --------------------------------------------------------------------- > Foreign Scan on public.test (cost=100.00..188.12 rows=11 width=12) > Output: i, t > Filter: (test.t = CURRENT_TIMESTAMP) > Remote SQL: SELECT i, t FROM data.test > > explain verbose update test set t=current_timestamp where t<now(); > QUERY PLAN > ---------------------------------------------------------------------------- > Update on public.test (cost=100.00..154.47 rows=0 width=0) > Remote SQL: UPDATE data.test SET t = $2 WHERE ctid = $1 > -> Foreign Scan on public.test (cost=100.00..154.47 rows=414 > width=50) > Output: CURRENT_TIMESTAMP, ctid, test.* > Filter: (test.t < now()) > Remote SQL: SELECT i, t, ctid FROM data.test FOR UPDATE > > > After patch: > explain verbose select * from test where t=current_timestamp; > QUERY PLAN > ------------------------------------------------------------------------------------- > Foreign Scan on public.test (cost=100.00..144.35 rows=11 width=12) > Output: i, t > Remote SQL: SELECT i, t FROM data.test WHERE ((t = $1::timestamp with > time zone)) > > explain verbose update test set t=current_timestamp where t<now(); > QUERY PLAN > ---------------------------------------------------------------------------------------------------------------------- > Update on public.test (cost=100.00..137.93 rows=0 width=0) > -> Foreign Update on public.test (cost=100.00..137.93 rows=414 > width=50) > Remote SQL: UPDATE data.test SET t = $1::timestamp with time > zone WHERE ((t < $1::timestamp with time zone)) > > -- > Best regards, > Alexander Pyhalov, > Postgres Professional -- Best Wishes, Ashutosh Bapat
Hi. Ranier Vilela писал 2021-08-19 14:01: > Em qui., 19 de ago. de 2021 às 07:50, Zhihong Yu <zyu@yugabyte.com> >> Hi, >> For 0001 patch: >> >> + if ((s->op != SVFOP_CURRENT_TIMESTAMP) && >> + (s->op != SVFOP_CURRENT_TIMESTAMP_N) && >> + (s->op != SVFOP_CURRENT_TIME) && >> ... >> >> The above check appears more than once. If extracted into a helper >> method, it would help reduce duplicate and make the code more >> readable. > > Perhaps in a MACRO? Changed this check to a macro, also fixed condition in is_foreign_param() and added test for it. Also fixed comment in prepare_query_params(). -- Best regards, Alexander Pyhalov, Postgres Professional
Вложения
Hi. Ashutosh Bapat писал 2021-08-19 17:01: > I spent some time looking at this patch. > > Generally it looks like a good idea. These stable functions will be > evaluated at the execution time and replaced with constants. I am not > sure whether the nodes saved in the param_list may not get the same > treatment. Have you verified that? I'm not sure I understand you. All parameters are treated in the same way. They are evaluated in process_query_params(), real params and parameters, corresponding to our SQLValue functions. If we look at execution of something like explain verbose select * from test1 t1 where i in (select i from test1 t2 where t2.t< now() and t1.i=t2.i) ; QUERY PLAN ------------------------------------------------------------------------------------------------------------------- Foreign Scan on public.test1 t1 (cost=100.00..243310.11 rows=930 width=20) Output: t1.i, t1.t, t1.l Filter: (SubPlan 1) Remote SQL: SELECT i, t, l FROM data.test1 SubPlan 1 -> Foreign Scan on public.test1 t2 (cost=100.00..161.29 rows=5 width=4) Output: t2.i Remote SQL: SELECT i FROM data.test1 WHERE (($1::integer = i)) AND ((t < $2::timestamp with time zone) we can see two parameters evaluated in process_query_params(), one - of T_Param type (with value of current t1.i) and one of T_SQLValueFunction type (with value of current_timestamp). > > Also the new node types being added to the param list is something > other than Param. So it conflicts with the comment below in > prepare_query_params()? > /* > * Prepare remote-parameter expressions for evaluation. (Note: in > * practice, we expect that all these expressions will be just > Params, so > * we could possibly do something more efficient than using the > full > * expression-eval machinery for this. But probably there would be > little > * benefit, and it'd require postgres_fdw to know more than is > desirable > * about Param evaluation.) > */ > If we are already adding non-params to this list, then the comment is > outdated? Fixed comment in the new version of the patches. -- Best regards, Alexander Pyhalov, Postgres Professional
Em sex., 20 de ago. de 2021 às 04:13, Alexander Pyhalov <a.pyhalov@postgrespro.ru> escreveu:
Hi.
Ranier Vilela писал 2021-08-19 14:01:
> Em qui., 19 de ago. de 2021 às 07:50, Zhihong Yu <zyu@yugabyte.com>
>> Hi,
>> For 0001 patch:
>>
>> + if ((s->op != SVFOP_CURRENT_TIMESTAMP) &&
>> + (s->op != SVFOP_CURRENT_TIMESTAMP_N) &&
>> + (s->op != SVFOP_CURRENT_TIME) &&
>> ...
>>
>> The above check appears more than once. If extracted into a helper
>> method, it would help reduce duplicate and make the code more
>> readable.
>
> Perhaps in a MACRO?
Changed this check to a macro, also fixed condition in
is_foreign_param() and added test for it.
Also fixed comment in prepare_query_params().
Thanks.
Another question:
For 0002 patch:
+ if (node->funcid == F_NOW)
+ {
+ SQLValueFunction *svf = makeNode(SQLValueFunction);
+
+ svf->op = SVFOP_CURRENT_TIMESTAMP;
+ svf->type = TIMESTAMPTZOID;
+ svf->typmod = -1;
+ svf->location = -1;
+
+ deparseSQLValueFunction(svf, context);
+
+ return;
+ }
+ {
+ SQLValueFunction *svf = makeNode(SQLValueFunction);
+
+ svf->op = SVFOP_CURRENT_TIMESTAMP;
+ svf->type = TIMESTAMPTZOID;
+ svf->typmod = -1;
+ svf->location = -1;
+
+ deparseSQLValueFunction(svf, context);
+
+ return;
+ }
It seems to me that the svf->xpr field ( SQLValueFunction *svf ) is not initialized somewhere even by deparseSQLValueFunction.
If it's not really used, it should be initialized to NULL, ok?
If it's not really used, it should be initialized to NULL, ok?
regards,
Ranier Vilela
Ranier Vilela писал 2021-08-20 14:19: > Another question: > For 0002 patch: > > + if (node->funcid == F_NOW) > + { > + SQLValueFunction *svf = makeNode(SQLValueFunction); > + > + svf->op = SVFOP_CURRENT_TIMESTAMP; > + svf->type = TIMESTAMPTZOID; > + svf->typmod = -1; > + svf->location = -1; > + > + deparseSQLValueFunction(svf, context); > + > + return; > + } > It seems to me that the svf->xpr field ( SQLValueFunction *svf ) is > not initialized somewhere even by deparseSQLValueFunction. > If it's not really used, it should be initialized to NULL, ok? > xpr field just carries node type, which will be initialized by makeNode(). -- Best regards, Alexander Pyhalov, Postgres Professional
On Fri, Aug 20, 2021 at 12:13 AM Alexander Pyhalov <a.pyhalov@postgrespro.ru> wrote:
Hi.
Ranier Vilela писал 2021-08-19 14:01:
> Em qui., 19 de ago. de 2021 às 07:50, Zhihong Yu <zyu@yugabyte.com>
>> Hi,
>> For 0001 patch:
>>
>> + if ((s->op != SVFOP_CURRENT_TIMESTAMP) &&
>> + (s->op != SVFOP_CURRENT_TIMESTAMP_N) &&
>> + (s->op != SVFOP_CURRENT_TIME) &&
>> ...
>>
>> The above check appears more than once. If extracted into a helper
>> method, it would help reduce duplicate and make the code more
>> readable.
>
> Perhaps in a MACRO?
Changed this check to a macro, also fixed condition in
is_foreign_param() and added test for it.
Also fixed comment in prepare_query_params().
--
Best regards,
Alexander Pyhalov,
Postgres Professional
Hi,
The patches are good by me.
Thanks
Em sex., 20 de ago. de 2021 às 09:18, Alexander Pyhalov <a.pyhalov@postgrespro.ru> escreveu:
Ranier Vilela писал 2021-08-20 14:19:
> Another question:
> For 0002 patch:
>
> + if (node->funcid == F_NOW)
> + {
> + SQLValueFunction *svf = makeNode(SQLValueFunction);
> +
> + svf->op = SVFOP_CURRENT_TIMESTAMP;
> + svf->type = TIMESTAMPTZOID;
> + svf->typmod = -1;
> + svf->location = -1;
> +
> + deparseSQLValueFunction(svf, context);
> +
> + return;
> + }
> It seems to me that the svf->xpr field ( SQLValueFunction *svf ) is
> not initialized somewhere even by deparseSQLValueFunction.
> If it's not really used, it should be initialized to NULL, ok?
>
xpr field just carries node type, which will be initialized by
makeNode().
Great, I missed it.
regards,
Ranier Vilela
Alexander Pyhalov <a.pyhalov@postgrespro.ru> writes: >> Perhaps in a MACRO? > Changed this check to a macro, also fixed condition in > is_foreign_param() and added test for it. > Also fixed comment in prepare_query_params(). I took a quick look at this. I'm unconvinced that you need the TIME_RELATED_SQLVALUE_FUNCTION macro, mainly because I think testing that in is_foreign_param() is pointless. The only way we'll be seeing a SQLValueFunction in is_foreign_param() is if we decided it was shippable, so you really don't need two duplicate tests. (In the same vein, I would not bother with including a switch in deparseSQLValueFunction that knows about these opcodes explicitly. Just use the typmod field; exprTypmod() does.) I also find it pretty bizarre that contain_unsafe_functions isn't placed beside its one caller. Maybe that indicates that is_foreign_expr is mis-located and should be in shippable.c. More generally, it's annoying that you had to copy-and-paste all of contain_mutable_functions to make this. That creates a rather nasty maintenance hazard for future hackers, who will need to keep these widely-separated functions in sync. Not sure what to do about that though. Do we want to extend contain_mutable_functions itself to cover this use-case? The test cases seem a bit overkill --- what is the point of the two nigh-identical PREPARE tests, or the GROUP BY test? If anything is broken about GROUP BY, surely it's not specific to this patch. I'm not really convinced by the premise of 0002, particularly this bit: static bool -contain_mutable_functions_checker(Oid func_id, void *context) +contain_unsafe_functions_checker(Oid func_id, void *context) { - return (func_volatile(func_id) != PROVOLATILE_IMMUTABLE); + /* now() is stable, but we can ship it as it's replaced by parameter */ + return !(func_volatile(func_id) == PROVOLATILE_IMMUTABLE || func_id == F_NOW); } The point of the check_functions_in_node callback API is to verify the mutability of functions that are embedded in various sorts of expression nodes ... but they might not be in a plain FuncExpr node, which is the only case you'll deparse correctly. It might be that now() cannot legally appear in any of the other node types that check_functions_in_node knows about, but I'm not quite convinced of that, and even less convinced that that'll stay true as we add more expression node types. Also, if we commit this, for sure some poor soul will try to expand the logic to some other stable function that *can* appear in those contexts, and that'll be broken. The implementation of converting now() to CURRENT_TIMESTAMP seems like an underdocumented kluge, too. On the whole I'm a bit inclined to drop 0002; I'm not sure it's worth the trouble. regards, tom lane
The implementation of converting now() to CURRENT_TIMESTAMP
seems like an underdocumented kluge, too.
I'm very late to the party, but it seems to me that this effort is describing a small subset of what "routine mapping" seems to be for: defining function calls that can be pushed down to the foreign server, and the analogous function on the foreign side. We may want to consider implementing just enough of CREATE ROUTINE MAPPING and DROP ROUTINE MAPPING to support these specific fixed functions.
Corey Huinker <corey.huinker@gmail.com> writes: > I'm very late to the party, but it seems to me that this effort is > describing a small subset of what "routine mapping" seems to be for: > defining function calls that can be pushed down to the foreign server, and > the analogous function on the foreign side. We may want to consider > implementing just enough of CREATE ROUTINE MAPPING and DROP ROUTINE MAPPING > to support these specific fixed functions. Hmm ... not really, because for these particular functions, the point is exactly that we *don't* translate them to some function call on the remote end. We evaluate them locally and push the resulting constant to the far side, thus avoiding issues like clock skew. Having said that: why couldn't that implementation sketch be used for ANY stable subexpression? What's special about the datetime SQLValueFunctions? regards, tom lane
Hmm ... not really, because for these particular functions, the
point is exactly that we *don't* translate them to some function
call on the remote end. We evaluate them locally and push the
resulting constant to the far side, thus avoiding issues like
clock skew.
Ah, my pattern matching brain was so excited to see a use for routine mapping that I didn't notice that important detail.
Hi. Tom Lane писал 2022-01-18 02:08: > Alexander Pyhalov <a.pyhalov@postgrespro.ru> writes: >>> Perhaps in a MACRO? > >> Changed this check to a macro, also fixed condition in >> is_foreign_param() and added test for it. >> Also fixed comment in prepare_query_params(). > > I took a quick look at this. I'm unconvinced that you need the > TIME_RELATED_SQLVALUE_FUNCTION macro, mainly because I think testing > that in is_foreign_param() is pointless. The only way we'll be seeing > a SQLValueFunction in is_foreign_param() is if we decided it was > shippable, so you really don't need two duplicate tests. > (In the same vein, I would not bother with including a switch in > deparseSQLValueFunction that knows about these opcodes explicitly. > Just use the typmod field; exprTypmod() does.) Yes, sure, is_foreign_param() is called only when is_foreign_expr() is true. Simplified this part. > > I also find it pretty bizarre that contain_unsafe_functions > isn't placed beside its one caller. Maybe that indicates that > is_foreign_expr is mis-located and should be in shippable.c. > > More generally, it's annoying that you had to copy-and-paste > all of contain_mutable_functions to make this. That creates > a rather nasty maintenance hazard for future hackers, who will > need to keep these widely-separated functions in sync. Not > sure what to do about that though. Do we want to extend > contain_mutable_functions itself to cover this use-case? I've moved logic to contain_mutable_functions_skip_sqlvalues(), it uses the same subroutines as contain_mutable_functions(). Should we instead just add one more parameter to contain_mutable_functions()? I'm not sure that it's a good idea given that contain_mutable_functions() seems to be an external interface. > > The test cases seem a bit overkill --- what is the point of the > two nigh-identical PREPARE tests, or the GROUP BY test? If > anything is broken about GROUP BY, surely it's not specific > to this patch. I've removed PREPARE tests, but GROUP BY test checks foreign_grouping_ok()/is_foreign_param() path, so I think it's useful. > > I'm not really convinced by the premise of 0002, particularly > this bit: > > static bool > -contain_mutable_functions_checker(Oid func_id, void *context) > +contain_unsafe_functions_checker(Oid func_id, void *context) > { > - return (func_volatile(func_id) != PROVOLATILE_IMMUTABLE); > + /* now() is stable, but we can ship it as it's replaced by parameter > */ > + return !(func_volatile(func_id) == PROVOLATILE_IMMUTABLE || func_id > == F_NOW); > } > > The point of the check_functions_in_node callback API is to verify > the mutability of functions that are embedded in various sorts of > expression nodes ... but they might not be in a plain FuncExpr node, > which is the only case you'll deparse correctly. It might be that > now() cannot legally appear in any of the other node types that > check_functions_in_node knows about, but I'm not quite convinced > of that, and even less convinced that that'll stay true as we add > more expression node types. Also, if we commit this, for sure > some poor soul will try to expand the logic to some other stable > function that *can* appear in those contexts, and that'll be broken. > > The implementation of converting now() to CURRENT_TIMESTAMP > seems like an underdocumented kluge, too. > > On the whole I'm a bit inclined to drop 0002; I'm not sure it's > worth the trouble. > OK. Let's drop it for now. -- Best regards, Alexander Pyhalov, Postgres Professional
Вложения
Alexander Pyhalov <a.pyhalov@postgrespro.ru> writes: > [ updated patch ] Thanks for updating the patch. (BTW, please attach version numbers to new patch versions, to avoid confusion.) However, before we proceed any further with this patch, I think we really ought to stop and think about the question I raised last night: why are we building a one-off feature for SQLValueFunction? Wouldn't the same parameter-substitution mechanism work for *any* stable expression that doesn't contain remote Vars? That would subsume the now() case as well as plenty of others. So far the only counterexample I've been able to come up with is that shipping values of reg* types might not be too safe, because the remote side might not have the same objects. For example consider these two potential quals: WHERE remote_oid_column = CURRENT_ROLE::regrole WHERE remote_text_column = CURRENT_ROLE::text Say we're running as user 'joe' and that role doesn't exist on the remote server. Then executing the first WHERE locally is fine, but shipping it to the remote would cause a failure because the remote's regrolein() will fail to convert the parameter value. But the second case is quite non-problematic, because what will be sent over is just some uninterpreted text. In point of fact, this hazard doesn't have anything to do with stable or not-stable subexpressions --- for example, WHERE remote_oid_column = 'joe'::regrole is just as unsafe, even though the value under consideration is a *constant*. Maybe there is something in postgres_fdw that would stop it from shipping this qual, but I don't recall seeing it, so I wonder if there's a pre-existing bug here. So it seems like we need a check to prevent generating remote Params that are of "unsafe" types, but this is a type issue not an expression issue --- as long as an expression is stable and does not yield an unsafe-to-ship data type, why can't we treat it as a Param? regards, tom lane
Tom Lane писал 2022-01-18 19:53: > Alexander Pyhalov <a.pyhalov@postgrespro.ru> writes: >> [ updated patch ] > > Thanks for updating the patch. (BTW, please attach version numbers > to new patch versions, to avoid confusion.) > > However, before we proceed any further with this patch, I think we > really ought to stop and think about the question I raised last > night: why are we building a one-off feature for SQLValueFunction? > Wouldn't the same parameter-substitution mechanism work for *any* > stable expression that doesn't contain remote Vars? That would > subsume the now() case as well as plenty of others. > Hi. I think, I can extend it to allow any stable function (not just immutable/sqlvalues) in is_foreign_expr(), but not so sure about "expression". Perhaps, at top of deparseExpr() we can check that expression doesn't contain vars, params, but contains stable function, and deparse it as param. This means we'll translate something like explain select * from t where d > now() - '1 day'::interval; to select * from t where d > $1; The question is how will we reliably determine its typmod (given that I read in exprTypmod() comment "returns the type-specific modifier of the expression's result type, if it can be determined. In many cases, it can't". What do we save if we don't ship whole expression as param, but only stable functions? Allowing them seems to be more straightforward. -- Best regards, Alexander Pyhalov, Postgres Professional
Alexander Pyhalov <a.pyhalov@postgrespro.ru> writes: > Tom Lane писал 2022-01-18 19:53: >> However, before we proceed any further with this patch, I think we >> really ought to stop and think about the question I raised last >> night: why are we building a one-off feature for SQLValueFunction? >> Wouldn't the same parameter-substitution mechanism work for *any* >> stable expression that doesn't contain remote Vars? That would >> subsume the now() case as well as plenty of others. > This means we'll translate something like > explain select * from t where d > now() - '1 day'::interval; > to > select * from t where d > $1; Right. > The question is how will we reliably determine its typmod (given that I > read in exprTypmod() comment > "returns the type-specific modifier of the expression's result type, if > it can be determined. In many cases, it can't". I don't think we need to. If exprTypmod() says the typmod is -1, then that's what it is. > What do we save if we don't ship whole expression as param, but only > stable functions? Allowing them seems to be more straightforward. How so? Right off the bat, you get rid of the need for your own version of contain_mutable_function. ISTM this approach can probably be implemented in a patch that's noticeably smaller than what you have now. It'll likely be touching entirely different places in postgres_fdw, of course. regards, tom lane
I wrote: > Alexander Pyhalov <a.pyhalov@postgrespro.ru> writes: >> This means we'll translate something like >> explain select * from t where d > now() - '1 day'::interval; >> to >> select * from t where d > $1; > Right. After thinking about that a bit more, I see that this will result in a major redefinition of what is "shippable". Right now, we do not consider the above WHERE clause to be shippable, not only because of now() but because the timestamptz-minus-interval operator is dependent on the timezone setting, which might be different at the remote. But if we evaluate that operator locally and send its result as a parameter, the objection vanishes. In fact, I don't think we even need to require the subexpression to contain only built-in functions. Its result still has to be of a built-in type, but that's a much weaker restriction. So this is going to require significant restructuring of both is_foreign_expr and deparse.c, which I realize may be more than you bargained for. But if you want to tackle it, I think what we want to do is divide potentially-shippable expressions into three sorts of components: 1. Remote Vars, which obviously ship as-is. 2. Locally-evaluatable subexpressions, which must contain no remote Vars, must be stable or immutable, and must have a result type that we consider safe to ship. If such a subexpression is just a Const, we ship it as a constant, otherwise we evaluate the value at runtime and ship a parameter. 3. Superstructure, which consists of all expression nodes having at least one remote Var below them. These nodes have to be shippable according to the existing definition, so that we know that the remote server will evaluate them just as we would. If we keep the existing division of labor between is_foreign_expr and deparse.c, I fear that there's going to be a lot of duplicated code as well as duplicative planning effort. I wonder if it'd be wise to combine those operations into a single expression-scanning process that determines whether an expression is safe to ship and produces the translated string immediately if it is. regards, tom lane
Tom Lane писал 2022-01-18 23:01: > I wrote: >> Alexander Pyhalov <a.pyhalov@postgrespro.ru> writes: >>> This means we'll translate something like >>> explain select * from t where d > now() - '1 day'::interval; >>> to >>> select * from t where d > $1; > >> Right. > > After thinking about that a bit more, I see that this will result > in a major redefinition of what is "shippable". Right now, we do not > consider the above WHERE clause to be shippable, not only because of > now() but because the timestamptz-minus-interval operator is dependent > on the timezone setting, which might be different at the remote. > But if we evaluate that operator locally and send its result as a > parameter, the objection vanishes. In fact, I don't think we even > need to require the subexpression to contain only built-in functions. > Its result still has to be of a built-in type, but that's a much > weaker restriction. > Hi. So far I have the following prototype. It seems to be working, but I think it can be enhanced. At least, some sort of caching seems to be necessary for is_stable_expr(). 1) Now expression can be either 'stable shippable' or 'shippable according to old rules'. We check if it's 'stable shippable' in foreign_expr_walker(), is_foreign_param() and deparseExpr(). All such exprs are replaced by params while deparsing. 2) contain_mutable_functions() now is calculated only for current node, if node is not considered 'stable shippable'. Is it step in the right direction or do I miss something? -- Best regards, Alexander Pyhalov, Postgres Professional
Вложения
Alexander Pyhalov <a.pyhalov@postgrespro.ru> writes: > So far I have the following prototype. It seems to be working, but I > think it can be enhanced. > At least, some sort of caching seems to be necessary for > is_stable_expr(). Yeah, from a performance standpoint this seems pretty horrid --- it's probably exponential in the size of the expressions considered because of the repeated recursions. The approach I had in mind was to extend the responsibility of foreign_expr_walker to make it deduce the classification of an expression in a bottom-up fashion. Also, as I noted before, we don't want to re-deduce that in a later deparse pass, so maybe we should just go ahead and deparse during foreign_expr_walker. Not sure about that part. It sounds expensive; but recording the classification results in a way we could re-use later doesn't seem too cheap either. BTW, the patch is just about unreadable as-is. It would have been better to not have re-indented the bulk of foreign_expr_walker, leaving that for some later pgindent pass. But that may be moot, because I don't think we can tolerate the double-recursion approach you took here. regards, tom lane
This entry has been waiting on author input for a while (our current threshold is roughly two weeks), so I've marked it Returned with Feedback. Once you think the patchset is ready for review again, you (or any interested party) can resurrect the patch entry by visiting https://commitfest.postgresql.org/38/3289/ and changing the status to "Needs Review", and then changing the status again to "Move to next CF". (Don't forget the second step; hopefully we will have streamlined this in the near future!) Thanks, --Jacob