Re: Making CASE error handling less surprising

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: Making CASE error handling less surprising
Дата
Msg-id 433224.1595604130@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: Making CASE error handling less surprising  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
I wrote:
> Ah, I see what you mean.  Yeah, that throws an error today, and it
> still would with the patch I was envisioning (attached), because
> inlining does Param substitution in a different way.  I'm not
> sure that we could realistically fix the inlining case with this
> sort of approach.

Here's another example that we can't possibly fix with Param substitution
hacking, because there are no Params involved in the first place:

select f1, case when f1 = 42 then 1/i else null end
from (select f1, 0 as i from int4_tbl) ss;

Pulling up the subquery results in "1/0", so this fails today,
even though "f1 = 42" is never true.

Attached is a v3 patch that incorporates the leakproofness idea.
As shown in the new case.sql tests, this does fix both the SQL
function and subquery-pullup cases.

To keep the join regression test results the same, I marked int48()
as leakproof, which is surely safe enough.  Probably we should make
a push to mark all unconditionally-safe implicit coercions as
leakproof, but that's a separate matter.

            regards, tom lane

diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index e04b144072..48e31b16d6 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -61,6 +61,16 @@ typedef struct
     AggClauseCosts *costs;
 } get_agg_clause_costs_context;

+typedef enum
+{
+    /* Ordering is important here! */
+    ece_eval_nothing,            /* be unconditionally safe */
+    ece_eval_leakproof,            /* eval leakproof immutable functions */
+    ece_eval_immutable,            /* eval immutable functions too */
+    ece_eval_stable,            /* eval stable functions too */
+    ece_eval_volatile            /* eval volatile functions too */
+} ece_eval_mode;
+
 typedef struct
 {
     ParamListInfo boundParams;
@@ -68,6 +78,7 @@ typedef struct
     List       *active_fns;
     Node       *case_val;
     bool        estimate;
+    ece_eval_mode eval_mode;
 } eval_const_expressions_context;

 typedef struct
@@ -119,6 +130,8 @@ static Node *eval_const_expressions_mutator(Node *node,
 static bool contain_non_const_walker(Node *node, void *context);
 static bool ece_function_is_safe(Oid funcid,
                                  eval_const_expressions_context *context);
+static bool ece_function_form_is_safe(Form_pg_proc func_form,
+                                      eval_const_expressions_context *context);
 static Node *apply_const_relabel(Node *arg, Oid rtype,
                                  int32 rtypmod, Oid rcollid,
                                  CoercionForm rformat, int rlocation);
@@ -2264,6 +2277,7 @@ eval_const_expressions(PlannerInfo *root, Node *node)
     context.active_fns = NIL;    /* nothing being recursively simplified */
     context.case_val = NULL;    /* no CASE being examined */
     context.estimate = false;    /* safe transformations only */
+    context.eval_mode = ece_eval_immutable; /* eval immutable functions */
     return eval_const_expressions_mutator(node, &context);
 }

@@ -2280,8 +2294,11 @@ eval_const_expressions(PlannerInfo *root, Node *node)
  *      available by the caller of planner(), even if the Param isn't marked
  *      constant.  This effectively means that we plan using the first supplied
  *      value of the Param.
- * 2. Fold stable, as well as immutable, functions to constants.
+ * 2. Fold stable, as well as immutable, functions to constants.  The risk
+ *      that the result might change from planning time to execution time is
+ *      worth taking, as we otherwise couldn't get an estimate at all.
  * 3. Reduce PlaceHolderVar nodes to their contained expressions.
+ * 4. Ignore domain constraints, assuming that CoerceToDomain will succeed.
  *--------------------
  */
 Node *
@@ -2295,6 +2312,7 @@ estimate_expression_value(PlannerInfo *root, Node *node)
     context.active_fns = NIL;    /* nothing being recursively simplified */
     context.case_val = NULL;    /* no CASE being examined */
     context.estimate = true;    /* unsafe transformations OK */
+    context.eval_mode = ece_eval_stable;    /* eval stable functions */
     return eval_const_expressions_mutator(node, &context);
 }

@@ -2961,7 +2979,22 @@ eval_const_expressions_mutator(Node *node,
                  * opportunity to reduce constant test conditions.  For
                  * example this allows
                  *        CASE 0 WHEN 0 THEN 1 ELSE 1/0 END
-                 * to reduce to 1 rather than drawing a divide-by-0 error.
+                 * to reduce to 1 rather than drawing a divide-by-0 error,
+                 * since we'll throw away the ELSE without processing it.
+                 *
+                 * Even in not-all-constant cases, the user might be expecting
+                 * the CASE to prevent run-time errors, for example in
+                 *        CASE WHEN j > 0 THEN 1 ELSE 1/0 END
+                 * Since division is immutable, we'd ordinarily simplify the
+                 * division and hence draw the divide-by-zero error at plan
+                 * time, even if j is never zero at run time.  To avoid that,
+                 * reduce eval_mode to ece_eval_leakproof while processing any
+                 * test condition or result value that will not certainly be
+                 * evaluated at run-time.  We expect that leakproof immutable
+                 * functions will not throw any errors (except perhaps in
+                 * corner cases such as OOM, which we need not guard against
+                 * for this purpose).
+                 *
                  * Note that when the test expression is constant, we don't
                  * have to include it in the resulting CASE; for example
                  *        CASE 0 WHEN x THEN y ELSE z END
@@ -2977,6 +3010,7 @@ eval_const_expressions_mutator(Node *node,
                  */
                 CaseExpr   *caseexpr = (CaseExpr *) node;
                 CaseExpr   *newcase;
+                ece_eval_mode save_eval_mode = context->eval_mode;
                 Node       *save_case_val;
                 Node       *newarg;
                 List       *newargs;
@@ -3027,6 +3061,16 @@ eval_const_expressions_mutator(Node *node,
                         const_true_cond = true;
                     }

+                    /*
+                     * Unless the test condition is constant TRUE, we can't be
+                     * sure the result value will be evaluated, so become more
+                     * conservative about which functions to execute.  This
+                     * change will also apply to subsequent test conditions
+                     * and result values.
+                     */
+                    if (!const_true_cond)
+                        context->eval_mode = ece_eval_leakproof;
+
                     /* Simplify this alternative's result value */
                     caseresult = eval_const_expressions_mutator((Node *) oldcasewhen->result,
                                                                 context);
@@ -3058,6 +3102,7 @@ eval_const_expressions_mutator(Node *node,
                                                                context);

                 context->case_val = save_case_val;
+                context->eval_mode = save_eval_mode;

                 /*
                  * If no non-FALSE alternatives, CASE reduces to the default
@@ -3113,6 +3158,7 @@ eval_const_expressions_mutator(Node *node,
             {
                 CoalesceExpr *coalesceexpr = (CoalesceExpr *) node;
                 CoalesceExpr *newcoalesce;
+                ece_eval_mode save_eval_mode = context->eval_mode;
                 List       *newargs;
                 ListCell   *arg;

@@ -3137,13 +3183,25 @@ eval_const_expressions_mutator(Node *node,
                         if (((Const *) e)->constisnull)
                             continue;    /* drop null constant */
                         if (newargs == NIL)
+                        {
+                            context->eval_mode = save_eval_mode;
                             return e;    /* first expr */
+                        }
                         newargs = lappend(newargs, e);
                         break;
                     }
                     newargs = lappend(newargs, e);
+
+                    /*
+                     * Arguments following a non-constant argument may or may
+                     * not get evaluated at run-time, so as in CASE, become
+                     * more conservative about which functions to execute.
+                     */
+                    context->eval_mode = ece_eval_leakproof;
                 }

+                context->eval_mode = save_eval_mode;
+
                 /*
                  * If all the arguments were constant null, the result is just
                  * null
@@ -3163,13 +3221,12 @@ eval_const_expressions_mutator(Node *node,
         case T_SQLValueFunction:
             {
                 /*
-                 * All variants of SQLValueFunction are stable, so if we are
-                 * estimating the expression's value, we should evaluate the
-                 * current function value.  Otherwise just copy.
+                 * All variants of SQLValueFunction are stable, so evaluate if
+                 * we are evaluating stable functions.  Otherwise just copy.
                  */
                 SQLValueFunction *svf = (SQLValueFunction *) node;

-                if (context->estimate)
+                if (context->eval_mode >= ece_eval_stable)
                     return (Node *) evaluate_expr((Expr *) svf,
                                                   svf->type,
                                                   svf->typmod,
@@ -3565,20 +3622,42 @@ contain_non_const_walker(Node *node, void *context)
 static bool
 ece_function_is_safe(Oid funcid, eval_const_expressions_context *context)
 {
-    char        provolatile = func_volatile(funcid);
+    bool        result;
+    HeapTuple    tp;

-    /*
-     * Ordinarily we are only allowed to simplify immutable functions. But for
-     * purposes of estimation, we consider it okay to simplify functions that
-     * are merely stable; the risk that the result might change from planning
-     * time to execution time is worth taking in preference to not being able
-     * to estimate the value at all.
-     */
-    if (provolatile == PROVOLATILE_IMMUTABLE)
-        return true;
-    if (context->estimate && provolatile == PROVOLATILE_STABLE)
-        return true;
-    return false;
+    tp = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcid));
+    if (!HeapTupleIsValid(tp))
+        elog(ERROR, "cache lookup failed for function %u", funcid);
+
+    result = ece_function_form_is_safe((Form_pg_proc) GETSTRUCT(tp), context);
+    ReleaseSysCache(tp);
+    return result;
+}
+
+/*
+ * Same, when we have the pg_proc entry directly at hand
+ */
+static bool
+ece_function_form_is_safe(Form_pg_proc func_form,
+                          eval_const_expressions_context *context)
+{
+    ece_eval_mode f_mode;
+
+    /* Must map the function properties to an ordered enum */
+    if (func_form->provolatile == PROVOLATILE_IMMUTABLE)
+    {
+        if (func_form->proleakproof)
+            f_mode = ece_eval_leakproof;
+        else
+            f_mode = ece_eval_immutable;
+    }
+    else if (func_form->provolatile == PROVOLATILE_STABLE)
+        f_mode = ece_eval_stable;
+    else
+        f_mode = ece_eval_volatile;
+
+    /* Now, does eval_mode allow evaluation of this function? */
+    return (context->eval_mode >= f_mode);
 }

 /*
@@ -4238,9 +4317,8 @@ recheck_cast_function_args(List *args, Oid result_type, HeapTuple func_tuple)
  * evaluate_function: try to pre-evaluate a function call
  *
  * We can do this if the function is strict and has any constant-null inputs
- * (just return a null constant), or if the function is immutable and has all
- * constant inputs (call it and return the result as a Const node).  In
- * estimation mode we are willing to pre-evaluate stable functions too.
+ * (just return a null constant), or if the function is safe to evaluate and
+ * has all constant inputs (call it and return the result as a Const node).
  *
  * Returns a simplified expression if successful, or NULL if cannot
  * simplify the function.
@@ -4293,7 +4371,7 @@ evaluate_function(Oid funcid, Oid result_type, int32 result_typmod,
      * If the function is strict and has a constant-NULL input, it will never
      * be called at all, so we can replace the call by a NULL constant, even
      * if there are other inputs that aren't constant, and even if the
-     * function is not otherwise immutable.
+     * function is not otherwise safe to evaluate.
      */
     if (funcform->proisstrict && has_null_input)
         return (Expr *) makeNullConst(result_type, result_typmod,
@@ -4308,17 +4386,9 @@ evaluate_function(Oid funcid, Oid result_type, int32 result_typmod,
         return NULL;

     /*
-     * Ordinarily we are only allowed to simplify immutable functions. But for
-     * purposes of estimation, we consider it okay to simplify functions that
-     * are merely stable; the risk that the result might change from planning
-     * time to execution time is worth taking in preference to not being able
-     * to estimate the value at all.
+     * Are we permitted to evaluate this function in the current context?
      */
-    if (funcform->provolatile == PROVOLATILE_IMMUTABLE)
-         /* okay */ ;
-    else if (context->estimate && funcform->provolatile == PROVOLATILE_STABLE)
-         /* okay */ ;
-    else
+    if (!ece_function_form_is_safe(funcform, context))
         return NULL;

     /*
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 4b5af32440..5a3fdcc136 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -1338,7 +1338,7 @@
   proname => 'int4', prorettype => 'int4', proargtypes => 'int8',
   prosrc => 'int84' },
 { oid => '481', descr => 'convert int4 to int8',
-  proname => 'int8', prorettype => 'int8', proargtypes => 'int4',
+  proname => 'int8', proleakproof => 't', prorettype => 'int8', proargtypes => 'int4',
   prosrc => 'int48' },
 { oid => '482', descr => 'convert int8 to float8',
   proname => 'float8', prorettype => 'float8', proargtypes => 'int8',
diff --git a/src/test/regress/expected/case.out b/src/test/regress/expected/case.out
index c0c8acf035..0b23057ee4 100644
--- a/src/test/regress/expected/case.out
+++ b/src/test/regress/expected/case.out
@@ -93,10 +93,62 @@ SELECT CASE 1 WHEN 0 THEN 1/0 WHEN 1 THEN 1 ELSE 2/0 END;
     1
 (1 row)

--- However we do not currently suppress folding of potentially
--- reachable subexpressions
 SELECT CASE WHEN i > 100 THEN 1/0 ELSE 0 END FROM case_tbl;
+ case
+------
+    0
+    0
+    0
+    0
+(4 rows)
+
+-- However, that guarantee doesn't extend into sub-selects
+SELECT CASE WHEN i > 100 THEN (select 1/0) ELSE 0 END FROM case_tbl;
 ERROR:  division by zero
+-- Inline-ing a SQL function shouldn't cause a failure
+CREATE FUNCTION case_func(double precision, integer)
+RETURNS integer
+LANGUAGE SQL AS $$
+   SELECT CASE WHEN $1 = 42 THEN 1/$2 ELSE NULL END
+$$;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT case_func(f, 0) FROM case_tbl;
+                                      QUERY PLAN
+--------------------------------------------------------------------------------------
+ Seq Scan on public.case_tbl
+   Output: CASE WHEN (f = '42'::double precision) THEN (1 / 0) ELSE NULL::integer END
+(2 rows)
+
+SELECT case_func(f, 0) FROM case_tbl;
+ case_func
+-----------
+
+
+
+
+(4 rows)
+
+DROP FUNCTION case_func(double precision, integer);
+-- Subquery flattening shouldn't result in such errors, either
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT f, CASE WHEN f = 42 THEN 1/j ELSE NULL END
+FROM (SELECT f, 0 AS j FROM case_tbl) ss;
+                                                QUERY PLAN
+-----------------------------------------------------------------------------------------------------------
+ Seq Scan on public.case_tbl
+   Output: case_tbl.f, CASE WHEN (case_tbl.f = '42'::double precision) THEN (1 / 0) ELSE NULL::integer END
+(2 rows)
+
+SELECT f, CASE WHEN f = 42 THEN 1/j ELSE NULL END
+FROM (SELECT f, 0 AS j FROM case_tbl) ss;
+   f   | case
+-------+------
+  10.1 |
+  20.2 |
+ -30.3 |
+       |
+(4 rows)
+
 -- Test for cases involving untyped literals in test expression
 SELECT CASE 'a' WHEN 'a' THEN 1 ELSE 2 END;
  case
diff --git a/src/test/regress/expected/opr_sanity.out b/src/test/regress/expected/opr_sanity.out
index 27056d70d3..7b576b0208 100644
--- a/src/test/regress/expected/opr_sanity.out
+++ b/src/test/regress/expected/opr_sanity.out
@@ -634,6 +634,7 @@ int84lt(bigint,integer)
 int84gt(bigint,integer)
 int84le(bigint,integer)
 int84ge(bigint,integer)
+int8(integer)
 oidvectorne(oidvector,oidvector)
 namelt(name,name)
 namele(name,name)
diff --git a/src/test/regress/sql/case.sql b/src/test/regress/sql/case.sql
index 17436c524a..07ee4f8651 100644
--- a/src/test/regress/sql/case.sql
+++ b/src/test/regress/sql/case.sql
@@ -66,11 +66,33 @@ SELECT '7' AS "None",
 -- Constant-expression folding shouldn't evaluate unreachable subexpressions
 SELECT CASE WHEN 1=0 THEN 1/0 WHEN 1=1 THEN 1 ELSE 2/0 END;
 SELECT CASE 1 WHEN 0 THEN 1/0 WHEN 1 THEN 1 ELSE 2/0 END;
-
--- However we do not currently suppress folding of potentially
--- reachable subexpressions
 SELECT CASE WHEN i > 100 THEN 1/0 ELSE 0 END FROM case_tbl;

+-- However, that guarantee doesn't extend into sub-selects
+SELECT CASE WHEN i > 100 THEN (select 1/0) ELSE 0 END FROM case_tbl;
+
+-- Inline-ing a SQL function shouldn't cause a failure
+CREATE FUNCTION case_func(double precision, integer)
+RETURNS integer
+LANGUAGE SQL AS $$
+   SELECT CASE WHEN $1 = 42 THEN 1/$2 ELSE NULL END
+$$;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT case_func(f, 0) FROM case_tbl;
+
+SELECT case_func(f, 0) FROM case_tbl;
+
+DROP FUNCTION case_func(double precision, integer);
+
+-- Subquery flattening shouldn't result in such errors, either
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT f, CASE WHEN f = 42 THEN 1/j ELSE NULL END
+FROM (SELECT f, 0 AS j FROM case_tbl) ss;
+
+SELECT f, CASE WHEN f = 42 THEN 1/j ELSE NULL END
+FROM (SELECT f, 0 AS j FROM case_tbl) ss;
+
 -- Test for cases involving untyped literals in test expression
 SELECT CASE 'a' WHEN 'a' THEN 1 ELSE 2 END;


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Robert Haas
Дата:
Сообщение: Re: Default setting for enable_hashagg_disk
Следующее
От: Dave Cramer
Дата:
Сообщение: Any objections to implementing LogicalDecodeMessageCB for pgoutput?