Обсуждение: BUG #17387: Working in PG13 but not in PGH14: array_agg(RECORD)

Поиск
Список
Период
Сортировка

BUG #17387: Working in PG13 but not in PGH14: array_agg(RECORD)

От
PG Bug reporting form
Дата:
The following bug has been logged on the website:

Bug reference:      17387
Logged by:          James Inform
Email address:      james.inform@pharmapp.de
PostgreSQL version: 14.1
Operating system:   Mac and Linux (Ubuntu)
Description:

While the following sql works under PG13:

with q_data as (
  select '1' as testa, 1 as testb 
  union
  select '2' as testa, 2 as testb 
  union
  select '3' as testa, 3 as testb 
  union
  select '4' as testa, 4 as testb 
)
select array_agg(q) || array_agg(q) from q_data q;

and results in:
 {"(3,3)","(1,1)","(4,4)","(2,2)","(3,3)","(1,1)","(4,4)","(2,2)"}

the same SQL on PG14.1 fails with:

ERROR:  operator is not unique: record[] || record[]
LINE 10: select array_agg(q) || array_agg(q) from q_data q;
                                               ^
HINT:  Could not choose a best candidate operator. You might need to add
explicit type casts.

Let's not discuss if such an sql makes sense.
But question is: Why is PG14 giving an error while PG13 works?

Cheers,
James


Re: BUG #17387: Working in PG13 but not in PGH14: array_agg(RECORD)

От
Pavel Stehule
Дата:
Hi


Let's not discuss if such an sql makes sense.
But question is: Why is PG14 giving an error while PG13 works?

Probably it is side effect of this patch

https://github.com/postgres/postgres/commit/9e38c2bb5093ceb0c04d6315ccd8975bd17add66#diff-e2a931f90073b784e341960c6fe1f48aaea4b5d57eb4388143534eec3863477b

The array_append, array_cat, array_prepend changed input types from any* kind of polymorphic types to anycompatible* kind of polymorphic types

Regards

Pavel Stehule



Cheers,
James

Re: BUG #17387: Working in PG13 but not in PGH14: array_agg(RECORD)

От
Pavel Stehule
Дата:


pá 28. 1. 2022 v 21:19 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
Hi


Let's not discuss if such an sql makes sense.
But question is: Why is PG14 giving an error while PG13 works?

Probably it is side effect of this patch

https://github.com/postgres/postgres/commit/9e38c2bb5093ceb0c04d6315ccd8975bd17add66#diff-e2a931f90073b784e341960c6fe1f48aaea4b5d57eb4388143534eec3863477b

The array_append, array_cat, array_prepend changed input types from any* kind of polymorphic types to anycompatible* kind of polymorphic types

anycompatible* types are less sensitive to different data types, but it increases a risk of possibility of errors when more than one function can be detected for execution on analysis of function's signatures.


Regards

Pavel Stehule



Cheers,
James

Re: BUG #17387: Working in PG13 but not in PGH14: array_agg(RECORD)

От
Tom Lane
Дата:
Pavel Stehule <pavel.stehule@gmail.com> writes:
> pá 28. 1. 2022 v 21:19 odesílatel Pavel Stehule <pavel.stehule@gmail.com>
> napsal:
>> Probably it is side effect of this patch
>>
https://github.com/postgres/postgres/commit/9e38c2bb5093ceb0c04d6315ccd8975bd17add66#diff-e2a931f90073b784e341960c6fe1f48aaea4b5d57eb4388143534eec3863477b

> anycompatible* types are less sensitive to different data types, but it
> increases a risk of possibility of errors when more than one function can
> be detected for execution on analysis of function's signatures.

Hmm.  We have

regression=# \do ||
                                                   List of operators
   Schema   | Name |   Left arg type    |   Right arg type   |    Result type     |             Description


------------+------+--------------------+--------------------+--------------------+-------------------------------------
 pg_catalog | ||   | anycompatible      | anycompatiblearray | anycompatiblearray | prepend element onto front of array
 pg_catalog | ||   | anycompatiblearray | anycompatible      | anycompatiblearray | append element onto end of array
 pg_catalog | ||   | anycompatiblearray | anycompatiblearray | anycompatiblearray | concatenate
 ...

where before it was

 pg_catalog | ||   | anyelement    | anyarray       | anyarray    | prepend element onto front of array
 pg_catalog | ||   | anyarray      | anyelement     | anyarray    | append element onto end of array
 pg_catalog | ||   | anyarray      | anyarray       | anyarray    | concatenate

which was non-ambiguous because in this usage, anyelement
wouldn't match an array type.  I wonder why that's not
happening with the anycompatible family?

We could s/anycompatible/anycompatiblenonarray/ in the
catalog entries, but it seems like we shouldn't have to.

            regards, tom lane



Re: BUG #17387: Working in PG13 but not in PGH14: array_agg(RECORD)

От
Tom Lane
Дата:
I wrote:
> ... which was non-ambiguous because in this usage, anyelement
> wouldn't match an array type.  I wonder why that's not
> happening with the anycompatible family?

Poking further, this case still works:

regression=# select array[1] || array[2];
 ?column?
----------
 {1,2}
(1 row)

so we didn't break it completely (I rather imagine we have
regression tests that would have noticed that).  Also,
you can still concatenate arrays of known composite types:

regression=# select array_agg(t) || array_agg(t) from int8_tbl t;

                                                        ?column?


--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
-----------------------------------------
 {"(123,456)","(123,4567890123456789)","(4567890123456789,123)","(45678901234567
89,4567890123456789)","(4567890123456789,-4567890123456789)","(123,456)","(123,4
567890123456789)","(4567890123456789,123)","(4567890123456789,4567890123456789)"
,"(4567890123456789,-4567890123456789)"}
(1 row)

So it seems like this is specific to type record[] somehow.
Odd.

            regards, tom lane



Re: BUG #17387: Working in PG13 but not in PGH14: array_agg(RECORD)

От
Tom Lane
Дата:
I wrote:
> So it seems like this is specific to type record[] somehow.

Ah, no, I found it: the callers of select_common_type_from_oids
assume that its result is guaranteed valid, which is not so.
We need to explicitly check can_coerce_type.  This oversight
allows check_generic_type_consistency to succeed when it should
not, which in turn allows us to decide that record and record[]
are OK as matches to all three of those operators.

This apparently escaped notice before because we've only tested
cases in which incompatible arguments were of different typcategory.
record and record[] are both of category 'P', which might be a
dumb idea.  But this would be a bug anyway.

We need something like the attached, but I'm going to nose
around for other oversights.

            regards, tom lane

diff --git a/src/backend/parser/parse_coerce.c b/src/backend/parser/parse_coerce.c
index d674e31405..aa4cfaa52e 100644
--- a/src/backend/parser/parse_coerce.c
+++ b/src/backend/parser/parse_coerce.c
@@ -1298,6 +1298,10 @@ parser_coercion_errposition(ParseState *pstate,
  * rather than throwing an error on failure.
  * 'which_expr': if not NULL, receives a pointer to the particular input
  * expression from which the result type was taken.
+ *
+ * Caution: "failure" just means that there were inputs of different type
+ * categories.  It is not guaranteed that all of the inputs are coercible
+ * to the selected type; caller must check that (see verify_common_type).
  */
 Oid
 select_common_type(ParseState *pstate, List *exprs, const char *context,
@@ -1426,6 +1430,10 @@ select_common_type(ParseState *pstate, List *exprs, const char *context,
  * earlier entries in the array have some preference over later ones.
  * On failure, return InvalidOid if noerror is true, else throw an error.
  *
+ * Caution: "failure" just means that there were inputs of different type
+ * categories.  It is not guaranteed that all of the inputs are coercible
+ * to the selected type; caller must check that (see verify_common_type).
+ *
  * Note: neither caller will pass any UNKNOWNOID entries, so the tests
  * for that in this function are dead code.  However, they don't cost much,
  * and it seems better to keep this logic as close to select_common_type()
@@ -1548,6 +1556,27 @@ coerce_to_common_type(ParseState *pstate, Node *node,
     return node;
 }

+/*
+ * verify_common_type()
+ *        Verify that all input types can be coerced to a proposed common type.
+ *
+ * Most callers of select_common_type() don't need to do this explicitly
+ * because the checks will happen while trying to convert input expressions
+ * to the right type, e.g. in coerce_to_common_type().  However, if a separate
+ * check step is needed to validate the applicability of the common type, call
+ * this.
+ */
+static bool
+verify_common_type(Oid common_type, int nargs, const Oid *typeids)
+{
+    for (int i = 0; i < nargs; i++)
+    {
+        if (!can_coerce_type(1, &typeids[i], &common_type, COERCION_IMPLICIT))
+            return false;
+    }
+    return true;
+}
+
 /*
  * select_common_typmod()
  *        Determine the common typmod of a list of input expressions.
@@ -1917,7 +1946,13 @@ check_generic_type_consistency(const Oid *actual_arg_types,
                                          true);

         if (!OidIsValid(anycompatible_typeid))
-            return false;        /* there's no common supertype */
+            return false;        /* there's definitely no common supertype */
+
+        /* We have to verify that the selected type actually works */
+        if (!verify_common_type(anycompatible_typeid,
+                                n_anycompatible_args,
+                                anycompatible_actual_types))
+            return false;

         if (have_anycompatible_nonarray)
         {
@@ -2494,6 +2529,14 @@ enforce_generic_type_consistency(const Oid *actual_arg_types,
                                              anycompatible_actual_types,
                                              false);

+            /* We have to verify that the selected type actually works */
+            if (!verify_common_type(anycompatible_typeid,
+                                    n_anycompatible_args,
+                                    anycompatible_actual_types))
+                ereport(ERROR,
+                        (errcode(ERRCODE_DATATYPE_MISMATCH),
+                         errmsg("arguments of anycompatible family cannot be cast to a common type")));
+
             if (have_anycompatible_array)
             {
                 anycompatible_array_typeid = get_array_type(anycompatible_typeid);

Re: BUG #17387: Working in PG13 but not in PGH14: array_agg(RECORD)

От
Tom Lane
Дата:
I wrote:
> We need something like the attached, but I'm going to nose
> around for other oversights.

Sure enough, transformAExprIn has a related bug: if all the IN
arguments are of the same typcategory, it will try to stuff them
all into an array, whether or not they're actually coercible
to a common type.  We should fall back to the x = v1 OR x = v2 ...
interpretation when that's not possible.  I failed to come up
with a simple example in which that leads to success; odds are
that there's no suitable = operators either.  But it's probably
possible with some weird set of user-defined types.  In any
case, the intent of the existing code is clearly that this
should happen.

That bug is ancient, more than 10 years old.  Given the lack
of field complaints I'm not terribly worried about back-patching,
but I suppose it can go into v14 along with the regression fix.

            regards, tom lane

diff --git a/src/backend/parser/parse_coerce.c b/src/backend/parser/parse_coerce.c
index d674e31405..c4e958e4aa 100644
--- a/src/backend/parser/parse_coerce.c
+++ b/src/backend/parser/parse_coerce.c
@@ -1298,6 +1298,10 @@ parser_coercion_errposition(ParseState *pstate,
  * rather than throwing an error on failure.
  * 'which_expr': if not NULL, receives a pointer to the particular input
  * expression from which the result type was taken.
+ *
+ * Caution: "failure" just means that there were inputs of different type
+ * categories.  It is not guaranteed that all the inputs are coercible to the
+ * selected type; caller must check that (see verify_common_type).
  */
 Oid
 select_common_type(ParseState *pstate, List *exprs, const char *context,
@@ -1426,6 +1430,10 @@ select_common_type(ParseState *pstate, List *exprs, const char *context,
  * earlier entries in the array have some preference over later ones.
  * On failure, return InvalidOid if noerror is true, else throw an error.
  *
+ * Caution: "failure" just means that there were inputs of different type
+ * categories.  It is not guaranteed that all the inputs are coercible to the
+ * selected type; caller must check that (see verify_common_type_from_oids).
+ *
  * Note: neither caller will pass any UNKNOWNOID entries, so the tests
  * for that in this function are dead code.  However, they don't cost much,
  * and it seems better to keep this logic as close to select_common_type()
@@ -1548,6 +1556,48 @@ coerce_to_common_type(ParseState *pstate, Node *node,
     return node;
 }

+/*
+ * verify_common_type()
+ *        Verify that all input types can be coerced to a proposed common type.
+ *        Return true if so, false if not all coercions are possible.
+ *
+ * Most callers of select_common_type() don't need to do this explicitly
+ * because the checks will happen while trying to convert input expressions
+ * to the right type, e.g. in coerce_to_common_type().  However, if a separate
+ * check step is needed to validate the applicability of the common type, call
+ * this.
+ */
+bool
+verify_common_type(Oid common_type, List *exprs)
+{
+    ListCell   *lc;
+
+    foreach(lc, exprs)
+    {
+        Node       *nexpr = (Node *) lfirst(lc);
+        Oid            ntype = exprType(nexpr);
+
+        if (!can_coerce_type(1, &ntype, &common_type, COERCION_IMPLICIT))
+            return false;
+    }
+    return true;
+}
+
+/*
+ * verify_common_type_from_oids()
+ *        As above, but work from an array of type OIDs.
+ */
+static bool
+verify_common_type_from_oids(Oid common_type, int nargs, const Oid *typeids)
+{
+    for (int i = 0; i < nargs; i++)
+    {
+        if (!can_coerce_type(1, &typeids[i], &common_type, COERCION_IMPLICIT))
+            return false;
+    }
+    return true;
+}
+
 /*
  * select_common_typmod()
  *        Determine the common typmod of a list of input expressions.
@@ -1917,7 +1967,13 @@ check_generic_type_consistency(const Oid *actual_arg_types,
                                          true);

         if (!OidIsValid(anycompatible_typeid))
-            return false;        /* there's no common supertype */
+            return false;        /* there's definitely no common supertype */
+
+        /* We have to verify that the selected type actually works */
+        if (!verify_common_type_from_oids(anycompatible_typeid,
+                                          n_anycompatible_args,
+                                          anycompatible_actual_types))
+            return false;

         if (have_anycompatible_nonarray)
         {
@@ -2494,6 +2550,14 @@ enforce_generic_type_consistency(const Oid *actual_arg_types,
                                              anycompatible_actual_types,
                                              false);

+            /* We have to verify that the selected type actually works */
+            if (!verify_common_type_from_oids(anycompatible_typeid,
+                                              n_anycompatible_args,
+                                              anycompatible_actual_types))
+                ereport(ERROR,
+                        (errcode(ERRCODE_DATATYPE_MISMATCH),
+                         errmsg("arguments of anycompatible family cannot be cast to a common type")));
+
             if (have_anycompatible_array)
             {
                 anycompatible_array_typeid = get_array_type(anycompatible_typeid);
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index c4aaf37727..1c09ea24cd 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -1121,6 +1121,11 @@ transformAExprIn(ParseState *pstate, A_Expr *a)
         allexprs = list_concat(list_make1(lexpr), rnonvars);
         scalar_type = select_common_type(pstate, allexprs, NULL, NULL);

+        /* We have to verify that the selected type actually works */
+        if (OidIsValid(scalar_type) &&
+            !verify_common_type(scalar_type, allexprs))
+            scalar_type = InvalidOid;
+
         /*
          * Do we have an array type to use?  Aside from the case where there
          * isn't one, we don't risk using ScalarArrayOpExpr when the common
diff --git a/src/include/parser/parse_coerce.h b/src/include/parser/parse_coerce.h
index fe046a2c03..b105c7da90 100644
--- a/src/include/parser/parse_coerce.h
+++ b/src/include/parser/parse_coerce.h
@@ -70,6 +70,7 @@ extern Oid    select_common_type(ParseState *pstate, List *exprs,
 extern Node *coerce_to_common_type(ParseState *pstate, Node *node,
                                    Oid targetTypeId,
                                    const char *context);
+extern bool verify_common_type(Oid common_type, List *exprs);

 extern int32 select_common_typmod(ParseState *pstate, List *exprs, Oid common_type);

diff --git a/src/test/regress/expected/arrays.out b/src/test/regress/expected/arrays.out
index 3e3a1beaab..18e4e25bf4 100644
--- a/src/test/regress/expected/arrays.out
+++ b/src/test/regress/expected/arrays.out
@@ -747,6 +747,12 @@ SELECT ARRAY[1.1] || ARRAY[2,3,4];
  {1.1,2,3,4}
 (1 row)

+SELECT array_agg(x) || array_agg(x) FROM (VALUES (ROW(1,2)), (ROW(3,4))) v(x);
+             ?column?
+-----------------------------------
+ {"(1,2)","(3,4)","(1,2)","(3,4)"}
+(1 row)
+
 SELECT * FROM array_op_test WHERE i @> '{32}' ORDER BY seqno;
  seqno |                i                |                                                                 t
                                                       

-------+---------------------------------+------------------------------------------------------------------------------------------------------------------------------------
diff --git a/src/test/regress/expected/expressions.out b/src/test/regress/expected/expressions.out
index 6406fb3a76..84ab7771dd 100644
--- a/src/test/regress/expected/expressions.out
+++ b/src/test/regress/expected/expressions.out
@@ -232,6 +232,36 @@ explain (verbose, costs off) select * from bpchar_view

 rollback;
 --
+-- Ordinarily, IN/NOT IN can be converted to a ScalarArrayOpExpr
+-- with a suitably-chosen array type.
+--
+explain (verbose, costs off)
+select random() IN (1, 4, 8.0);
+                         QUERY PLAN
+------------------------------------------------------------
+ Result
+   Output: (random() = ANY ('{1,4,8}'::double precision[]))
+(2 rows)
+
+explain (verbose, costs off)
+select random()::int IN (1, 4, 8.0);
+                                QUERY PLAN
+---------------------------------------------------------------------------
+ Result
+   Output: (((random())::integer)::numeric = ANY ('{1,4,8.0}'::numeric[]))
+(2 rows)
+
+-- However, if there's not a common supertype for the IN elements,
+-- we should instead try to produce "x = v1 OR x = v2 OR ...".
+-- In most cases that'll fail for lack of all the requisite = operators,
+-- but it can succeed sometimes.  So this should complain about lack of
+-- an = operator, not about cast failure.
+select '(0,0)'::point in ('(0,0,0,0)'::box, point(0,0));
+ERROR:  operator does not exist: point = box
+LINE 1: select '(0,0)'::point in ('(0,0,0,0)'::box, point(0,0));
+                              ^
+HINT:  No operator matches the given name and argument types. You might need to add explicit type casts.
+--
 -- Tests for ScalarArrayOpExpr with a hashfn
 --
 -- create a stable function so that the tests below are not
diff --git a/src/test/regress/sql/arrays.sql b/src/test/regress/sql/arrays.sql
index 912233ef96..03e56bf924 100644
--- a/src/test/regress/sql/arrays.sql
+++ b/src/test/regress/sql/arrays.sql
@@ -317,6 +317,7 @@ SELECT ARRAY[[1,2],[3,4]] || ARRAY[5,6] AS "{{1,2},{3,4},{5,6}}";
 SELECT ARRAY[0,0] || ARRAY[1,1] || ARRAY[2,2] AS "{0,0,1,1,2,2}";
 SELECT 0 || ARRAY[1,2] || 3 AS "{0,1,2,3}";
 SELECT ARRAY[1.1] || ARRAY[2,3,4];
+SELECT array_agg(x) || array_agg(x) FROM (VALUES (ROW(1,2)), (ROW(3,4))) v(x);

 SELECT * FROM array_op_test WHERE i @> '{32}' ORDER BY seqno;
 SELECT * FROM array_op_test WHERE i && '{32}' ORDER BY seqno;
diff --git a/src/test/regress/sql/expressions.sql b/src/test/regress/sql/expressions.sql
index 03b88bde7a..755370f358 100644
--- a/src/test/regress/sql/expressions.sql
+++ b/src/test/regress/sql/expressions.sql
@@ -103,6 +103,22 @@ explain (verbose, costs off) select * from bpchar_view
 rollback;


+--
+-- Ordinarily, IN/NOT IN can be converted to a ScalarArrayOpExpr
+-- with a suitably-chosen array type.
+--
+explain (verbose, costs off)
+select random() IN (1, 4, 8.0);
+explain (verbose, costs off)
+select random()::int IN (1, 4, 8.0);
+-- However, if there's not a common supertype for the IN elements,
+-- we should instead try to produce "x = v1 OR x = v2 OR ...".
+-- In most cases that'll fail for lack of all the requisite = operators,
+-- but it can succeed sometimes.  So this should complain about lack of
+-- an = operator, not about cast failure.
+select '(0,0)'::point in ('(0,0,0,0)'::box, point(0,0));
+
+
 --
 -- Tests for ScalarArrayOpExpr with a hashfn
 --