Обсуждение: Questioning an errcode and message in jsonb.c

Поиск
Список
Период
Сортировка

Questioning an errcode and message in jsonb.c

От
Chapman Flack
Дата:
Hi,

This came up in [0] and opinions besides my own would be welcome.

There is a function cannotCastJsonbValue in jsonb.c, and it throws 
errors
of this form:

ERRCODE_INVALID_PARAMETER_VALUE "cannot cast jsonb %1$s to type %2$s"

where %1 is one of the possible JsonbValue types (null, string, numeric,
boolean, array, object, or "array or object" for jbvBinary). %2 is the 
name
of some SQL type.

I question the errcode because I do not see a lot of precedent for
ERRCODE_INVALID_PARAMETER_VALUE in this sort of context; it seems more
often used for a weird value of some behavioral parameter passed to
a function.

The bigger deal is I question the wording, because although calls to
this function are made from various jsonb_foo cast functions, the
conditions for calling it don't involve the SQL type foo. This message
only means that you don't have the type of JsonbValue you thought
you were going to cast to the SQL type. I think that's what it should 
say.

Let me lay out a little more of the picture, by contrasting the way 
these
jsonb casts work (which may be as specified in SQL/JSON, I don't have a
copy) with the way XMLCAST works in SQL/XML.

When you XMLCAST some XML value to some target SQL type TD, then there
is a corresponding XML Schema type XMLT chosen based on TD. For example,
if you are casting to SQL's SMALLINT, XMLT will be chosen as xs:integer.
There are then two things that happen in sequence:
1) whatever XML type you have is hit with the XQuery expression
    "cast as xs:integer", and then
2) the xs:integer is cast to SQL's SMALLINT and returned.

What our jsonb_foo casts do starts out the same way: based on
the target SQL type, there's a corresponding JsonbValue type
chosen. Target SQL type SMALLINT => jbvNumeric, for example.

But step 2 is not like the SQL/XML case: there is no attempt
to cast any other kind of JsonbValue to jbvNumeric. If the value
isn't already of that JSON type, it's an error. (It's like an
alternate-universe version of the SQL/XML rules, where the
XQuery "cast as" in step 1 is "treat as" instead.)

And then step 3 is unchanged: the JsonbValue of the expected
type (which it had to already be) is cast to the wanted SQL
type.

Consider these two examples:

select '"32768"'::jsonb::smallint;
INVALID_PARAMETER_VALUE cannot cast jsonb string to type smallint

select '32768'::jsonb::smallint;
NUMERIC_VALUE_OUT_OF_RANGE smallint out of range

The second message is clearly from step 3, the actual attempt
to cast a value to smallint, and is what you would expect.

The first message is from step 2, and it really only means
"jsonb string where jsonb numeric expected", but for whatever SQL
type you ask for that corresponds to jsonb numeric in step 2,
you get a custom version of the message phrased as "can't cast to"
your target SQL type instead. To me, that just disguises what is
really happening. (It's not a matter of "can't" cast "32768" to
32768, after all; it's a matter of "won't" do any casting in
step 2.)

It matters because the patch being discussed in [0] is
complexified by trying to produce a matching message; it
actually requires passing the ultimate wanted SQL type as an
extra argument to a function that has no other reason to
need it, and could easily produce a message like "jsonb string
where jsonb numeric expected" without it.

To me, when a situation like that crops up, it suggests that the
message is kind of misrepresenting the logic.

It would make me happy if the message could be changed, and maybe
ERRCODE_INVALID_PARAMETER_VALUE also changed, perhaps to one of
the JSON-specific ones in the 2203x range.

By the same token, the message and the errcode are established
current behavior, so there can be sound arguments against changing
them (even though that means weird logic in rewriting the expression).

Thoughts?

Regards,
-Chap

[0] 
https://www.postgresql.org/message-id/43a988594ac91a63dc4bb49a94303a42%40anastigmatix.net



Re: Questioning an errcode and message in jsonb.c

От
Andy Fan
Дата:
Hi, 

 Thanks for raising this issue in a more public way:) 

On Tue, Sep 19, 2023 at 12:55 AM Chapman Flack <chap@anastigmatix.net> wrote:

It would make me happy if the message could be changed, and maybe
ERRCODE_INVALID_PARAMETER_VALUE also changed, perhaps to one of
the JSON-specific ones in the 2203x range.

I'd agree with this. 
 
By the same token, the message and the errcode are established
current behavior, so there can be sound arguments against changing
them (even though that means weird logic in rewriting the expression).
 
This is not a technology issue,  I'd be pretty willing to see what some
more experienced people say about this.  I think just documenting the 
impatible behavior is an option as well.  

--
Best Regards
Andy Fan

Re: Questioning an errcode and message in jsonb.c

От
Peter Eisentraut
Дата:
On 18.09.23 18:55, Chapman Flack wrote:
> It would make me happy if the message could be changed, and maybe
> ERRCODE_INVALID_PARAMETER_VALUE also changed, perhaps to one of
> the JSON-specific ones in the 2203x range.

What is an example of a statement or function call that causes this 
error?  Then we can look in the SQL standard for guidance.



Re: Questioning an errcode and message in jsonb.c

От
Andy Fan
Дата:
Hi Peter,

On Wed, Sep 20, 2023 at 4:51 PM Peter Eisentraut <peter@eisentraut.org> wrote:
On 18.09.23 18:55, Chapman Flack wrote:
> It would make me happy if the message could be changed, and maybe
> ERRCODE_INVALID_PARAMETER_VALUE also changed, perhaps to one of
> the JSON-specific ones in the 2203x range.

What is an example of a statement or function call that causes this
error?  Then we can look in the SQL standard for guidance.
 
Thanks for showing interest in this.   The issue comes from this situation.

create table tb(a jsonb);

insert into tb select '{"a": "foo", "b": 100000000}';


select cast(a->'a' as numeric) from tb;

ERRCODE_INVALID_PARAMETER_VALUE  cannot cast jsonb string to type numeric

the call stack is:
0  in errstart of elog.c:351
1  in errstart_cold of elog.c:333
2  in cannotCastJsonbValue of jsonb.c:2033
3  in jsonb_numeric of jsonb.c:2063
4  in ExecInterpExpr of execExprInterp.c:758

select cast(a->'b' as int2) from tb;
NUMERIC_VALUE_OUT_OF_RANGE smallint out of range

the call stack is:
1  in errstart_cold of elog.c:333
2  in numeric_int2 of numeric.c:4503
3  in DirectFunctionCall1Coll of fmgr.c:785
4  in jsonb_int2 of jsonb.c:2086

There are 2 different errcode involved here and there are two different
functions that play part in it (jsonb_numeric and numeric_int2).  and
the error code jsonb_numeric used is improper as well. 

The difference is not very huge, but it would be cool if we can make 
it better,  If something really improves here, it will make the code in [0]
cleaner as well. the bad code in [0]:

+Datum
+jsonb_finish_numeric(PG_FUNCTION_ARGS)
+{
+ JsonbValue *v = (JsonbValue *)PG_GETARG_POINTER(0);
+ Oid final_oid = PG_GETARG_OID(1);
+ if (v->type != jbvNumeric)
+ cannotCastJsonbValue(v->type, format_type_be(final_oid));
+ PG_RETURN_NUMERIC(v->val.numeric);
+}

To match the error message in the older version, I have to input
a {final_oid} argument in jsonb_finish_numeric function which
is not good. 

As to how to redesign the error message is a bit confusing to
me, it would be good to see the proposal code as well. 

The only concern from me is that the new error from newer
version is not compatible with the older versions, which may matters
matters or doesn't match, I don't know. 

--
Best Regards
Andy Fan

Re: Questioning an errcode and message in jsonb.c

От
Chapman Flack
Дата:
On 2023-09-21 20:38, Andy Fan wrote:
> insert into tb select '{"a": "foo", "b": 100000000}';
> ...
> select cast(a->'a' as numeric) from tb;
> ERRCODE_INVALID_PARAMETER_VALUE  cannot cast jsonb string to type 
> numeric
> ...
> select cast(a->'b' as int2) from tb;
> NUMERIC_VALUE_OUT_OF_RANGE smallint out of range

... and perhaps driving home the point:

insert into tb select '{"a": "1", "b": 100000000}';
select cast(a->'a' as int2) from tb;
ERRCODE_INVALID_PARAMETER_VALUE  cannot cast jsonb string to type 
smallint

which illustrates that:

1) it is of no consequence whether the non-numeric JSON type of
the cast source is something that does or doesn't look castable to
numeric: in the first-step test that produces this message, the
only thing tested is whether the JSON type of the source is JSON
numeric. If it is not, there will be no attempt to cast it.

2) it is immaterial what the SQL target type of the cast is;
the message will misleadingly say "to smallint" if you are
casting to smallint, or "to double precision" if you are casting
to that, but the only thing that has been tested is whether the
source has JSON type numeric.

The message in this case only really means "JSON type string
where JSON type numeric needed".

The issue is fully general:

insert into tb select '{"a": 1}';
select cast(a->'a' as boolean) from tb;
ERRCODE_INVALID_PARAMETER_VALUE  cannot cast jsonb numeric to type 
boolean

Again, all that has been tested is whether the JSON type is
JSON boolean. If it is not, no effort is made to cast it, and
the message really only means "JSON type numeric where
JSON type boolean needed".

The most annoying cases are the ones where JSON type numeric
is needed, because of the several different SQL types that one
might want as the ultimate target type, so extra machinations
are needed to get this message to misleadingly mention that
ultimate type.

As I mentioned in my earlier message, the behavior here
differs from the exactly analogous specified behavior for
XMLCAST in SQL/XML. I am not saying the behavior here is
wrong; perhaps SQL/JSON has chosen to specify it differently
(I haven't got a copy). But I pointed out the difference as
it may help to pinpoint the relevant part of the spec.

In the SQL/XML XMLCAST, the same two-step process exists:
a first step that is only concerned with the XML Schema
type (say, is it xs:string or xs:decimal?), and a second
step where the right xs type is then cast to the wanted SQL type.

The difference is, XMLCAST in the first step will try to
cast a different xs type to the right xs type. By contrast
our JSON casting simply requires the JSON type to be the
right JSON type, or fails. And for all I know, that different
approach may be as specified in SQL/JSON.

But I would not have it use ERRCODE_INVALID_PARAMETER_VALUE,
or issue a message talking about the ultimate SQL type when the
only thing checked in that step is the JSON type ... unless
the spec really says to do so.

Regards,
-Chap



Re: Questioning an errcode and message in jsonb.c

От
Andy Fan
Дата:
Hi Chap,
 
As to how to redesign the error message is a bit confusing to
me, it would be good to see the proposal code as well. 

The only concern from me is that the new error from newer
version is not compatible with the older versions, which may matters
matters or doesn't match, I don't know. 


Do you mind providing the patch in your mind, and let's just ignore
the compatible issue for now.  I think that would be pretty helpful for
further discussion. 

--
Best Regards
Andy Fan

Re: Questioning an errcode and message in jsonb.c

От
Peter Eisentraut
Дата:
On 22.09.23 02:38, Andy Fan wrote:
> create table tb(a jsonb);
> 
> insert into tb select '{"a": "foo", "b": 100000000}';
> 
> 
> select cast(a->'a' as numeric) from tb;
> 
> ERRCODE_INVALID_PARAMETER_VALUE  cannot cast jsonb string to type numeric
> 
> the call stack is:
> 0  in errstart of elog.c:351
> 1  in errstart_cold of elog.c:333
> 2  in cannotCastJsonbValue of jsonb.c:2033
> 3  in jsonb_numeric of jsonb.c:2063
> 4  in ExecInterpExpr of execExprInterp.c:758
> 
> select cast(a->'b' as int2) from tb;
> NUMERIC_VALUE_OUT_OF_RANGE smallint out of range
> 
> the call stack is:
> 1  in errstart_cold of elog.c:333
> 2  in numeric_int2 of numeric.c:4503
> 3  in DirectFunctionCall1Coll of fmgr.c:785
> 4  in jsonb_int2 of jsonb.c:2086
> 
> There are 2 different errcode involved here and there are two different
> functions that play part in it (jsonb_numeric and numeric_int2).  and
> the error code jsonb_numeric used is improper as well.

This looks like an undesirable inconsistency.

You asked about the SQL standard.  The error code 
NUMERIC_VALUE_OUT_OF_RANGE appears as part of a failure of the <cast 
specification>.  The error code ERRCODE_INVALID_PARAMETER_VALUE appears 
only as part of processing host parameters in <externally-invoked 
procedure>.  Of course, in PostgreSQL, function calls and casts are 
related under the hood, so you could maybe make arguments for both.  But 
I think we already use ERRCODE_INVALID_PARAMETER_VALUE more broadly than 
the standard, so I would tend to prefer going in the direction of 
NUMERIC_VALUE_OUT_OF_RANGE when in doubt.

We could also consider these operators a special case of JSON_VALUE, in 
which case the following would apply:

"""
If IDT cannot be cast to target type DT according to the Syntax Rules of 
Subclause 6.13, “<cast specification>”, then let TEMPST be data 
exception — SQL/JSON item cannot be cast to target type (2203G).
"""

We do have a definition of this in errcodes.txt but don't use it 
anywhere.  Maybe the patches for SQL/JSON currently being reviewed will 
use it.