Обсуждение: jsonpath versus NaN

Поиск
Список
Период
Сортировка

jsonpath versus NaN

От
Tom Lane
Дата:
Commit 72b646033 inserted this into convertJsonbScalar:

             break;

         case jbvNumeric:
+            /* replace numeric NaN with string "NaN" */
+            if (numeric_is_nan(scalarVal->val.numeric))
+            {
+                appendToBuffer(buffer, "NaN", 3);
+                *jentry = 3;
+                break;
+            }
+
             numlen = VARSIZE_ANY(scalarVal->val.numeric);
             padlen = padBufferToInt(buffer);

To characterize this as hack, slash, and burn programming would be
charitable.  It is entirely clear from the code, the documentation,
and the relevant RFCs that JSONB does not allow NaNs as numeric
values.  So it should be impossible for this code to do anything.

I tried taking it out, and found that this test case from the same
commit fails:

+select jsonb_path_query('"nan"', '$.double()');
+ jsonb_path_query
+------------------
+ "NaN"
+(1 row)

However, seeing that the JSON RFC disallows NaNs, I do not understand
why it's important to accept this.  The adjacent test case showing
that 'inf' isn't accepted:

+select jsonb_path_query('"inf"', '$.double()');
+ERROR:  non-numeric SQL/JSON item
+DETAIL:  jsonpath item method .double() can only be applied to a numeric value

seems like a saner approach.

In short, I think we should rip out the above code snippet and adjust
executeItemOptUnwrapTarget, at about line jsonpath_exec.c:1076 as of HEAD,
to reject NaNs the same way it already rejects infinities.  Can you
explain why it was done like this?

(The reason I came across this was that I'm working on extending
type numeric to allow infinities, and it was not clear what to
do here.  But allowing a jsonb to contain a numeric NaN, even
transiently, seems like a completely horrid idea.)

            regards, tom lane



Re: jsonpath versus NaN

От
Alexander Korotkov
Дата:
Hi Tom,

Thank you for raising this issue.

On Thu, Jun 11, 2020 at 3:45 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Commit 72b646033 inserted this into convertJsonbScalar:
>
>                         break;
>
>                 case jbvNumeric:
> +                       /* replace numeric NaN with string "NaN" */
> +                       if (numeric_is_nan(scalarVal->val.numeric))
> +                       {
> +                               appendToBuffer(buffer, "NaN", 3);
> +                               *jentry = 3;
> +                               break;
> +                       }
> +
>                         numlen = VARSIZE_ANY(scalarVal->val.numeric);
>                         padlen = padBufferToInt(buffer);
>
> To characterize this as hack, slash, and burn programming would be
> charitable.  It is entirely clear from the code, the documentation,
> and the relevant RFCs that JSONB does not allow NaNs as numeric
> values.

The JSONB itself doesn't store number NaNs.  It stores the string "NaN".

I found the relevant part of the standard.  Unfortunately, I can't
post the full standard here due to its license, but I think I can cite
the relevant part.

1) If JM specifies double, then For all j, 1 (one) ≤ j ≤ n,
Case:
a) If Ij is not a number or character string, then let ST be data
exception — non-numeric SQL/JSON item.
b) Otherwise, let X be an SQL variable whose value is Ij. Let Vj be
the result of
CAST (X AS DOUBLE PRECISION)
If this conversion results in an exception condition, then let ST be
that exception condition.

So, when we apply the .double() method to string, then the result
should be the same as if we cast string to double in SQL.  In SQL we
convert string 'NaN' to numeric NaN.  So, standard requires us to do
the same in SQL/JSON.

I didn't find yet what the standard says about serializing NaNs back
to JSON.  I'll keep you posted.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath versus NaN

От
Tom Lane
Дата:
Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> On Thu, Jun 11, 2020 at 3:45 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> It is entirely clear from the code, the documentation,
>> and the relevant RFCs that JSONB does not allow NaNs as numeric
>> values.

> The JSONB itself doesn't store number NaNs.  It stores the string "NaN".

Yeah, but you have a numeric NaN within the JsonbValue tree between
executeItemOptUnwrapTarget and convertJsonbScalar.  Who's to say that
that illegal-per-the-data-type structure won't escape to somewhere else?
Or perhaps more likely, that we'll need additional warts in other random
places in the JSON code to keep from spitting up on the transiently
invalid structure.

> I found the relevant part of the standard.  Unfortunately, I can't
> post the full standard here due to its license, but I think I can cite
> the relevant part.

I don't think this is very relevant.  The SQL standard has not got the
concepts of Inf or NaN either (see 4.4.2 Characteristics of numbers),
therefore their definition is only envisioning that a string representing
a normal finite number should be castable to DOUBLE PRECISION.  Thus,
both of the relevant standards think that "numbers" are just finite
numbers.

So when neither JSON nor SQL consider that "NaN" is an allowed sort
of number, why are you doing violence to the code to allow it in a
jsonpath?  And if you insist on doing such violence, why didn't you
do some more and kluge it to the point where "Inf" would work too?
(It would require slightly less klugery in the wake of the infinities-
in-numeric patch that I'm going to post soon ... but that doesn't make
it a good idea.)

            regards, tom lane



Re: jsonpath versus NaN

От
Alexander Korotkov
Дата:
On Thu, Jun 11, 2020 at 10:00 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> > On Thu, Jun 11, 2020 at 3:45 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> It is entirely clear from the code, the documentation,
> >> and the relevant RFCs that JSONB does not allow NaNs as numeric
> >> values.
>
> > The JSONB itself doesn't store number NaNs.  It stores the string "NaN".
>
> Yeah, but you have a numeric NaN within the JsonbValue tree between
> executeItemOptUnwrapTarget and convertJsonbScalar.  Who's to say that
> that illegal-per-the-data-type structure won't escape to somewhere else?
> Or perhaps more likely, that we'll need additional warts in other random
> places in the JSON code to keep from spitting up on the transiently
> invalid structure.


I would propose to split two things: user-visible behavior and
internal implementation.  Internal implementation, which allows
numeric NaN within the JsonbValue, isn't perfect and we could improve
it.  But I'd like to determine desired user-visible behavior first,
then we can decide how to fix the implementation.

>
> > I found the relevant part of the standard.  Unfortunately, I can't
> > post the full standard here due to its license, but I think I can cite
> > the relevant part.
>
> I don't think this is very relevant.  The SQL standard has not got the
> concepts of Inf or NaN either (see 4.4.2 Characteristics of numbers),
> therefore their definition is only envisioning that a string representing
> a normal finite number should be castable to DOUBLE PRECISION.  Thus,
> both of the relevant standards think that "numbers" are just finite
> numbers.
>
> So when neither JSON nor SQL consider that "NaN" is an allowed sort
> of number, why are you doing violence to the code to allow it in a
> jsonpath?

Yes, I see.  No standard insists we should support NaN.  However,
standard claims .double() should behave the same as CAST to double.
So, I think if CAST supports NaN, but .double() doesn't, it's still a
violation.

> And if you insist on doing such violence, why didn't you
> do some more and kluge it to the point where "Inf" would work too?


Yep, according to standard .double() should support "Inf" as soon as
CAST to double does.  The reason why it wasn't implemented is that we
use numeric as the internal storage for all the numbers. And numeric
doesn't support Inf yet.

> (It would require slightly less klugery in the wake of the infinities-
> in-numeric patch that I'm going to post soon ... but that doesn't make
> it a good idea.)


If numerics would support infinites, we can follow standard and make
.double() method work the same way as CAST to double does.  Now, I get
that there is no much reason to keep current behaviour, which supports
Nan, but doesn't support Inf.  I think we should either support both
NaN and Inf and don't support any of them.  The latter is a violation
of the standard, but provides us with a simpler and cleaner
implementation.  What do you think?

BTW, we found what the standard says about serialization of SQL/JSON items.

9.37 Serializing an SQL/JSON item (page 695)
ii) Let JV be an implementation-dependent value of type TT and
encoding ENC such that these two conditions hold:
1) JV is a JSON text.
2) When applying the General Rules of Subclause 9.36, “Parsing JSON
text” with JV as JSON TEXT, FO as FORMAT OPTION, and WITHOUT UNIQUE
KEYS as UNIQUENESS CONSTRAINT, the returned STATUS is successful
completion and the returned SQL/JSON ITEM is an SQL/JSON item that is
equivalent to SJI.
If there is no such JV, then let ST be the exception condition: data
exception — invalid JSON text.

Basically it says that the resulting text should result in the same
SQL/JSON item when parsed.  I think this literally means that
serialization of numeric NaN is impossible as soon as it's impossible
to get numeric NaN as the result json parsing.  However, in the same
way this would mean that serialization of datetime is also impossible,
but that seems like nonsense.  So, I think this paragraph of the
standard is ill-conceived.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath versus NaN

От
Tom Lane
Дата:
Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> On Thu, Jun 11, 2020 at 10:00 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I don't think this is very relevant.  The SQL standard has not got the
>> concepts of Inf or NaN either (see 4.4.2 Characteristics of numbers),
>> therefore their definition is only envisioning that a string representing
>> a normal finite number should be castable to DOUBLE PRECISION.  Thus,
>> both of the relevant standards think that "numbers" are just finite
>> numbers.

> Yes, I see.  No standard insists we should support NaN.  However,
> standard claims .double() should behave the same as CAST to double.
> So, I think if CAST supports NaN, but .double() doesn't, it's still a
> violation.

No, I think you are completely misunderstanding the standard.  They
are saying that strings that look like legal numbers according to SQL
should be castable into numbers.  But NaN and Inf are not legal
numbers according to SQL, so there is nothing in that text that
justifies accepting "NaN".  Nor does the JSON standard provide any
support for that position.  So I think it is fine to leave NaN/Inf
out of the world of what you can write in jsonpath.

I'd be more willing to let the code do this if it didn't require such
a horrid, dangerous kluge to do so.  But it does, and I don't see any
easy way around that, so I think we should just take out the kluge.
And do so sooner not later, before some misguided user starts to
depend on it.

            regards, tom lane



Re: jsonpath versus NaN

От
Oleg Bartunov
Дата:


On Wed, Jun 17, 2020 at 6:33 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> On Thu, Jun 11, 2020 at 10:00 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I don't think this is very relevant.  The SQL standard has not got the
>> concepts of Inf or NaN either (see 4.4.2 Characteristics of numbers),
>> therefore their definition is only envisioning that a string representing
>> a normal finite number should be castable to DOUBLE PRECISION.  Thus,
>> both of the relevant standards think that "numbers" are just finite
>> numbers.

> Yes, I see.  No standard insists we should support NaN.  However,
> standard claims .double() should behave the same as CAST to double.
> So, I think if CAST supports NaN, but .double() doesn't, it's still a
> violation.

No, I think you are completely misunderstanding the standard.  They
are saying that strings that look like legal numbers according to SQL
should be castable into numbers.  But NaN and Inf are not legal
numbers according to SQL, so there is nothing in that text that
justifies accepting "NaN".  Nor does the JSON standard provide any
support for that position.  So I think it is fine to leave NaN/Inf
out of the world of what you can write in jsonpath.
 
rfc and sql json forbid Nan and Inf ( Technical Report is freely available,

Page 10 JSON terminology.
“A sequence comprising an integer part, optionally followed by a fractional part and/or
an exponent part (non-numeric values, such as infinity and NaN are not permitted)”
 

I'd be more willing to let the code do this if it didn't require such
a horrid, dangerous kluge to do so.  But it does, and I don't see any
easy way around that, so I think we should just take out the kluge.
And do so sooner not later, before some misguided user starts to
depend on it.

The problem is that we tried to find a trade-off  between standard and postgres
implementation, for example, in postgres CAST  allows NaN and Inf, and SQL Standard
requires .double should works as CAST.

 SELECT 'nan'::real, 'inf'::real;
 float4 |  float4
--------+----------
    NaN | Infinity
(1 row)

 

                        regards, tom lane




--
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: jsonpath versus NaN

От
Tom Lane
Дата:
Oleg Bartunov <obartunov@postgrespro.ru> writes:
> The problem is that we tried to find a trade-off between standard and
> postgres implementation, for example, in postgres CAST allows NaN and
> Inf, and SQL Standard requires .double should works as CAST.

As I said, I think this is a fundamental misreading of the standard.
The way I read it is that it requires the set of values that are legal
according to the standard to be processed the same way as CAST would.

While we certainly *could* choose to extend jsonpath, and/or jsonb
itself, to allow NaN/Inf, I do not think that it's sane to argue that
the standard requires us to do that; the wording in the opposite
direction is pretty clear.  Also, I do not find it convincing to
extend jsonpath that way when we haven't extended jsonb.  Quite aside
from the ensuing code warts, what in the world is the use-case?

            regards, tom lane



Re: jsonpath versus NaN

От
Robert Haas
Дата:
On Thu, Jun 18, 2020 at 11:51 AM Oleg Bartunov <obartunov@postgrespro.ru> wrote:
> The problem is that we tried to find a trade-off  between standard and postgres
> implementation, for example, in postgres CAST  allows NaN and Inf, and SQL Standard
> requires .double should works as CAST.

It seems like the right thing is to implement the standard, not to
implement whatever PostgreSQL happens to do in other cases. I can't
help feeling like re-using the numeric data type for other things has
led to this confusion. I think that fails in other cases, too: like
what if you have a super-long integer that can't be represented as a
numeric? I bet jsonb will fail, or maybe it will convert it to a
string, but I don't see how it can do anything else.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: jsonpath versus NaN

От
Alexander Korotkov
Дата:
Tom,

On Thu, Jun 18, 2020 at 7:07 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Oleg Bartunov <obartunov@postgrespro.ru> writes:
> > The problem is that we tried to find a trade-off between standard and
> > postgres implementation, for example, in postgres CAST allows NaN and
> > Inf, and SQL Standard requires .double should works as CAST.
>
> As I said, I think this is a fundamental misreading of the standard.
> The way I read it is that it requires the set of values that are legal
> according to the standard to be processed the same way as CAST would.

Thank you for your answer. I'm trying to understand your point.
Standard claims that .double() method should behave the same way as
CAST to double.  However, standard references the standard behavior of
CAST here, not behavior of your implementation of CAST.  So, if we
extend the functionality of standard CAST in our implementation, that
doesn't automatically mean we should extend the .double() jsonpath
method in the same way.  Is it correct?

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath versus NaN

От
Tom Lane
Дата:
Robert Haas <robertmhaas@gmail.com> writes:
> On Thu, Jun 18, 2020 at 11:51 AM Oleg Bartunov <obartunov@postgrespro.ru> wrote:
>> The problem is that we tried to find a trade-off  between standard and postgres
>> implementation, for example, in postgres CAST  allows NaN and Inf, and SQL Standard
>> requires .double should works as CAST.

> It seems like the right thing is to implement the standard, not to
> implement whatever PostgreSQL happens to do in other cases. I can't
> help feeling like re-using the numeric data type for other things has
> led to this confusion. I think that fails in other cases, too: like
> what if you have a super-long integer that can't be represented as a
> numeric? I bet jsonb will fail, or maybe it will convert it to a
> string, but I don't see how it can do anything else.

Actually, the JSON spec explicitly says that any number that doesn't fit
in an IEEE double isn't portable [1].  So we're already very far above and
beyond the spec's requirements by using numeric.  We don't need to improve
on that.  But I concur with your point that just because PG does X in
some other cases doesn't mean that we must do X in json or jsonpath.

            regards, tom lane

[1] https://tools.ietf.org/html/rfc7159#page-6

   This specification allows implementations to set limits on the range
   and precision of numbers accepted.  Since software that implements
   IEEE 754-2008 binary64 (double precision) numbers [IEEE754] is
   generally available and widely used, good interoperability can be
   achieved by implementations that expect no more precision or range
   than these provide, in the sense that implementations will
   approximate JSON numbers within the expected precision.  A JSON
   number such as 1E400 or 3.141592653589793238462643383279 may indicate
   potential interoperability problems, since it suggests that the
   software that created it expects receiving software to have greater
   capabilities for numeric magnitude and precision than is widely
   available.

   Note that when such software is used, numbers that are integers and
   are in the range [-(2**53)+1, (2**53)-1] are interoperable in the
   sense that implementations will agree exactly on their numeric
   values.



Re: jsonpath versus NaN

От
Alexander Korotkov
Дата:
On Thu, Jun 18, 2020 at 7:34 PM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> Thank you for your answer. I'm trying to understand your point.
> Standard claims that .double() method should behave the same way as
> CAST to double.  However, standard references the standard behavior of
> CAST here, not behavior of your implementation of CAST.

Typo here: please read "our implementation of CAST" here.

> So, if we
> extend the functionality of standard CAST in our implementation, that
> doesn't automatically mean we should extend the .double() jsonpath
> method in the same way.  Is it correct?


------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath versus NaN

От
Tom Lane
Дата:
Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> Thank you for your answer. I'm trying to understand your point.
> Standard claims that .double() method should behave the same way as
> CAST to double.  However, standard references the standard behavior of
> CAST here, not behavior of your implementation of CAST.  So, if we
> extend the functionality of standard CAST in our implementation, that
> doesn't automatically mean we should extend the .double() jsonpath
> method in the same way.  Is it correct?

Right.  We could, if we chose, extend jsonpath to allow Inf/NaN, but
I don't believe there's an argument that the spec requires us to.

Also the larger point is that it doesn't make sense to extend jsonpath
that way when we haven't extended json(b) that way.  This code wart
wouldn't exist were it not for that inconsistency.  Also, I find it hard
to see why anyone would have a use for NaN in a jsonpath test when they
can't write NaN in the input json data, nor have it be correctly reflected
into output json data either.

Maybe there's a case for extending json(b) that way; it wouldn't be so
different from the work I'm doing nearby to extend type numeric for
infinities.  But we'd have to have a conversation about whether
interoperability with other JSON implementations is worth sacrificing
to improve consistency with our float and numeric datatypes.  In the
meantime, though, we aren't allowing Inf/NaN in json(b) so I don't think
jsonpath should accept them either.

            regards, tom lane



Re: jsonpath versus NaN

От
Alexander Korotkov
Дата:
On Thu, Jun 18, 2020 at 7:45 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> > Thank you for your answer. I'm trying to understand your point.
> > Standard claims that .double() method should behave the same way as
> > CAST to double.  However, standard references the standard behavior of
> > CAST here, not behavior of your implementation of CAST.  So, if we
> > extend the functionality of standard CAST in our implementation, that
> > doesn't automatically mean we should extend the .double() jsonpath
> > method in the same way.  Is it correct?
>
> Right.  We could, if we chose, extend jsonpath to allow Inf/NaN, but
> I don't believe there's an argument that the spec requires us to.
>
> Also the larger point is that it doesn't make sense to extend jsonpath
> that way when we haven't extended json(b) that way.  This code wart
> wouldn't exist were it not for that inconsistency.  Also, I find it hard
> to see why anyone would have a use for NaN in a jsonpath test when they
> can't write NaN in the input json data, nor have it be correctly reflected
> into output json data either.

Ok, I got the point.  I have nothing against removing support of NaN
in jsonpath as far as it doesn't violates the standard.  I'm going to
write the patch for this.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath versus NaN

От
Andrew Dunstan
Дата:
On 6/18/20 12:35 PM, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Thu, Jun 18, 2020 at 11:51 AM Oleg Bartunov <obartunov@postgrespro.ru> wrote:
>>> The problem is that we tried to find a trade-off  between standard and postgres
>>> implementation, for example, in postgres CAST  allows NaN and Inf, and SQL Standard
>>> requires .double should works as CAST.
>> It seems like the right thing is to implement the standard, not to
>> implement whatever PostgreSQL happens to do in other cases. I can't
>> help feeling like re-using the numeric data type for other things has
>> led to this confusion. I think that fails in other cases, too: like
>> what if you have a super-long integer that can't be represented as a
>> numeric? I bet jsonb will fail, or maybe it will convert it to a
>> string, but I don't see how it can do anything else.
> Actually, the JSON spec explicitly says that any number that doesn't fit
> in an IEEE double isn't portable [1].  So we're already very far above and
> beyond the spec's requirements by using numeric.  We don't need to improve
> on that.  But I concur with your point that just because PG does X in
> some other cases doesn't mean that we must do X in json or jsonpath.
>
>             regards, tom lane
>
> [1] https://tools.ietf.org/html/rfc7159#page-6
>
>    This specification allows implementations to set limits on the range
>    and precision of numbers accepted.  Since software that implements
>    IEEE 754-2008 binary64 (double precision) numbers [IEEE754] is
>    generally available and widely used, good interoperability can be
>    achieved by implementations that expect no more precision or range
>    than these provide, in the sense that implementations will
>    approximate JSON numbers within the expected precision.  A JSON
>    number such as 1E400 or 3.141592653589793238462643383279 may indicate
>    potential interoperability problems, since it suggests that the
>    software that created it expects receiving software to have greater
>    capabilities for numeric magnitude and precision than is widely
>    available.
>
>    Note that when such software is used, numbers that are integers and
>    are in the range [-(2**53)+1, (2**53)-1] are interoperable in the
>    sense that implementations will agree exactly on their numeric
>    values.
>


Just to complete the historical record, that standard wasn't published
at the time we created the JSON type, and the then existing standard
(rfc4627) contains no such statement. We felt it was important to be
able to represent any Postgres data value in as natural a manner as
possible given the constraints of JSON. rfc7159 was published just as we
were finalizing 9.4 with JSONB, although I'm not sure it made a heavy
impact on our consciousness. If it had I would still not have wanted to
impose any additional limitation on numerics. If you want portable
numbers cast the numeric to double before producing the JSON.

ISTR having a conversation about the extended use of jsonb in jsonpath a
while back, although I don't remember if that was on or off list. I know
it troubled me some.


cheers


andrew



-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: jsonpath versus NaN

От
Alexander Korotkov
Дата:
On Thu, Jun 18, 2020 at 8:04 PM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> On Thu, Jun 18, 2020 at 7:45 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> > > Thank you for your answer. I'm trying to understand your point.
> > > Standard claims that .double() method should behave the same way as
> > > CAST to double.  However, standard references the standard behavior of
> > > CAST here, not behavior of your implementation of CAST.  So, if we
> > > extend the functionality of standard CAST in our implementation, that
> > > doesn't automatically mean we should extend the .double() jsonpath
> > > method in the same way.  Is it correct?
> >
> > Right.  We could, if we chose, extend jsonpath to allow Inf/NaN, but
> > I don't believe there's an argument that the spec requires us to.
> >
> > Also the larger point is that it doesn't make sense to extend jsonpath
> > that way when we haven't extended json(b) that way.  This code wart
> > wouldn't exist were it not for that inconsistency.  Also, I find it hard
> > to see why anyone would have a use for NaN in a jsonpath test when they
> > can't write NaN in the input json data, nor have it be correctly reflected
> > into output json data either.
>
> Ok, I got the point.  I have nothing against removing support of NaN
> in jsonpath as far as it doesn't violates the standard.  I'm going to
> write the patch for this.

The patchset is attached, sorry for the delay.

The first patch improves error messages, which appears to be unclear
for me.  If one applies .double() method to a numeric value, we
restrict that this numeric value should fit to double precision type.
If it doesn't fit, the current error message just says the following.

ERROR: jsonpath item method .double() can only be applied to a numeric value

But that's confusing, because .double() method is naturally applied to
a numeric value.  Patch makes this message explicitly report that
numeric value is out of range for double type.  This patch also adds
test exercising this error.  When string can't be converted to double
precision, I think it's better to explicitly say that we expected
valid string representation of double precision type.

Second patch forbids to convert NaN using .double() method.  As I get,
NaN can't be result of any jsonpath computations assuming there is no
NaN input.  So, I just put an assert to convertJsonbScalar() ensuring
there is no NaN in JsonbValue.

------
Regards,
Alexander Korotkov

Вложения

Re: jsonpath versus NaN

От
Alexander Korotkov
Дата:
On Mon, Jul 6, 2020 at 3:19 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> The patchset is attached, sorry for the delay.
>
> The first patch improves error messages, which appears to be unclear
> for me.  If one applies .double() method to a numeric value, we
> restrict that this numeric value should fit to double precision type.
> If it doesn't fit, the current error message just says the following.
>
> ERROR: jsonpath item method .double() can only be applied to a numeric value
>
> But that's confusing, because .double() method is naturally applied to
> a numeric value.  Patch makes this message explicitly report that
> numeric value is out of range for double type.  This patch also adds
> test exercising this error.  When string can't be converted to double
> precision, I think it's better to explicitly say that we expected
> valid string representation of double precision type.
>
> Second patch forbids to convert NaN using .double() method.  As I get,
> NaN can't be result of any jsonpath computations assuming there is no
> NaN input.  So, I just put an assert to convertJsonbScalar() ensuring
> there is no NaN in JsonbValue.

I'm going to push 0002 if there is no objection.

Regarding 0001, I think my new error messages need review.

------
Regards,
Alexander Korotkov



Re: jsonpath versus NaN

От
Tom Lane
Дата:
Alexander Korotkov <aekorotkov@gmail.com> writes:
> I'm going to push 0002 if there is no objection.
> Regarding 0001, I think my new error messages need review.

I do intend to review these, just didn't get to it yet.

            regards, tom lane



Re: jsonpath versus NaN

От
Alexander Korotkov
Дата:
On Wed, Jul 8, 2020 at 1:16 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alexander Korotkov <aekorotkov@gmail.com> writes:
> > I'm going to push 0002 if there is no objection.
> > Regarding 0001, I think my new error messages need review.
>
> I do intend to review these, just didn't get to it yet.

OK, that you for noticing.  I wouldn't push anything before your review.

------
Regards,
Alexander Korotkov



Re: jsonpath versus NaN

От
Tom Lane
Дата:
Alexander Korotkov <aekorotkov@gmail.com> writes:
> The patchset is attached, sorry for the delay.

> The first patch improves error messages, which appears to be unclear
> for me.  If one applies .double() method to a numeric value, we
> restrict that this numeric value should fit to double precision type.
> If it doesn't fit, the current error message just says the following.

> ERROR: jsonpath item method .double() can only be applied to a numeric value

> But that's confusing, because .double() method is naturally applied to
> a numeric value.  Patch makes this message explicitly report that
> numeric value is out of range for double type.  This patch also adds
> test exercising this error.  When string can't be converted to double
> precision, I think it's better to explicitly say that we expected
> valid string representation of double precision type.

I see your point here, but the English of the replacement error messages
could be improved.  I suggest respectively

numeric argument of jsonpath item method .%s() is out of range for type double precision

string argument of jsonpath item method .%s() is not a valid representation of a double precision number

As for 0002, I'd rather see the convertJsonbScalar() code changed back
to the way it was, ie just

         case jbvNumeric:
             numlen = VARSIZE_ANY(scalarVal->val.numeric);
             padlen = padBufferToInt(buffer);
            ...

There is no argument for having an its-not-NaN assertion here when
there aren't similar assertions throughout the jsonb code.

Also, it seems like it'd be smart to reject isinf() and isnan() results
from float8in_internal_opt_error in both executeItemOptUnwrapTarget code
paths, ie numeric source as well as string source.  Yeah, we don't expect
to see those cases in a jbvNumeric (so I wouldn't change the error message
text), but it's cheap insurance.

No other comments.

            regards, tom lane



Re: jsonpath versus NaN

От
Alexander Korotkov
Дата:
On Thu, Jul 9, 2020 at 1:20 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alexander Korotkov <aekorotkov@gmail.com> writes:
> > The patchset is attached, sorry for the delay.
>
> > The first patch improves error messages, which appears to be unclear
> > for me.  If one applies .double() method to a numeric value, we
> > restrict that this numeric value should fit to double precision type.
> > If it doesn't fit, the current error message just says the following.
>
> > ERROR: jsonpath item method .double() can only be applied to a numeric value
>
> > But that's confusing, because .double() method is naturally applied to
> > a numeric value.  Patch makes this message explicitly report that
> > numeric value is out of range for double type.  This patch also adds
> > test exercising this error.  When string can't be converted to double
> > precision, I think it's better to explicitly say that we expected
> > valid string representation of double precision type.
>
> I see your point here, but the English of the replacement error messages
> could be improved.  I suggest respectively
>
> numeric argument of jsonpath item method .%s() is out of range for type double precision
>
> string argument of jsonpath item method .%s() is not a valid representation of a double precision number

Good, thank you for corrections!

> As for 0002, I'd rather see the convertJsonbScalar() code changed back
> to the way it was, ie just
>
>                 case jbvNumeric:
>                         numlen = VARSIZE_ANY(scalarVal->val.numeric);
>                         padlen = padBufferToInt(buffer);
>                         ...
>
> There is no argument for having an its-not-NaN assertion here when
> there aren't similar assertions throughout the jsonb code.
>
> Also, it seems like it'd be smart to reject isinf() and isnan() results
> from float8in_internal_opt_error in both executeItemOptUnwrapTarget code
> paths, ie numeric source as well as string source.  Yeah, we don't expect
> to see those cases in a jbvNumeric (so I wouldn't change the error message
> text), but it's cheap insurance.

OK, corrected as you proposed.

> No other comments.

Revised patches are attached.

I understand both patches as fixes and propose to backpatch them to 12
if no objections.

------
Regards,
Alexander Korotkov

Вложения

Re: jsonpath versus NaN

От
Alexander Korotkov
Дата:
On Thu, Jul 9, 2020 at 4:04 AM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> I understand both patches as fixes and propose to backpatch them to 12
> if no objections.

Both patches are pushed.

------
Regards,
Alexander Korotkov



Re: jsonpath versus NaN

От
Tom Lane
Дата:
Alexander Korotkov <aekorotkov@gmail.com> writes:
> On Thu, Jul 9, 2020 at 4:04 AM Alexander Korotkov <aekorotkov@gmail.com> wrote:
>> I understand both patches as fixes and propose to backpatch them to 12
>> if no objections.

> Both patches are pushed.

Thanks for taking care of that!

            regards, tom lane