Re: SQL:2023 JSON simplified accessor support
| От | Peter Eisentraut |
|---|---|
| Тема | Re: SQL:2023 JSON simplified accessor support |
| Дата | |
| Msg-id | a8843d0a-8adb-4fdc-9ac8-8efd22f7d29c@eisentraut.org обсуждение исходный текст |
| Ответ на | Re: SQL:2023 JSON simplified accessor support (Alexandra Wang <alexandra.wang.oss@gmail.com>) |
| Список | pgsql-hackers |
On 24.09.25 03:05, Alexandra Wang wrote:
> Thanks for reviewing. I'm glad you like the new approach of
> introducing "transform_partial". I've attached v22, which addresses
> some of your feedback, and I ran pgindent again.
I have committed the patch 0001 (the additional tests).
I don't see the need for the refactoring in patch 0004. The new
function doesn't seem to be used anywhere else. Was this left over
from a previous patch version?
A small code style note: We don't use // comments. pgindent will
remove them, but not in ecpg pgc files. Please fix those in your
patch.
This feature introduces some incompatible changes. These need to be
called out more explicitly and documented. Or we should discuss them
first.
The problem is that
(foo).bar
could be a field reference or a function call. In case of ambiguity,
the field reference interpretation is preferred. This is documented
at
https://www.postgresql.org/docs/devel/rowtypes.html#ROWTYPES-USAGE
With the new feature, data types can define their own field reference
helper functions. But the problem is that this is not resolvable at
parse time. With row types, you can check at parse time what field
names exist, and you can resolve between field names and function
names. With the new feature, a type such as jsonb or hstore, because
of their dynamic nature, would effectively claim that all names are
possible field names, and so the function name interpretation would
never be applicable. This would effectively kill the function call
syntax for those types.
I think we need to think this through a bit more. I don't think we
can get away with just breaking this as is done in this patch. One
possibility would be that the function name interpretation would be
preferred for those types. I don't know.
Another problem is that the existing jsonb array subscripting is
different in some detail from the jsonpath semantics. Some examples
taken from the tests in the patch:
select ('123'::jsonb)[0];
jsonb
-------
(1 row)
select json_query('123'::jsonb, 'lax $[0]' with conditional array wrapper);
json_query
------------
123
(1 row)
--
select ('{"a": 1}'::jsonb)[0];
jsonb
-------
(1 row)
select json_query('{"a": 1}'::jsonb, 'lax $[0]' with conditional array
wrapper);
json_query
------------
{"a": 1}
(1 row)
--
select ('[1, "2", null]'::jsonb)[1][0];
jsonb
-------
(1 row)
select json_query('[1, "2", null]'::jsonb, 'lax $[1][0]' with
conditional array wrapper);
json_query
------------
"2"
(1 row)
AFAICT, the json_query() behavior is correct here according to the
jsonpath specification. But the jsonb subscripting apparently does
not use the "lax" interpretation. (But it also does not use the
"strict" interpretation either, because that should result in an
error?)
--
select ('[1, "2", null]'::jsonb)[1.0];
ERROR: subscript type numeric is not supported
select json_query('[1, "2", null]'::jsonb, 'lax $[1.0]' with conditional
array wrapper);
json_query
------------
"2"
(1 row)
The json_query() behavior appears to be nonstandard here. Out of
curiosity, I also checked
select json_query('[1, "2", null]'::jsonb, 'lax $[1.5]' with conditional
array wrapper);
json_query
------------
"2"
(1 row)
I don't know if this is intentional or sensible. Maybe we should not
allow this.
--
select ('[1, "2", null]'::jsonb)[-2];
jsonb
-------
"2"
(1 row)
select json_query('[1, "2", null]'::jsonb, 'lax $[-2]' with conditional
array wrapper);
json_query
------------
(1 row)
Negative subscripts are a PostgreSQL extension, but it might make
sense if they worked consistently.
--
Since the claim and point of this feature is that the subscripting
should be equivalent to jsonpath, we should work through these cases
and figure out what to do with them. (Maybe change some existing
behavior, or at least document.)
Btw., I noticed a small problem in the SQL standard text. Your commit
message says that the simplified accessor expressions are equivalent
to
JSON_QUERY (VEP, 'lax $.JC' WITH CONDITIONAL ARRAY WRAPPER NULL ON
EMPTY NULL ON ERROR)
but I think it should be 'lax $JC' without the period.
About the patch itself, I don't understand the split between patch
0002 (Add an alternative transform function in SubscriptRoutines) and
the rest. Patch 0002 introduces the .transform_partial callback
function but doesn't explain why one might want to use it. There is
no example or test case given. The patch changes the jsonb type to
use that new callback, but this doesn't appear to result in any change
of behavior. This needs to be clarified. Moreover, the comment in
the patch says that the function can handle field references (String
nodes), but AFAICT that functionality is actually introduced in patch
0005.
I'm confused that patch 0005 contains executor changes specific to
jsonpath. The point of having this callback API is that this
functionality is generalized and handled by the callback functions
provided by the types. If this is not sufficient to achieve the
functionality, maybe the API needs to be enhanced further.
I think we need to find a way to take smaller incremental steps on
this whole feature set. For example, earlier in the thread, a patch
was proposed to make the hstore type use this. The hstore type is a
much simpler situation, and it's an extension, so this would be a good
way to build out the API and work out issues related to ambiguities
about field names vs function names and so on. And then build the
jsonb functionality on top of that, while also having worked out the
jsonpath semantics issues.
В списке pgsql-hackers по дате отправления: