Re: jsonpath

Поиск
Список
Период
Сортировка
От Nikita Glukhov
Тема Re: jsonpath
Дата
Msg-id 9a175c2d-b0a9-475f-078b-febc185282ff@postgrespro.ru
обсуждение исходный текст
Ответ на Re: jsonpath  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Ответы Re: jsonpath  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Re: jsonpath  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Список pgsql-hackers
On 29.10.2018 2:20, Tomas Vondra wrote:
On 10/02/2018 04:33 AM, Michael Paquier wrote:
On Sat, Sep 08, 2018 at 02:21:27AM +0300, Nikita Glukhov wrote:
Attached 18th version of the patches rebased onto the current master.
Nikita, this version fails to apply, as 0004 has conflicts with some
regression tests.  Could you rebase?  I am moving the patch to CF
2018-11, waiting for your input.
--
Michael

As Michael mentioned, the patch does not apply anymore, so it would be
good to provide a rebased version for CF 2018-11. I've managed to do
that, as the issues are due to minor bitrot, so that I can do some quick
review of the current version.
Sorry that I failed to provide rebased version earlier. 

Attached 19th version of the patches rebased onto the current master.
I haven't done much testing, pretty much just compiling, running the
usual regression tests and reading through the patches. I plan to do
more testing once the rebased patch is submitted.

Now, a couple of comments based on eye-balling the diffs.
Thank you for your review.
1) There are no docs, which makes it pretty much non-committable for
now. I know there is [1] and it was a good intro for the review, but we
need something as a part of our sgml docs.
I think that jsonpath part of documentation can be extracted from [2] and
added to the patch set.

2) 0001 says this:
   *typmod = -1; /* TODO implement FF1, ..., FF9 */

but I have no idea what FF1 or FF9 are. I guess it needs to be
implemented, or explained better.
FF1-FF9 are standard datetime template fields used for specifying of fractional
seconds.  FF3/FF6 are analogues of our MS/US.  I decided simply to implement 
this feature (see patch 0001, I also can supply it in the separate patch).

But FF7-FF9 are questionable since the maximal supported precision is only 6.
They are optional by the standard:
 95) Specifications for Feature F555, “Enhanced seconds precision”:   d) Subclause 9.44, “Datetime templates”:     i) Without Feature F555, “Enhanced seconds precision”,        a <datetime template fraction> shall not be FF7, FF8 or FF9.

So I decided to allow FF7-FF9 only on the output in to_char().

3) The makefile rule for scan.o does this:

+# Latest flex causes warnings in this file.
+ifeq ($(GCC),yes)
+scan.o: CFLAGS += -Wno-error
+endif

That seems a bit ugly, and we should probably try to make it work with
the latest flex, instead of hiding the warnings. I don't think we have
any such ad-hoc rules in other Makefiles. If we really need it, can't we
make it part of configure, and/or restrict it depending on flex version?
These lines seem to belong to the earliest versions of our jsonpath
implementation. There is no scan.o file now at all, there is only 
jsonpath_scan.o, so I simply removed these lines.

4) There probably should be .gitignore rule for jsonpath_gram.h, just
like for other generated header files.
I see 3 rules in /src/backend/utils/adt/.gitignore:
/jsonpath_gram.h
/jsonpath_gram.c
/jsonpath_scan.c
5) jbvType says jbvDatetime is "virtual type" but does not explain what
it is. IMHO that deserves a comment or something.
"Virtual type" means here that it is used only for in-memory processing and 
converted into JSON string when outputted to jsonb.  Corresponding comment 
was added.

6) I see the JsonPath definition says this about header:
   /* just version, other bits are reservedfor future use */

but the very next thing it does is defining two pieces stored in the
header - version AND "lax" mode flag. Which makes the comment invalid
(also, note the missing space after "reserved").
Fixed.
FWIW, I'd use JSONPATH_STRICT instead of JSONPATH_LAX. The rest of the
codebase works with "strict" flags passed around, and it's easy to
forget to negate the flag somewhere (at least that's my experience).
Jsonpath lax/strict mode flag is used only in executeJsonPath() where it is 
saved  in "laxMode" field.  New "strict" flag passed to datetime functions 
is unrelated to jsonpath.

7) I see src/include/utils/jsonpath_json.h adds support for plain json
by undefining various jsonb macros and redirecting them to the json
variants. I find that rather suspicious - imagine you're investigating
something in code using those jsonb macros, and wondering why it ends up
calling the json stuff. I'd be pretty confused ...
I agree, this is rather simple but doubtful solution.  That's why json support
was in a separate patch until the 18th version of the patches.

But if we do not want to compile jsonpath.c twice with different definitions,
then we need some kind of run-time wrapping over json strings and jsonb
containers, which seems a bit harder to implement.

To simplify debugging I can also suggest to explicitly preprocess jsonpath.c
into jsonpath_json.c using redefinitions from jsonpath_json.h before its
compilation.
Also, some of the redefinitions are not really needed for example
JsonbWrapItemInArray and JsonbWrapItemsInArray are not used anywhere
(and neither are the json variants).
These definitions will be used in the next patches, so I removed them 
for now from this patch.
8) I see 0002 defines IsAJsonbScalar like this:
 #define IsAJsonbScalar(jsonbval)  (((jsonbval)->type >= jbvNull && \                                    (jsonbval)->type <= jbvBool) || \                                    (jsonbval)->type == jbvDatetime)

while 0004 does this
 #define jspIsScalar(type) ((type) >= jpiNull && (type) <= jpiBool)

I know those are for different data types (jsonb vs. jsonpath), but I
suppose jspIsScalar should include the datetime too.
jpiDatetime does not mean the same thing as jbvDatetime.
jpiDatetime is a representation of ".datetime()" item method.
jbvDatetime is a representation of datetime SQL/JSON items.

Also datetime SQL/JSON items can be represented in JSON path only by strings
with ".datetime()" method applied, they do not have their own literal:
'"2018-01-10".datetime()'   
FWIW JsonPathItemType would deserve better documentation what the
various items mean (a comment for each line would be enough). I've been
wondering if "jpiDouble" means scalar double value until I realized
there's a ".double()" function for paths.
I added per-item comments.

9) It's generally a good idea to make the individual pieces committable
separately, but that means e.g. the regression tests have to pass after
each patch. At the moment that does not seem to be the case for 0002,
see the attached file. I'm running with -DRANDOMIZE_ALLOCATED_MEMORY,
not sure if that's related.
This should definitely be a bug in json support, but I can't reproduce 
it simply by defining -DRANDOMIZE_ALLOCATED_MEMORY.  Could you provide 
a stack trace at least?


[1] https://github.com/obartunov/sqljsondoc/blob/master/README.jsonpath.md
[2] https://github.com/postgrespro/sqljson/tree/sqljson_doc

-- 
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
   
Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Pavel Stehule
Дата:
Сообщение: Re: PostgreSQL vs SQL/XML Standards
Следующее
От: Andrew Gierth
Дата:
Сообщение: Re: Optimizing nested ConvertRowtypeExpr execution