Re: patch: Add JSON datatype to PostgreSQL (GSoC, WIP)

From Joseph Adams
Subject Re: patch: Add JSON datatype to PostgreSQL (GSoC, WIP)
Msg-id AANLkTi=2fhSXy5kKS8PEKkAC8G_dfuHGct7e=zaK6pFN@mail.gmail.com
In response to Re: patch: Add JSON datatype to PostgreSQL (GSoC, WIP)  (Itagaki Takahiro <itagaki.takahiro@gmail.com>)
Responses Re: patch: Add JSON datatype to PostgreSQL (GSoC, WIP)  (Itagaki Takahiro <itagaki.takahiro@gmail.com>)
List pgsql-hackers
On Fri, Sep 17, 2010 at 8:32 AM, Itagaki Takahiro
<itagaki.takahiro@gmail.com> wrote:
> On Fri, Aug 13, 2010 at 7:33 PM, Joseph Adams
> <joeyadams3.14159@gmail.com> wrote:
>> Updated patch:  the JSON code has all been moved into core, so this
>> patch is now for a built-in data type.
>
> I have a question about the design of the JSON type. Why do we need to
> store the value in UTF8 encoding? It's true the RFC of JSON says
> the encoding SHALL be encoded in Unicode, but I don't understand why
> we should reject other encodings.

Actually, the code in my original patch should work with any server
encoding in PostgreSQL.  However, internally, it operates in UTF-8 and
converts to/from the server encoding when necessary.  I did it this
way because the JSON code needs to handle Unicode escapes like
"\u266B", but there is no simple and efficient way (that I know of) to
convert single characters to/from the server encoding.
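
To make the idea concrete, here is a minimal Python sketch (not code from
either patch) of the approach described above: decode "\uXXXX" escapes in
Unicode, then convert the whole string to the target encoding in one step
instead of converting characters one at a time. The function names are
hypothetical, and Python codec names stand in for PostgreSQL server encodings.

```python
import re

# Matches any backslash escape; group 1 is set only for \uXXXX escapes.
_ESCAPE = re.compile(r'\\(?:u([0-9a-fA-F]{4})|(.))')


def decode_unicode_escapes(body):
    """Replace \\uXXXX escapes in a JSON string body with their characters.

    Other escapes (such as \\\\ or \\n) are left untouched in this sketch.
    """
    def replace(match):
        if match.group(1) is not None:
            return chr(int(match.group(1), 16))
        return match.group(0)

    decoded = _ESCAPE.sub(replace, body)
    # A code point above U+FFFF arrives as two escapes (a surrogate pair);
    # a UTF-16 round trip combines each pair into a single character.
    return decoded.encode('utf-16', 'surrogatepass').decode('utf-16')


def to_server_encoding(body, server_encoding):
    """Work in UTF-8 internally, then convert the whole string at once."""
    utf8 = decode_unicode_escapes(body).encode('utf-8')
    # Raises UnicodeEncodeError if the target encoding cannot represent
    # one of the characters.
    return utf8.decode('utf-8').encode(server_encoding)
```

For example, to_server_encoding(r'caf\u00e9', 'latin-1') yields the
single-byte Latin-1 form, while a character the target encoding lacks
raises an error rather than being silently dropped.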

I noticed that in your new patch, you sidestepped the encoding issue
by simply storing strings in their encoded form (right?).  This is
nice and simple, but in the future, JSON tree conversions and updates
will still need to deal with the encoding issue somehow.

> As I said before, I'd like to propose only 3 features in the commitfest:
>  * TYPE json data type
>  * text to json: FUNCTION json_parse(text)
>  * json to text: FUNCTION json_stringify(json, whitelist, space)

Although casting from JSON to TEXT does "stringify" it in my original
patch, I think json_stringify would be much more useful.  Besides
offering formatting options, it would insulate users from storage
changes: if the internal format of the JSON type changes and no longer
preserves the original formatting, then the result of the following
cast would change:

$$    "unnecessary\u0020escape" $$ :: JSON :: TEXT

json_stringify would be more predictable because it would re-encode
the whitespace (but not the \u0020, unless we went out of our way to
make it do that).
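
The three possible behaviors can be compared with a small Python sketch.
This is only an illustration under my own assumptions, not the behavior of
either patch: strip_insignificant_whitespace is a hypothetical helper, and
Python's json module stands in for a serializer that goes out of its way to
decode the escape.

```python
import json


def strip_insignificant_whitespace(text):
    """Remove whitespace outside JSON string literals; keep strings verbatim."""
    out = []
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False      # character after a backslash is literal
            elif ch == '\\':
                escaped = True
            elif ch == '"':
                in_string = False    # unescaped quote closes the string
        elif ch == '"':
            in_string = True
            out.append(ch)
        elif ch not in ' \t\r\n':
            out.append(ch)           # structural characters and literals
    return ''.join(out)


original = '    "unnecessary\\u0020escape" '

preserved = original                                   # format-preserving cast
normalized = strip_insignificant_whitespace(original)  # whitespace re-encoded only
reparsed = json.dumps(json.loads(original))            # escape decoded as well
```

Here normalized keeps the \u0020 escape but drops the surrounding
whitespace, whereas reparsed also turns the escape into a plain space.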

Also, json_parse is "unnecessary" if you allow casting from TEXT to
JSON (which my patch does), but, like you, I think having json_parse
would be more intuitive.

Long story short: I like it :-)  If you're keeping track, features
from my patch not in the new code yet are:

 * Programmatically validating JSON ( json_validate() )
 * Getting the type of a JSON value ( json_type() )
 * Converting scalar values to/from JSON
 * Converting arrays to JSON
 * JSONPath
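
As a rough illustration of what the first two features do, here is a Python
sketch built on the standard json module. It is not the patch's
implementation: the signatures and the type names returned by json_type are
my assumptions for this example.

```python
import json


def _reject_constant(name):
    # Python's parser accepts NaN and Infinity, which are not valid JSON.
    raise ValueError('invalid JSON constant: ' + name)


def json_validate(text):
    """Return True if text is well-formed JSON, False otherwise."""
    try:
        json.loads(text, parse_constant=_reject_constant)
    except ValueError:
        return False
    return True


def json_type(text):
    """Return the type name of the top-level JSON value in text."""
    value = json.loads(text, parse_constant=_reject_constant)
    if value is None:
        return 'null'
    if isinstance(value, bool):   # test before numbers: bool subclasses int
        return 'bool'
    if isinstance(value, (int, float)):
        return 'number'
    if isinstance(value, str):
        return 'string'
    if isinstance(value, list):
        return 'array'
    return 'object'
```

With these definitions, json_validate('{a: 1}') is false because the key is
unquoted, and json_type('[1, 2]') reports an array.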
 


> JSONPath will be re-implemented on the basic functionalities in the
> subsequent commitfest. Do you have a plan to split your patch?
> Or, can I continue to develop my patch? If so, JSONPath needs
> to be adjusted to the new infrastructure.

I think your patch is on a better footing than mine, so maybe I should
start contributing to your code rather than the other way around.
Before the next commitfest, I could merge the testcases from my patch
in and identify parsing discrepancies (if any).  Afterward, I could
help merge the other features into the new JSON infrastructure.

I can't compile your initial patch against the latest checkout because
json_parser.h and json_scanner.h are missing.  Is there a more recent
patch, or could you update the patch so it compiles?   I'd like to
start tinkering with the new code.  Thanks!


Joey Adams

