[PATCH] Generalized JSON output functions

Поиск
Список
Период
Сортировка
От Shulgin, Oleksandr
Тема [PATCH] Generalized JSON output functions
Дата
Msg-id CACACo5TjBAHYRJJc-U788DnfXbbPasX_nGUHFd+=cehA6SRs-w@mail.gmail.com
обсуждение исходный текст
Ответы Re: [PATCH] Generalized JSON output functions  (Merlin Moncure <mmoncure@gmail.com>)
Re: [PATCH] Generalized JSON output functions  (Andrew Dunstan <andrew@dunslane.net>)
Список pgsql-hackers
Hi, Hackers!

Attached is a patch against master to generalize the JSON-producing functions in utils/adt/json.c and to provide a set of callbacks which can be overridden the same way that is already provided for *parsing* JSON.

The motivation behind this to be able to produce specially-crafted JSON in a logical replication output plugin, such that numeric (and bigint) values are quoted.  This requirement, in turn, arises from the fact that JavaScript specification, which is quite natural to expect as a consumer for this JSON data, allows to silently drop significant digits when converting from string to number object.

I believe this is a well-known problem and I'm aware of a number of tricks that might be used to avoid it, but none of them seems to be optimal from my standpoint.

I can also imagine this can be used to convert date/time to string differently, or adding indentation depending on the depth in object hierarchy, etc.

What this patch does apart from providing callbacks, is abstracting most of code for producing the correct JSON structure, which was previously scattered and repeated in a number of functions with slight differences.  In the current code there are 5 styles for producing JSON object string, differing in whitespace only:

a) no spaces

select to_json(row(1,2));
     to_json    
-----------------
 {"f1":1,"f2":2}

b) some spaces (hstore_to_json)

select hstore(row(1,2))::json;
         hstore        
------------------------
 {"f1": "1", "f2": "2"}

c) spaces around colon

select json_build_object('f1',1,'f2',2);
  json_build_object  
----------------------
 {"f1" : 1, "f2" : 2}

d) spaces around colon *and* curly braces

select json_object_agg(x,x) from unnest('{1,2}'::int[]) x;
   json_object_agg    
----------------------
 { "1" : 1, "2" : 2 }

e) line feeds (row_to_json_pretty)

select row_to_json(row(1,2), true) as row;
   row    
----------
 {"f1":1,+
  "f2":2}

Personally, I think we should stick to (b), however that would break a lot of test cases that already depend on (a).  I've tried hard to minimize the amount of changes in expected/json.out, but it is quickly becomes cumbersome trying to support all of the above formats.  So I've altered (c) and (d) to look like (b), naturally only whitespace was affected.

There's one corner case I don't see a sensible way to support:

 select json_agg(x) from generate_series(1,5) x;
    json_agg    
-----------------
 [1, 2, 3, 4, 5]

With the patch applied it puts line feeds between the array elements instead of spaces.

What also bothers me is that I've hard-coded output function oids for cstring_out, and textout on assumption that they never change, but would like to know that for sure.

Feedback is very much welcome!
--
Alex

PS: using a different email address this time, same Alex Shulgin. ;-)

PPS: sample code for mentioned use case with quoting numeric and bigint:

static void out_value_quote_numerics(JsonOutContext *out, Datum val,
                      JsonTypeCategory tcategory, Oid typoid, Oid outfuncoid,
                      bool key_scalar) {
    char *outputstr;

    if (typoid == INT8OID || typoid == NUMERICOID) {
        out->before_value(out);

        outputstr = OidOutputFunctionCall(outfuncoid, val);
        escape_json(&out->result, outputstr);
        pfree(outputstr);

        out->after_value(out, key_scalar);
    } else {
        json_out_value(out, val, tcategory, typoid, outfuncoid, key_scalar);
    }
}
...
json_out_init_context(out, JSON_OUT_USE_SPACES);
out->value = out_value_quote_numerics;

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Simon Riggs
Дата:
Сообщение: Re: a few thoughts on the schedule
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Change pg_cancel_*() to ignore current backend