Re: JSONB spaces in text presentation

Поиск
Список
Период
Сортировка
От Ilya I. Ashchepkov
Тема Re: JSONB spaces in text presentation
Дата
Msg-id CALgmQD_hcSsHPukyyWo7T1wm6ty82FCGGCOSqZwvLSexE5OZPQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: JSONB spaces in text presentation  (Merlin Moncure <mmoncure@gmail.com>)
Список pgsql-general
IMHO, prettification is useful only for debugging.
It would be nice to have a session variable for the debug output with spaces, new lines and indentation.

On Wed, Sep 24, 2014 at 8:44 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
On Wed, Sep 24, 2014 at 2:44 AM, Ilya I. Ashchepkov <koctep@gmail.com> wrote:
> I'm sorry about sending email several times. I haven't understand, was it
> sent by gmail or not.
>
>
> On Wed, Sep 24, 2014 at 2:30 PM, John R Pierce <pierce@hogranch.com> wrote:
>>
>> On 9/24/2014 12:23 AM, Ilya I. Ashchepkov wrote:
>>>
>>>
>>> Is spaces is necessary in text presentation of JSONB?
>>> In my data resulting text contains ~12% of spaces.
>>
>>
>> can you show us an example of this?
>
>
> One record
> # select data from events.data limit 1;
> {"can": {"lls": {"1": 76.4}, "mhs": 4674.85, "rpm": 168.888, "speed": 74,
> "runned": 166855895, "fuel_consumption": 74213.5}, "crc": 10084, "gps": 1,
> "gsm": {"signal": 100}, "lls": {"1": 733, "2": 717}, "used": 19, "speed":
> 87.4, "valid": 1, "msg_id": 89, "runned": 72.75, "boot_no": 256, "digital":
> {"in": {"1": 1, "2": 0, "3": 0, "4": 0, "5": 0, "6": 0}, "out": {"1": 0,
> "2": 0}}, "visible": 20, "ignition": 1, "location": {"course": 265,
> "altitude": 143, "latitude": 55.127888997395836, "longitude":
> 80.8046142578125}, "protocol": 4, "coldstart": 1, "timesource": "terminal",
> "receiver_on": 1, "external_power": 28.07, "internal_power": 4.19}
>
> Whitespacis percents in this record:
> # select array_length(regexp_split_to_array(data::text, text ' '),
> 1)*100./length(data::text) from events.data limit 1;
>       ?column?
> ---------------------
>  12.3417721518987342
>
> Whitespace in test data
>  # select count(*),avg(array_length(regexp_split_to_array(data::text, text '
> '), 1)*100./length(data::text)) from events.data ;
>  count  |         avg
> --------+---------------------
>  242222 | 12.3649234646118312


For jsonb (unlike json), data is not actually stored as json but in a
binary format.  It will generally be much larger than the text
representation in fact but in exchange for that many operations will
be faster.  The spaces you see are generated when the jsonb type is
converted to text for output.  I actually think it's pretty reasonable
to want to redact all spaces from such objects in all cases where
converstion to text happens (output functions, xxxto_json, etc)
because ~12% savings are nothing to sneeze at when moving large
documents in and out of the database.

On the flip side, a more verbose prettification would be pretty nice
too.  I wonder if a hypothetical GUC is the best way to control this
behavior...

merlin



--
С уважением,
Ащепков Илья koctep@gmail.com

В списке pgsql-general по дате отправления:

Предыдущее
От: Daniel Lenski
Дата:
Сообщение: Why can't I select un-grouped columns when grouping by a (non-primary) unique key?
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Question about row_number() ordering semantics