Re: Using JSONB directly from application

Поиск
Список
Период
Сортировка
От Christian Ohler
Тема Re: Using JSONB directly from application
Дата
Msg-id CAOsiKEL7+KkV0C_hAJWxqwTg+PYVfiGPQ0yjFww7ECtqwBjb+Q@mail.gmail.com
обсуждение исходный текст
Ответ на Using JSONB directly from application  (Anthony Communier <anthony.communier@gmail.com>)
Ответы Re: Using JSONB directly from application  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Список pgsql-hackers
(continuing an old thread from https://www.postgresql.org/message-id/CAMsr%2BYEtamQYZ5EocsuthQCvyvmRnQrucDP6GZynPtf0gsMbuw%40mail.gmail.com )

Craig Ringer <craig(at)2ndquadrant(dot)com>, 2018-02-26:

> On 26 February 2018 at 04:05, Anthony Communier <anthony(dot)communier(at)gmail(dot)com> wrote:

> > It would be nice if application connected to a Postrgesql database could
> > send and receive JSONB in binary. It could save some useless text
> > conversion. All works could be done on application side which are often
> > more scalable than database itself.
>
> To support this, you'd need to extract PostgreSQL's jsonb support into a C
> library that could be used independently of backend server infrastructure
> like 'palloc' and memory contexts, ereport(), etc. Or write a parallel
> implementation.

At Shift, we also have use cases that would likely be sped up quite a bit if we could avoid the conversion from JSONB to JSON, and instead pass binary JSONB to the application side and parse it there (in Go).  I doubt we'd want to reuse any of Postgres's C code, and would instead go with your "parallel implementation" idea; I can't imagine it being particularly difficult to implement a JSONB parser from scratch.

All we need, I think, is a Postgres function raw_jsonb(jsonb) that returns bytea but is the identity function at the byte level.  (Or allow a cast from jsonb to bytea.)

Our Go code would then send queries like SELECT col1, col2, raw_jsonb(col3) FROM table1 WHERE ...; I haven't thought in depth about how we'd parse the JSONB in Go, but perhaps we can synthesize a stream of JSON tokens from the binary JSONB (one token at a time, to avoid copies and allocations) and adapt the streaming parser https://github.com/json-iterator/go to turn it into Go values.

Sending raw JSONB to Postgres might also be interesting, but I'd start with receiving.

Would implementing raw_jsonb be as trivial as it sounds?  What about usages like SELECT raw_jsonb(col3->'foo'); does the subobject returned by '->' share structure with the containing object, making the conversion to a self-contained JSONB value less direct?

Can these conversions be implemented without copying the bytes?


An open question about the API contract would be how raw_jsonb would be affected if Postgres introduces a version 2 of JSONB encoding.  My intuition is to punt that problem to the application, and define that raw_jsonb returns whatever version of JSONB is most convenient for Postgres for that particular datum; this minimizes conversion work on the Postgres side, which is the purpose of the mechanism.  Applications that want a stable format can use the conventional textual JSON format.  But I could see a case for making the function raw_jsonb(int, jsonb) and allowing the caller to specify what (maximum?) version of JSONB they want.

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Alvaro Herrera
Дата:
Сообщение: Re: bug with expression index on partition
Следующее
От: David Rowley
Дата:
Сообщение: Re: Speeding up INSERTs and UPDATEs to partitioned tables