Обсуждение: Best way to populate nested composite type from JSON`

Поиск
Список
Период
Сортировка

Best way to populate nested composite type from JSON`

От
Chris Travers
Дата:
Hi everyone;

I have been starting to experiment with the JSON accessors in 9.3.  I immediately found that my preferred use, populating nested composite types, is not supported.  Also of course one cannot manipulate JSON nodes, which leads to the question of how best to do this.  I had some ideas:

1.  A custom C function utilizing the internal functions PostgreSQL offers for JSON parsing (probably looking at the json_populate_record() as a base)

2.  Utilizing hstore as an intermediate representation along with json_each()

3.  Perhaps using something like pl/perlu or the like.

This leads to a few questions, including whether it would be worthwhile looking at trying to handle nested types in json_populate_record() and if anyone is working on such a thing currently.

--
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor lock-in.

Re: Best way to populate nested composite type from JSON`

От
Merlin Moncure
Дата:
On Wed, Sep 11, 2013 at 11:55 PM, Chris Travers <chris.travers@gmail.com> wrote:
> Hi everyone;
>
> I have been starting to experiment with the JSON accessors in 9.3.  I
> immediately found that my preferred use, populating nested composite types,
> is not supported.  Also of course one cannot manipulate JSON nodes, which
> leads to the question of how best to do this.  I had some ideas:
>
> 1.  A custom C function utilizing the internal functions PostgreSQL offers
> for JSON parsing (probably looking at the json_populate_record() as a base)
>
> 2.  Utilizing hstore as an intermediate representation along with
> json_each()
>
> 3.  Perhaps using something like pl/perlu or the like.
>
> This leads to a few questions, including whether it would be worthwhile
> looking at trying to handle nested types in json_populate_record() and if
> anyone is working on such a thing currently.

Can we see a hypothetical example?  json best practices for me are to
use standard tables and than serialize/deserialize json as it goes
through the door and not use nested composite types in your actual
table definitions.  I think it should all fit if you do it right: you
json_[each/array_elements] the wrapping json then insert it into
un-nested actual tables.  If you must have a final destination as a
nested composite, I think at the worst case you might have to make
some transition composites...

merlin


Re: Best way to populate nested composite type from JSON`

От
Chris Travers
Дата:



On Thu, Sep 12, 2013 at 8:47 AM, Merlin Moncure <mmoncure@gmail.com> wrote:


Can we see a hypothetical example?  json best practices for me are to
use standard tables and than serialize/deserialize json as it goes
through the door and not use nested composite types in your actual
table definitions.  I think it should all fit if you do it right: you
json_[each/array_elements] the wrapping json then insert it into
un-nested actual tables.  If you must have a final destination as a
nested composite, I think at the worst case you might have to make
some transition composites...

Here is what I am thinking (roughly):

CREATE TYPE trans_line (
     description text,
     amount numeric,
     account_id int
);

CREATE TYPE journal_entry (
      reference text,
      description text,
      post_date date,
      lines trans_line[]
);

CREATE FUNCTION save(journal_entry) RETURNS .... LANGUAGE PLPGSQL AS
$$
....
$$;

What I would like to be able to do is accept that complex data type in as a JSON object and convert it to the record format.  The data could then be broken apart, inserted checked for validity (make sure the transaction is balanced etc) and then inserted into the correct tables.  My idea was to create a json::journal_entry cast and use that.  Otherwise I get to have fun with row and array constructors and system tables.

However one can't pass nested JSON objects through json_populate_record().  My question is basically how to get around that.

Best wishes,
Chris Travers

merlin



--
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor lock-in.

Re: Best way to populate nested composite type from JSON`

От
Merlin Moncure
Дата:
On Thu, Sep 12, 2013 at 7:25 PM, Chris Travers <chris.travers@gmail.com> wrote:
>
>
>
> On Thu, Sep 12, 2013 at 8:47 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
>>
>>
>>
>> Can we see a hypothetical example?  json best practices for me are to
>> use standard tables and than serialize/deserialize json as it goes
>> through the door and not use nested composite types in your actual
>> table definitions.  I think it should all fit if you do it right: you
>> json_[each/array_elements] the wrapping json then insert it into
>> un-nested actual tables.  If you must have a final destination as a
>> nested composite, I think at the worst case you might have to make
>> some transition composites...
>
>
> Here is what I am thinking (roughly):
>
> CREATE TYPE trans_line (
>      description text,
>      amount numeric,
>      account_id int
> );
>
> CREATE TYPE journal_entry (
>       reference text,
>       description text,
>       post_date date,
>       lines trans_line[]
> );
>
> CREATE FUNCTION save(journal_entry) RETURNS .... LANGUAGE PLPGSQL AS
> $$
> ....
> $$;
>
> What I would like to be able to do is accept that complex data type in as a
> JSON object and convert it to the record format.  The data could then be
> broken apart, inserted checked for validity (make sure the transaction is
> balanced etc) and then inserted into the correct tables.  My idea was to
> create a json::journal_entry cast and use that.  Otherwise I get to have fun
> with row and array constructors and system tables.
>
> However one can't pass nested JSON objects through json_populate_record().
> My question is basically how to get around that.

It would certainly be better if we could straight up deserialize json
into a nested structure.  For now, my advise is to try and structure
your json and the receiving types/tables to not be nested.  Using your
example, I was able to do that by breaking the object in the json and
eploiting hte 'best effort' json->object mapping (if you wanted to
support receiving more than one entry at a time, you would need to
include relation information to hook journal_entry to trans_lines.

First, let's state the problem:
CREATE TYPE trans_line AS (
     description text,
     amount numeric,
     account_id int
);

CREATE TYPE journal_entry AS(
      reference text,
      description text,
      post_date date,
      lines trans_line[]
);

/* make some test data */
postgres=# SELECT row('a', 'b', now(), array[row('c', 1.0, 2),
row('d', 3.0, 4)]::trans_line[])::journal_entry;
                       row
--------------------------------------------------
 (a,b,2013-09-13,"{""(c,1.0,2)"",""(d,3.0,4)""}")


/* uh oh */
postgres=# select json_populate_record(null::journal_entry,
row_to_json(row('a', 'b', now(), array[row('c', 1.0, 2), row('d', 3.0,
4)]::trans_line[])::journal_entry));
ERROR:  cannot call json_populate_record on a nested object

/* json -> record mapping is 'best effort' */
postgres=# select json_populate_record(null::journal_entry, '{}');
 json_populate_record
----------------------
 (,,,)

/* so, push the entry and the lines into sub-elements of a wrapping
object, de-serialize separately and map back together. */


WITH in_json AS
(
  SELECT row_to_json(q) AS data
  FROM
  (
    SELECT
      q::journal_entry AS journal_entries,
      ARRAY[ROW('c', 1.0, 2), ROW('d', 3.0, 4)]::trans_line[] AS trans_lines
    FROM
    (
      SELECT
        'a',
        'b',
        now(),
        null::trans_line[]
    ) q
  ) q
),
je AS
(
  SELECT
    q.*
  FROM in_json
  CROSS JOIN LATERAL
    json_populate_record(null::journal_entry, data->'journal_entries') q
),
tl AS
(
  SELECT
    q.*
  FROM in_json
  CROSS JOIN LATERAL
    json_populate_recordset(null::trans_line, data->'trans_lines') q
)
SELECT je.*, ARRAY(SELECT tl FROM tl) AS lines2 FROM je;

 reference | description | post_date  | lines |          lines2
-----------+-------------+------------+-------+---------------------------
 a         | b           | 2013-09-13 |       | {"(c,1.0,2)","(d,3.0,4)"}

that's somewhat tedious, but not too bad I think (the query above
looks longer than it would be in practice since the json construction
would presumably be on the client).  But the basic M.O. is to send
lists of records back to the server and relate them on the fly.

merlin


Re: Best way to populate nested composite type from JSON`

От
Chris Travers
Дата:



On Fri, Sep 13, 2013 at 6:37 AM, Merlin Moncure <mmoncure@gmail.com> wrote:


It would certainly be better if we could straight up deserialize json
into a nested structure.  For now, my advise is to try and structure
your json and the receiving types/tables to not be nested.  Using your
example, I was able to do that by breaking the object in the json and
eploiting hte 'best effort' json->object mapping (if you wanted to
support receiving more than one entry at a time, you would need to
include relation information to hook journal_entry to trans_lines.


First, let's state the problem:
CREATE TYPE trans_line AS (
     description text,
     amount numeric,
     account_id int
);

CREATE TYPE journal_entry AS(
      reference text,
      description text,
      post_date date,
      lines trans_line[]
);

/* make some test data */
postgres=# SELECT row('a', 'b', now(), array[row('c', 1.0, 2),
row('d', 3.0, 4)]::trans_line[])::journal_entry;
                       row
--------------------------------------------------
 (a,b,2013-09-13,"{""(c,1.0,2)"",""(d,3.0,4)""}")


/* uh oh */
postgres=# select json_populate_record(null::journal_entry,
row_to_json(row('a', 'b', now(), array[row('c', 1.0, 2), row('d', 3.0,
4)]::trans_line[])::journal_entry));
ERROR:  cannot call json_populate_record on a nested object

/* json -> record mapping is 'best effort' */
postgres=# select json_populate_record(null::journal_entry, '{}');
 json_populate_record
----------------------
 (,,,)

Right.  My first thinking was to use json_each and hstore to do an initial filter and processing of the json object. In essence I should be able to take a json object, break it apart into pieces, filter, reassemble as hstore, and then cast to json.  The result is that the nested portions could be filtered out and processed separately.  The casting function could then have some knowledge of which elements might need to be nested and how deeply. 

/* so, push the entry and the lines into sub-elements of a wrapping
object, de-serialize separately and map back together. */


WITH in_json AS
(
  SELECT row_to_json(q) AS data
  FROM
  (
    SELECT
      q::journal_entry AS journal_entries,
      ARRAY[ROW('c', 1.0, 2), ROW('d', 3.0, 4)]::trans_line[] AS trans_lines
    FROM
    (
      SELECT
        'a',
        'b',
        now(),
        null::trans_line[]
    ) q
  ) q
),
je AS
(
  SELECT
    q.*
  FROM in_json
  CROSS JOIN LATERAL
    json_populate_record(null::journal_entry, data->'journal_entries') q
),
tl AS
(
  SELECT
    q.*
  FROM in_json
  CROSS JOIN LATERAL
    json_populate_recordset(null::trans_line, data->'trans_lines') q
)
SELECT je.*, ARRAY(SELECT tl FROM tl) AS lines2 FROM je;

 reference | description | post_date  | lines |          lines2
-----------+-------------+------------+-------+---------------------------
 a         | b           | 2013-09-13 |       | {"(c,1.0,2)","(d,3.0,4)"}

that's somewhat tedious, but not too bad I think (the query above
looks longer than it would be in practice since the json construction
would presumably be on the client).  But the basic M.O. is to send
lists of records back to the server and relate them on the fly.

Ok, but what you are looking at there is structuring your JSON such that every branch is equally nested. What I am trying to do is have composite types which can be detected and used by code generators to generate client-side data objects.  Having a requirement like this is a bit more complex and makes the data structures a bit less natural so I guess the next question is how to try to simply isolate and override these in the JSON itself.  The simple solution might be to use json_each() I suppose.



merlin



--
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor lock-in.

Re: Best way to populate nested composite type from JSON`

От
Merlin Moncure
Дата:
On Fri, Sep 13, 2013 at 9:30 AM, Chris Travers <chris.travers@gmail.com> wrote:
>
>
>
> On Fri, Sep 13, 2013 at 6:37 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
>>
>>
>>
>> It would certainly be better if we could straight up deserialize json
>> into a nested structure.  For now, my advise is to try and structure
>> your json and the receiving types/tables to not be nested.  Using your
>> example, I was able to do that by breaking the object in the json and
>> eploiting hte 'best effort' json->object mapping (if you wanted to
>> support receiving more than one entry at a time, you would need to
>> include relation information to hook journal_entry to trans_lines.
>
>
>>
>> First, let's state the problem:
>> CREATE TYPE trans_line AS (
>>      description text,
>>      amount numeric,
>>      account_id int
>> );
>>
>> CREATE TYPE journal_entry AS(
>>       reference text,
>>       description text,
>>       post_date date,
>>       lines trans_line[]
>> );
>>
>> /* make some test data */
>> postgres=# SELECT row('a', 'b', now(), array[row('c', 1.0, 2),
>> row('d', 3.0, 4)]::trans_line[])::journal_entry;
>>                        row
>> --------------------------------------------------
>>  (a,b,2013-09-13,"{""(c,1.0,2)"",""(d,3.0,4)""}")
>>
>>
>> /* uh oh */
>> postgres=# select json_populate_record(null::journal_entry,
>> row_to_json(row('a', 'b', now(), array[row('c', 1.0, 2), row('d', 3.0,
>> 4)]::trans_line[])::journal_entry));
>> ERROR:  cannot call json_populate_record on a nested object
>>
>> /* json -> record mapping is 'best effort' */
>> postgres=# select json_populate_record(null::journal_entry, '{}');
>>  json_populate_record
>> ----------------------
>>  (,,,)
>
>
> Right.  My first thinking was to use json_each and hstore to do an initial
> filter and processing of the json object. In essence I should be able to
> take a json object, break it apart into pieces, filter, reassemble as
> hstore, and then cast to json.  The result is that the nested portions could
> be filtered out and processed separately.  The casting function could then
> have some knowledge of which elements might need to be nested and how
> deeply.
>>
>>
>> /* so, push the entry and the lines into sub-elements of a wrapping
>> object, de-serialize separately and map back together. */
>>
>>
>> WITH in_json AS
>> (
>>   SELECT row_to_json(q) AS data
>>   FROM
>>   (
>>     SELECT
>>       q::journal_entry AS journal_entries,
>>       ARRAY[ROW('c', 1.0, 2), ROW('d', 3.0, 4)]::trans_line[] AS
>> trans_lines
>>     FROM
>>     (
>>       SELECT
>>         'a',
>>         'b',
>>         now(),
>>         null::trans_line[]
>>     ) q
>>   ) q
>> ),
>> je AS
>> (
>>   SELECT
>>     q.*
>>   FROM in_json
>>   CROSS JOIN LATERAL
>>     json_populate_record(null::journal_entry, data->'journal_entries') q
>> ),
>> tl AS
>> (
>>   SELECT
>>     q.*
>>   FROM in_json
>>   CROSS JOIN LATERAL
>>     json_populate_recordset(null::trans_line, data->'trans_lines') q
>> )
>> SELECT je.*, ARRAY(SELECT tl FROM tl) AS lines2 FROM je;
>>
>>  reference | description | post_date  | lines |          lines2
>> -----------+-------------+------------+-------+---------------------------
>>  a         | b           | 2013-09-13 |       | {"(c,1.0,2)","(d,3.0,4)"}
>>
>> that's somewhat tedious, but not too bad I think (the query above
>> looks longer than it would be in practice since the json construction
>> would presumably be on the client).  But the basic M.O. is to send
>> lists of records back to the server and relate them on the fly.
>
>
> Ok, but what you are looking at there is structuring your JSON such that
> every branch is equally nested. What I am trying to do is have composite
> types which can be detected and used by code generators to generate
> client-side data objects.  Having a requirement like this is a bit more
> complex and makes the data structures a bit less natural so I guess the next
> question is how to try to simply isolate and override these in the JSON
> itself.  The simple solution might be to use json_each() I suppose.

Yup.  As things stand currently, it's better *not* to make
serialization-driving composite types which when learning the json
stuff I did heavily; it was a habit I learned (and had to unlearn)
from libpqtypes which solves a lot of the same problems for C clients.
 You can *almost* send json back to the database with the same
elegance as it can be sent out of the database.  I see a big future
for postgres-json with many roadmap improvements; it's a complete game
change for service call driven database interaction and forward
thinking developers should realize it essentially obsoletes ORMS.

merlin


Re: Best way to populate nested composite type from JSON`

От
Chris Travers
Дата:



On Fri, Sep 13, 2013 at 7:58 AM, Merlin Moncure <mmoncure@gmail.com> wrote:

 
Yup.  As things stand currently, it's better *not* to make
serialization-driving composite types which when learning the json
stuff I did heavily; it was a habit I learned (and had to unlearn)
from libpqtypes which solves a lot of the same problems for C clients.
 You can *almost* send json back to the database with the same
elegance as it can be sent out of the database.  I see a big future
for postgres-json with many roadmap improvements; it's a complete game
change for service call driven database interaction and forward
thinking developers should realize it essentially obsoletes ORMS.

merlin


In your view, would it be better to serialize in the pseudo-nested-csv (native row/array) formats?

--
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor lock-in.

Re: Best way to populate nested composite type from JSON`

От
Merlin Moncure
Дата:
On Fri, Sep 13, 2013 at 10:08 AM, Chris Travers <chris.travers@gmail.com> wrote:
> On Fri, Sep 13, 2013 at 7:58 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> Yup.  As things stand currently, it's better *not* to make
>> serialization-driving composite types which when learning the json
>> stuff I did heavily; it was a habit I learned (and had to unlearn)
>> from libpqtypes which solves a lot of the same problems for C clients.
>>  You can *almost* send json back to the database with the same
>> elegance as it can be sent out of the database.  I see a big future
>> for postgres-json with many roadmap improvements; it's a complete game
>> change for service call driven database interaction and forward
>> thinking developers should realize it essentially obsoletes ORMS.
>
>
>
> In your view, would it be better to serialize in the pseudo-nested-csv
> (native row/array) formats?

What's your client side stack?

merlin


Re: Best way to populate nested composite type from JSON`

От
Chris Travers
Дата:



On Fri, Sep 13, 2013 at 8:51 AM, Merlin Moncure <mmoncure@gmail.com> wrote:


What's your client side stack?

merlin
 
Right now we are using something a little lighter weight in terms db discovery but it doesn't handle this situation very well.  I am the author of the PGObject space on CPAN and currently we are using what was the ancestor of PGObject::Simple::Role which does discovery based on function names and argument names.

The PGObject namespace is basically a namespace for database-driven object oriented mapping systems.  The idea is that you can have an object model defined in the database (and with data logic coded in SQL), which gets discovered and used by the application at run-time.  It is PG-only of course.

Unfortunately the problem is that once you go to any significantly complex data structure, my current discovery efforts fall apart.   This means that the application/db contract becomes more brittle and fewer things can change without having to fix both sides.  For example, currently, for processing payments we have two different implementations which take two different ways of trying to solve this problem and neither tolerates any semantic change to the arguments.  Currently these are among our least maintainable stored procedures.

So my effort is in building a new service locator module (probably something like PGObject::CompositeType) that would pull in properties of a composite type as properties of an object model, and also create methods dynamically based on functions taking that type as a first argument.  The only real problem is serializing an arbitrarily complex composite type to the backend for processing.  If JSON worked it would save me a lot on the client side.  But if not....  The key requirement is that the data structures on the client would be pulled from named composite types on the server.

--
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor lock-in.

Re: Best way to populate nested composite type from JSON`

От
Merlin Moncure
Дата:
On Fri, Sep 13, 2013 at 7:52 PM, Chris Travers <chris.travers@gmail.com> wrote:
> On Fri, Sep 13, 2013 at 8:51 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> What's your client side stack?
>>
>> merlin
>
>
> Right now we are using something a little lighter weight in terms db
> discovery but it doesn't handle this situation very well.  I am the author
> of the PGObject space on CPAN and currently we are using what was the
> ancestor of PGObject::Simple::Role which does discovery based on function
> names and argument names.
>
> The PGObject namespace is basically a namespace for database-driven object
> oriented mapping systems.  The idea is that you can have an object model
> defined in the database (and with data logic coded in SQL), which gets
> discovered and used by the application at run-time.  It is PG-only of
> course.
>
> Unfortunately the problem is that once you go to any significantly complex
> data structure, my current discovery efforts fall apart.   This means that
> the application/db contract becomes more brittle and fewer things can change
> without having to fix both sides.  For example, currently, for processing
> payments we have two different implementations which take two different ways
> of trying to solve this problem and neither tolerates any semantic change to
> the arguments.  Currently these are among our least maintainable stored
> procedures.
>
> So my effort is in building a new service locator module (probably something
> like PGObject::CompositeType) that would pull in properties of a composite
> type as properties of an object model, and also create methods dynamically
> based on functions taking that type as a first argument.  The only real
> problem is serializing an arbitrarily complex composite type to the backend
> for processing.  If JSON worked it would save me a lot on the client side.
> But if not....  The key requirement is that the data structures on the
> client would be pulled from named composite types on the server.

Huh. Interesting factoid:  I cowrote (see:
http://libpqtypes.esilo.com/man3/pqt-composites.html) what remains
today the only effective way to send complex types to the server.  For
non-'C' languages, json support will probably be the way to go once
the server can handle json deserialization better.

Right now whenever possible I'm deploying json directly to javascript
heavy applications with ultrathin middlewares with most of the
interesting backend processing (as you envisage) happening in the
database; It's powerful and it works.

merlin