Обсуждение: [HACKERS] Jsonb transform for pl/python

Поиск
Список
Период
Сортировка

[HACKERS] Jsonb transform for pl/python

От
Anthony Bykov
Дата:
Hi.
I've implemented jsonb transform
(https://www.postgresql.org/docs/9.5/static/sql-createtransform.html)
for pl/python. 

1. '{"1":1}'::jsonb is transformed into dict {"1"=>1}, while
'["1",2]'::jsonb is transformed into list(not tuple!) ["1", 2]

2. If there is a numeric value appear in jsonb, it will be transformed
to decimal through string (Numeric->String->Decimal). Not the best
solution, but as far as I understand this is usual practise in
postgresql to serialize Numerics and de-serialize them.

3. Decimal is transformed into jsonb through string
(Decimal->String->Numeric).

An example may also be helpful to understand extension. So, as an
example, function "test" transforms incoming jsonb into python,
transforms it back into jsonb and returns it.

create extension jsonb_plpython2u cascade;

create or replace function test(val jsonb)
returns jsonb
transform for type jsonb
language plpython2u
as $$
return (val);
$$;

select test('{"1":1,"example": null}'::jsonb);

--
Anthony Bykov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения

Re: [HACKERS] Jsonb transform for pl/python

От
David Fetter
Дата:
On Wed, Oct 25, 2017 at 02:51:00PM +0300, Anthony Bykov wrote:
> Hi.
> I've implemented jsonb transform
> (https://www.postgresql.org/docs/9.5/static/sql-createtransform.html)
> for pl/python. 
> 
> 1. '{"1":1}'::jsonb is transformed into dict {"1"=>1}, while
> '["1",2]'::jsonb is transformed into list(not tuple!) ["1", 2]
> 
> 2. If there is a numeric value appear in jsonb, it will be transformed
> to decimal through string (Numeric->String->Decimal). Not the best
> solution, but as far as I understand this is usual practise in
> postgresql to serialize Numerics and de-serialize them.
> 
> 3. Decimal is transformed into jsonb through string
> (Decimal->String->Numeric).
> 
> An example may also be helpful to understand extension. So, as an
> example, function "test" transforms incoming jsonb into python,
> transforms it back into jsonb and returns it.
> 
> create extension jsonb_plpython2u cascade;

Thanks for your hard work!

Should there also be one for PL/Python3U?

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Jsonb transform for pl/python

От
Anthony Bykov
Дата:
On Sun, 29 Oct 2017 19:11:02 +0100
David Fetter <david@fetter.org> wrote:

> Thanks for your hard work!
> 
> Should there also be one for PL/Python3U?
> 
> Best,
> David.
Hi.
Actually, there is one for PL/Python3U. This patch contains following
extensions:
jsonb_plpythonu
jsonb_plpython2u
jsonb_plpython3u
"make install" checks which python major version was your postgresql
configured with and installs corresponding extension.

--
Anthony Bykov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Jsonb transform for pl/python

От
David Fetter
Дата:
On Mon, Oct 30, 2017 at 11:15:00AM +0300, Anthony Bykov wrote:
> On Sun, 29 Oct 2017 19:11:02 +0100
> David Fetter <david@fetter.org> wrote:
> 
> > Thanks for your hard work!
> > 
> > Should there also be one for PL/Python3U?
> > 
> > Best,
> > David.
> Hi.
> Actually, there is one for PL/Python3U. This patch contains following
> extensions:
> jsonb_plpythonu
> jsonb_plpython2u
> jsonb_plpython3u
> "make install" checks which python major version was your postgresql
> configured with and installs corresponding extension.

My mistake.  Sorry about the noise.

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Jsonb transform for pl/python

От
Aleksander Alekseev
Дата:
The following review has been posted through the commitfest application:
make installcheck-world:  tested, failed
Implements feature:       tested, passed
Spec compliant:           tested, passed
Documentation:            tested, passed

Hello Anthony,

Great job!

I decided to take a closer look on your patch. Here are some defects I 
discovered.

> +   Additional extensions are available that implement transforms for
> +   the <type>jsonb</type> type for the language PL/Python.  The
> +   extensions for PL/Perl are called

1. The part regarding PL/Perl is obviously from another patch.

2. jsonb_plpython2u and jsonb_plpythonu are marked as relocatable, while 
jsonb_plpython3u is not. Is it a mistake? Anyway if an extension is relocatable
there should be a test that checks this.

3. Not all json types are test-covered. Tests for 'true' :: jsonb, '3.14' :: 
jsonb and 'null' :: jsonb are missing.

4. jsonb_plpython.c:133 - "Iterate trhrough Jsonb object." Typo, it should be 
"through" or probably even "over".

5. It looks like you've implemented transform in two directions Python <-> 
JSONB, however I see tests only for Python <- JSONB case.

6. Tests passed on Python 2.7.14 but failed on 3.6.2:

>  CREATE EXTENSION jsonb_plpython3u CASCADE;
> + ERROR:  could not access file "$libdir/jsonb_plpython3": No such file or
> directory

module_pathname in jsonb_plpython3u.control should be $libdir/jsonb_plpython3u,
not $libdir/jsonb_plpython3.

Tested on Arch Linux x64, GCC 7.2.0.

The new status of this patch is: Waiting on Author

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Jsonb transform for pl/python

От
Anthony Bykov
Дата:
On Thu, 09 Nov 2017 12:26:46 +0000
Aleksander Alekseev <a.alekseev@postgrespro.ru> wrote:

> The following review has been posted through the commitfest
> application: make installcheck-world:  tested, failed
> Implements feature:       tested, passed
> Spec compliant:           tested, passed
> Documentation:            tested, passed
> 
> Hello Anthony,
> 
> Great job!
> 
> I decided to take a closer look on your patch. Here are some defects
> I discovered.
> 
> > +   Additional extensions are available that implement transforms
> > for
> > +   the <type>jsonb</type> type for the language PL/Python.  The
> > +   extensions for PL/Perl are called  
> 
> 1. The part regarding PL/Perl is obviously from another patch.
> 
> 2. jsonb_plpython2u and jsonb_plpythonu are marked as relocatable,
> while jsonb_plpython3u is not. Is it a mistake? Anyway if an
> extension is relocatable there should be a test that checks this.
> 
> 3. Not all json types are test-covered. Tests for 'true' :: jsonb,
> '3.14' :: jsonb and 'null' :: jsonb are missing.
> 
> 4. jsonb_plpython.c:133 - "Iterate trhrough Jsonb object." Typo, it
> should be "through" or probably even "over".
> 
> 5. It looks like you've implemented transform in two directions
> Python <-> JSONB, however I see tests only for Python <- JSONB case.
> 
> 6. Tests passed on Python 2.7.14 but failed on 3.6.2:
> 
> >  CREATE EXTENSION jsonb_plpython3u CASCADE;
> > + ERROR:  could not access file "$libdir/jsonb_plpython3": No such
> > file or directory  
> 
> module_pathname in jsonb_plpython3u.control should be
> $libdir/jsonb_plpython3u, not $libdir/jsonb_plpython3.
> 
> Tested on Arch Linux x64, GCC 7.2.0.
> 
> The new status of this patch is: Waiting on Author
> 

Hello, Aleksander. 
Thank you for your time. The defects you have noticed were fixed.
Please, find in attachments new version of the patch (it is called
0001-jsonb_plpython-extension-v2.patch).

Most of changes were made to fix defects(list of the defects may be
found in citation in the beginning of this message), but the algorithm
of iterating through incoming jsonb was changed so that it looks tidier.

--
Anthony Bykov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения

Re: Jsonb transform for pl/python

От
Aleksander Alekseev
Дата:
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:       tested, passed
Spec compliant:           tested, passed
Documentation:            tested, passed

Hi Anthony,

Thank you for the new version of the patch! Here is my code review.

1. In jsonb_plpython2.out:

+CREATE FUNCTION back(val jsonb) RETURNS jsonb
+LANGUAGE plpython2u
+TRANSFORM FOR TYPE jsonb
+as $$
+return val
+$$;
+SELECT back('null'::jsonb);
+  back  
+--------
+ [null]
+(1 row)
+
+SELECT back('1'::jsonb);
+ back 
+------
+ [1]
+(1 row)
+
+SELECT back('true'::jsonb);
+  back  
+--------
+ [true]
+(1 row)

Maybe I'm missing something, but why exactly all JSONB values turn into arrays?

2. Could you please also add tests for some negative and real numbers? Also
could you check that your code handles numbers like MAX_INT, MIN_INT, +/-
infinity and NaN properly in both (Python <-> JSONB) directions?

3. Handling unicode strings properly is another thing that is worth checking.

4. I think we also need some tests that check the behavior of Python -> JSONB
conversion when the object contains data that is not representable in JSON
format, e.g. set(), some custom objects, etc. 

5. PyObject_FromJsonbValue - I realize it's unlikely that the new
jsonbValue->type will be introduced any time soon. Still I believe it's a good
practice to add "it should never happen" default case that just does
elog(ERROR, ...) in case it happens nevertheless. Otherwise in this scenario
instead of reporting the error the code will silently do the wrong thing.

6. Well, you decided to make the extension non-relocatable. Could you at least
describe what prevents it to be relocatable or why it's meaningless is a
comment in .control file? Please note that almost all contrib/ extensions are
relocatable. I believe your extension should be relocatable as well unless
there is a good reason why it can't.

The new status of this patch is: Waiting on Author

Re: Jsonb transform for pl/python

От
Anthony Bykov
Дата:
On Mon, 13 Nov 2017 15:08:16 +0000
Aleksander Alekseev <a.alekseev@postgrespro.ru> wrote:

> The following review has been posted through the commitfest
> application: make installcheck-world:  tested, passed
> Implements feature:       tested, passed
> Spec compliant:           tested, passed
> Documentation:            tested, passed
> 
> Hi Anthony,
> 
> Thank you for the new version of the patch! Here is my code review.
> 
> 1. In jsonb_plpython2.out:
> 
> +CREATE FUNCTION back(val jsonb) RETURNS jsonb
> +LANGUAGE plpython2u
> +TRANSFORM FOR TYPE jsonb
> +as $$
> +return val
> +$$;
> +SELECT back('null'::jsonb);
> +  back  
> +--------
> + [null]
> +(1 row)
> +
> +SELECT back('1'::jsonb);
> + back 
> +------
> + [1]
> +(1 row)
> +
> +SELECT back('true'::jsonb);
> +  back  
> +--------
> + [true]
> +(1 row)
> 
> Maybe I'm missing something, but why exactly all JSONB values turn
> into arrays?
> 
> 2. Could you please also add tests for some negative and real
> numbers? Also could you check that your code handles numbers like
> MAX_INT, MIN_INT, +/- infinity and NaN properly in both (Python <->
> JSONB) directions?
> 
> 3. Handling unicode strings properly is another thing that is worth
> checking.
> 
> 4. I think we also need some tests that check the behavior of Python
> -> JSONB conversion when the object contains data that is not
> representable in JSON format, e.g. set(), some custom objects, etc. 
> 
> 5. PyObject_FromJsonbValue - I realize it's unlikely that the new
> jsonbValue->type will be introduced any time soon. Still I believe
> it's a good practice to add "it should never happen" default case
> that just does elog(ERROR, ...) in case it happens nevertheless.
> Otherwise in this scenario instead of reporting the error the code
> will silently do the wrong thing.
> 
> 6. Well, you decided to make the extension non-relocatable. Could you
> at least describe what prevents it to be relocatable or why it's
> meaningless is a comment in .control file? Please note that almost
> all contrib/ extensions are relocatable. I believe your extension
> should be relocatable as well unless there is a good reason why it
> can't.
> 
> The new status of this patch is: Waiting on Author

Hi,
thank you for your review. I took your comments into account in the
third version of the patch. In the new version, I've added all the
tests you asked for. The interesting thing is that:
1. set or any other non-jsonb-transformable object is transformed into
string and added to jsonb as a string. 
2. couldn't find a solution of working with "inf": this extension
troughs exception if python generates a number called "inf" and returns
it, but if we pass a very large integer into a function, it works fine
with the whole number. This situation can be seen in tests.

I've added tests of large numerics which weights quite heavy. So,
please find it in compressed format in attachments.


--
Anthony Bykov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Вложения

Re: Jsonb transform for pl/python

От
Aleksander Alekseev
Дата:
Hi Anthony,

> thank you for your review. I took your comments into account in the
> third version of the patch. In the new version, I've added all the
> tests you asked for. The interesting thing is that:
> 1. set or any other non-jsonb-transformable object is transformed into
> string and added to jsonb as a string.

Well frankly I very much doubt that this:

```
+-- set -> jsonb
+CREATE FUNCTION test1set() RETURNS jsonb
+LANGUAGE plpython2u
+TRANSFORM FOR TYPE jsonb
+AS $$
+x = set()
+x.add(1)
+x.add("string")
+x.add(None)
+return x
+$$;
+SELECT test1set();
+          test1set
+----------------------------
+ "set([1, 'string', None])"
+(1 row)
```

... is an expected and valid behavior. If user tries to transform a
set() to JSONB this is most likely a mistake since there is no standard
representation of a set() in JSONB. I believe we should rise an error in
this case instead of generating a string. Besides user can expect that
such string can be transformed back to set() which doesn't sound like a
good idea either.

If necessary, user can just transform a set() to a list():

```
>>> x = set([1,2,3,4])
>>> x
{1, 2, 3, 4}
>>> list(x)
[1, 2, 3, 4]
```

BTW I just recalled that Python supports complex numbers out-of-the box
and that range and xrange are a separate types too:

```
>>> 1 + 2j
(1+2j)
>>> range(3)
range(0, 3)
>>> type(range(3))
<class 'range'>
```

I think we should add all this to the tests as well. Naturally complex
numbers can't be represented in JSON so we should rise an error if user
tries to transform a complex number to JSON. I'm not that sure regarding
ranges though. Probably the best solution will be to rise and error in
this case as well just to keep things consistent.

> +ERROR:  jsonb doesn't support inf type yet

I would say this error message is too informal. How about something more
like "Infinity can't be represented in JSONB"?

> 2. couldn't find a solution of working with "inf": this extension
> troughs exception if python generates a number called "inf" and returns
> it, but if we pass a very large integer into a function, it works fine
> with the whole number. This situation can be seen in tests.
>
> I've added tests of large numerics which weights quite heavy. So,
> please find it in compressed format in attachments.

I'm afraid that tests fail on Python 3:

``` SELECT test1set();        test1set         -----------------------
!  "{None, 1, 'string'}" (1 row)  DROP EXTENSION plpython3u CASCADE;
--- 296,302 ---- SELECT test1set();        test1set         -----------------------
!  "{1, None, 'string'}" (1 row)  DROP EXTENSION plpython3u CASCADE
```

--
Best regards,
Aleksander Alekseev

Re: Jsonb transform for pl/python

От
Michael Paquier
Дата:
On Mon, Nov 20, 2017 at 10:48 PM, Aleksander Alekseev
<a.alekseev@postgrespro.ru> wrote:
> Well frankly I very much doubt that this:
> [snip]
> I'm afraid that tests fail on Python 3:

So this still needs more work.. I am marking it as returned with
feedback as there has been no updates for more than 1 week.
-- 
Michael


Re: Jsonb transform for pl/python

От
Anthony Bykov
Дата:
Hello,
fixed the issues:
1. Rising errors when invalid object being transformed.
2. We don't rise the exception when transforming range(3) only in
python 2. In third one it is an error.

Please, find the 4-th version of the patch in attachments to this
message. --
Anthony Bykov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Вложения

Re: Jsonb transform for pl/python

От
Peter Eisentraut
Дата:
On 12/6/17 06:40, Anthony Bykov wrote:
> Hello,
> fixed the issues:
> 1. Rising errors when invalid object being transformed.
> 2. We don't rise the exception when transforming range(3) only in
> python 2. In third one it is an error.
> 
> Please, find the 4-th version of the patch in attachments to this
> message.

Why not make this part of the plpythonu extension?  It doesn't have to
be a separate contrib module.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Jsonb transform for pl/python

От
Anthony Bykov
Дата:
On Sat, 9 Dec 2017 16:57:05 -0500
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:

> On 12/6/17 06:40, Anthony Bykov wrote:
> > Hello,
> > fixed the issues:
> > 1. Rising errors when invalid object being transformed.
> > 2. We don't rise the exception when transforming range(3) only in
> > python 2. In third one it is an error.
> > 
> > Please, find the 4-th version of the patch in attachments to this
> > message.  
> 
> Why not make this part of the plpythonu extension?  It doesn't have to
> be a separate contrib module.
> 

Hello,
I thought about that, but the problem is that there will be no
possibilities to create custom transform if we create this extension by
default. For example, it is easy to check if we install this extension
and try to create new transform:

# create extension jsonb_plperl cascade;
NOTICE:  installing required extension "plperl"
CREATE EXTENSION

# CREATE TRANSFORM FOR jsonb LANGUAGE plperl (
#     FROM SQL WITH FUNCTION jsonb_to_plperl(internal),
#     TO SQL WITH FUNCTION plperl_to_jsonb(internal)
# );
2017-12-11 10:23:07.507 MSK [19149] ERROR:  transform for type jsonb
language "plperl" already exists 2017-12-11 10:23:07.507 MSK [19149]
STATEMENT:  CREATE TRANSFORM FOR jsonb LANGUAGE plperl ( FROM SQL WITH
FUNCTION jsonb_to_plperl(internal), TO SQL WITH FUNCTION
plperl_to_jsonb(internal) );
ERROR:  transform for type jsonb language "plperl" already exists

Other types of transforms may be interesting for people when they want
to transform the jsonb to certain structures. For example, what if the
user wants the parameter to be always array inside the function, while
this extension can return integers or strings in some cases.

--
Anthony Bykov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: Jsonb transform for pl/python

От
Peter Eisentraut
Дата:
On 12/11/17 03:22, Anthony Bykov wrote:
>> Why not make this part of the plpythonu extension?  It doesn't have to
>> be a separate contrib module.
>>
> Hello,
> I thought about that, but the problem is that there will be no
> possibilities to create custom transform if we create this extension by
> default.

OK, could it be a separate extension, but part of the same code directory?

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Jsonb transform for pl/python

От
Tom Lane
Дата:
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
> On 12/11/17 03:22, Anthony Bykov wrote:
>>> Why not make this part of the plpythonu extension?  It doesn't have to
>>> be a separate contrib module.

>> I thought about that, but the problem is that there will be no
>> possibilities to create custom transform if we create this extension by
>> default.

> OK, could it be a separate extension, but part of the same code directory?

I think our makefile infrastructure only allows for one shlib to be
built per directory.

Admittedly, you could have two extensions sharing the same shlib
(and the same regression test suite), but on the whole it's not clear
to me why we should do that.

            regards, tom lane


Re: Jsonb transform for pl/python

От
Thomas Munro
Дата:
On Thu, Dec 7, 2017 at 12:40 AM, Anthony Bykov <a.bykov@postgrespro.ru> wrote:
> Hello,
> fixed the issues:
> 1. Rising errors when invalid object being transformed.
> 2. We don't rise the exception when transforming range(3) only in
> python 2. In third one it is an error.
>
> Please, find the 4-th version of the patch in attachments to this
> message. --

Hi Anthony,

FYI make docs fails:

json.sgml:584: parser error : Opening and ending tag mismatch: xref
line 581 and para
  </para>
         ^
json.sgml:585: parser error : Opening and ending tag mismatch: para
line 575 and sect2
 </sect2>
         ^
json.sgml:588: parser error : Opening and ending tag mismatch: sect2
line 572 and sect1
</sect1>
        ^
json.sgml:589: parser error : Premature end of data in tag sect1 line 3
json.sgml:589: parser error : chunk is not well balanced
datatype.sgml:4354: parser error : Failure to process entity json
  &json;
        ^
datatype.sgml:4354: parser error : Entity 'json' not defined
  &json;
        ^
datatype.sgml:4955: parser error : chunk is not well balanced
postgres.sgml:104: parser error : Failure to process entity datatype
  &datatype;
            ^
postgres.sgml:104: parser error : Entity 'datatype' not defined
  &datatype;
            ^

-- 
Thomas Munro
http://www.enterprisedb.com


Re: Jsonb transform for pl/python

От
Anthony Bykov
Дата:
On Fri, 12 Jan 2018 13:33:56 +1300
Thomas Munro <thomas.munro@enterprisedb.com> wrote:

> On Thu, Dec 7, 2017 at 12:40 AM, Anthony Bykov
> <a.bykov@postgrespro.ru> wrote:
> > Hello,
> > fixed the issues:
> > 1. Rising errors when invalid object being transformed.
> > 2. We don't rise the exception when transforming range(3) only in
> > python 2. In third one it is an error.
> >
> > Please, find the 4-th version of the patch in attachments to this
> > message. --  
> 
> Hi Anthony,
> 
> FYI make docs fails:
> 
> json.sgml:584: parser error : Opening and ending tag mismatch: xref
> line 581 and para
>   </para>
>          ^
> json.sgml:585: parser error : Opening and ending tag mismatch: para
> line 575 and sect2
>  </sect2>
>          ^
> json.sgml:588: parser error : Opening and ending tag mismatch: sect2
> line 572 and sect1
> </sect1>
>         ^
> json.sgml:589: parser error : Premature end of data in tag sect1 line
> 3 json.sgml:589: parser error : chunk is not well balanced
> datatype.sgml:4354: parser error : Failure to process entity json
>   &json;
>         ^
> datatype.sgml:4354: parser error : Entity 'json' not defined
>   &json;
>         ^
> datatype.sgml:4955: parser error : chunk is not well balanced
> postgres.sgml:104: parser error : Failure to process entity datatype
>   &datatype;
>             ^
> postgres.sgml:104: parser error : Entity 'datatype' not defined
>   &datatype;
>             ^
> 

Hello, thank you for your message.

Fixed it. Here is a new version of the patch.

--
Anthony Bykov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Вложения

Re: Jsonb transform for pl/python

От
Aleksander Alekseev
Дата:
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:       tested, passed
Spec compliant:           tested, passed
Documentation:            tested, passed

LGTM.

The new status of this patch is: Ready for Committer

Re: Jsonb transform for pl/python

От
Peter Eisentraut
Дата:
On 1/12/18 10:43, Aleksander Alekseev wrote:
> The following review has been posted through the commitfest application:
> make installcheck-world:  tested, passed
> Implements feature:       tested, passed
> Spec compliant:           tested, passed
> Documentation:            tested, passed
> 
> LGTM.
> 
> The new status of this patch is: Ready for Committer

I've been working on polishing this a bit.  I'll keep working on it.  It
should be ready to commit soon.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Jsonb transform for pl/python

От
Nikita Glukhov
Дата:
On 01.02.2018 19:06, Peter Eisentraut wrote:
> On 1/12/18 10:43, Aleksander Alekseev wrote:
>> The following review has been posted through the commitfest application:
>> make installcheck-world:  tested, passed
>> Implements feature:       tested, passed
>> Spec compliant:           tested, passed
>> Documentation:            tested, passed
>>
>> LGTM.
>>
>> The new status of this patch is: Ready for Committer
> I've been working on polishing this a bit.  I'll keep working on it.  It
> should be ready to commit soon.

Hi.

I have reviewed this patch.  Attached new 6th version of the patch with
v5-v6 delta-patch.

  * Added out of memory checks after the following function calls:
    - PyList_New()
    - PyDict_New()
    - PyString_FromStringAndSize()  (added PyString_FromJsonbValue())

  * Added Py_XDECREF() for key-value pairs and list elements after theirs
    appending because PyDict_SetItem() and PyList_Append() do not steal
    references (see also hstore_plpython code).

  * Removed unnecessary JsonbValue heap-allocations in PyObject_ToJsonbValue().

  * Added iterating to the end of iterator in PyObject_FromJsonb() for correct
    freeing of JsonbIterators.

  * Passed JsonbParseState ** to PyXxx_ToJsonbValue() functions.

  * Added transformation of Python tuples into JSON arrays because standard
    Python JSONEncoder encoder does the same.
    (See https://docs.python.org/2/library/json.html#py-to-json-table)


-- 
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Вложения

Re: Jsonb transform for pl/python

От
Peter Eisentraut
Дата:
On 3/12/18 11:26, Nikita Glukhov wrote:
> I have reviewed this patch.  Attached new 6th version of the patch with
> v5-v6 delta-patch.

Thanks for the update.  I'm working on committing this.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Jsonb transform for pl/python

От
Peter Eisentraut
Дата:
On 3/21/18 18:50, Peter Eisentraut wrote:
> On 3/12/18 11:26, Nikita Glukhov wrote:
>> I have reviewed this patch.  Attached new 6th version of the patch with
>> v5-v6 delta-patch.
> 
> Thanks for the update.  I'm working on committing this.

Committed.

Everything seemed to work well.  I just did a bit of cosmetic cleanups.
I did a fair amount of tweaking on the naming of functions, the
extensions and library names, etc., to make it look like the existing
plpython transforms.  I also changed it so that the transform would
support mappings and sequences other than dict and list.  I removed the
non-ASCII test and the test with the huge numbers.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Jsonb transform for pl/python

От
Nikita Glukhov
Дата:
On 28.03.2018 15:43, Peter Eisentraut wrote:

> On 3/21/18 18:50, Peter Eisentraut wrote:
>> On 3/12/18 11:26, Nikita Glukhov wrote:
>>> I have reviewed this patch.  Attached new 6th version of the patch with
>>> v5-v6 delta-patch.
>> Thanks for the update.  I'm working on committing this.
> Committed.
>
> Everything seemed to work well.  I just did a bit of cosmetic cleanups.
> I did a fair amount of tweaking on the naming of functions, the
> extensions and library names, etc., to make it look like the existing
> plpython transforms.  I also changed it so that the transform would
> support mappings and sequences other than dict and list.  I removed the
> non-ASCII test and the test with the huge numbers.


I found a memory leak in PLySequence_ToJsonbValue():
PyObject returned from PySequence_GetItem() is not released.

A bug can be easily reproduced using function roudtrip() from regression test:
SELECT roundtrip('[1,2,3]'::jsonb) FROM generate_series(1, 1000000);

Similar code in PLyMapping_ToJsonbValue() seems to be correct because
PyList_GetItem() and PyTuple_GetItem() return a borrowed reference.

Patch with fix is attached.

-- 
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Вложения

Re: Jsonb transform for pl/python

От
Alexander Korotkov
Дата:
Hi!

On Fri, Jun 15, 2018 at 2:11 PM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
>
> On 28.03.2018 15:43, Peter Eisentraut wrote:
>
> > On 3/21/18 18:50, Peter Eisentraut wrote:
> >> On 3/12/18 11:26, Nikita Glukhov wrote:
> >>> I have reviewed this patch.  Attached new 6th version of the patch with
> >>> v5-v6 delta-patch.
> >> Thanks for the update.  I'm working on committing this.
> > Committed.
> >
> > Everything seemed to work well.  I just did a bit of cosmetic cleanups.
> > I did a fair amount of tweaking on the naming of functions, the
> > extensions and library names, etc., to make it look like the existing
> > plpython transforms.  I also changed it so that the transform would
> > support mappings and sequences other than dict and list.  I removed the
> > non-ASCII test and the test with the huge numbers.
>
>
> I found a memory leak in PLySequence_ToJsonbValue():
> PyObject returned from PySequence_GetItem() is not released.
>
> A bug can be easily reproduced using function roudtrip() from regression test:
> SELECT roundtrip('[1,2,3]'::jsonb) FROM generate_series(1, 1000000);
>
> Similar code in PLyMapping_ToJsonbValue() seems to be correct because
> PyList_GetItem() and PyTuple_GetItem() return a borrowed reference.
>
> Patch with fix is attached.

I'm going to check and commit this if everything is OK.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: Jsonb transform for pl/python

От
Nikita Glukhov
Дата:
Hi!

We have found another performance problem in this transform -- very slow
conversion via I/O from PostgreSQL numerics (which are used for the
representation of jsonb numbers) to Python Decimals.


Attached patch with fix.

We are simply trying first to convert numeric to int64 if is does not have
digits after the decimal point, and then construct Python Int instead of
Decimal.  Standard Python json.loads() does the same for exact integers.

A special function numeric_to_exact_int64() was added to numeric.c.  Existing
numeric_int8() can't be used here because it rounds input numeric.



Performance results (see the second attached file jsonb_plplython_tests.sql
for the function definitions):

  - calculating the length of the passed jsonb object (input transformation):

    py_jsonb_length_trans opt            2761,873 ms
    py_jsonb_length_trans               10419,230 ms
    py_jsonb_length_json                 8691,201 ms

  - returning integer arrays (output transformation):

    py_jsonb_ret_int_array_trans opt     3284,810 ms
    py_jsonb_ret_int_array_trans         4540,063 ms
    py_jsonb_ret_int_array_raw           5100,793 ms
    py_jsonb_ret_int_array_json          9887,821 ms

  - returning float arrays (output transformation):

    py_jsonb_ret_float_array_trans opt   5699,360 ms
    py_jsonb_ret_float_array_trans       5735,854 ms
    py_jsonb_ret_float_array_raw         6516,514 ms
    py_jsonb_ret_float_array_json       10869,213 ms


-- 
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Вложения

Re: Jsonb transform for pl/python

От
Peter Eisentraut
Дата:
On 6/23/18 01:44, Nikita Glukhov wrote:
> We are simply trying first to convert numeric to int64 if is does not have
> digits after the decimal point, and then construct Python Int instead of
> Decimal.  Standard Python json.loads() does the same for exact integers.

We just had a very similar but not identical discussion in the thread
about the PL/Perl transforms, where we rejected the proposal.  Other
comments on this one?

In any case, this might well be a consideration for PG12, not a bug in
the existing implementation.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Jsonb transform for pl/python

От
Tom Lane
Дата:
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
> On 6/23/18 01:44, Nikita Glukhov wrote:
>> We are simply trying first to convert numeric to int64 if is does not have
>> digits after the decimal point, and then construct Python Int instead of
>> Decimal.  Standard Python json.loads() does the same for exact integers.

> We just had a very similar but not identical discussion in the thread
> about the PL/Perl transforms, where we rejected the proposal.  Other
> comments on this one?

I can sympathize with the speed concern, but the proposed patch produces
a functional change (ie, you get a different kind of Python object, with
different behaviors, in some cases).  That seems to me to be a good reason
to reject it.  The fact that it makes the behavior vary depending on the
local width of "long" is also a problem, although I think that could be
fixed.

More generally, I'd be happier with a patch that sped things up for
non-integers too.  I don't know whether Python exposes enough internals
of type Decimal to make that practical, though.  One idea for doing it at
arm's length is to compute the Decimal value using numeric-digit-at-a-time
arithmetic, roughly

    x := 0;
    foreach(digit, numeric)
        x.fma(10000, digit);
    x.scaleb(appropriate-exponent);

In principle that's probably faster than string conversion, but not
sure by how much.

            regards, tom lane


Re: Jsonb transform for pl/python

От
Alexander Korotkov
Дата:
On Wed, Jul 11, 2018 at 12:30 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
> > On 6/23/18 01:44, Nikita Glukhov wrote:
> >> We are simply trying first to convert numeric to int64 if is does not have
> >> digits after the decimal point, and then construct Python Int instead of
> >> Decimal.  Standard Python json.loads() does the same for exact integers.
>
> > We just had a very similar but not identical discussion in the thread
> > about the PL/Perl transforms, where we rejected the proposal.  Other
> > comments on this one?
>
> I can sympathize with the speed concern, but the proposed patch produces
> a functional change (ie, you get a different kind of Python object, with
> different behaviors, in some cases).  That seems to me to be a good reason
> to reject it.

Nevertheless, as Nikita pointed, json.loads() performs the same in
Python.  See an example for python 2.7.10.

>>> import json
>>> type(json.loads('1'))
<type 'int'>
>>> type(json.loads('1.0'))
<type 'float'>

So, from postgres point of view this behavior is wrong.  But on
another hand why don't let python transform to behave like a python?

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: Jsonb transform for pl/python

От
Nikita Glukhov
Дата:
Working on the new lazy transform for jsonb I found another memory leak in
PLyObject_ToJsonbValue(): palloc() for output boolean JsonbValue is unnecessary,
'out' variable is already initialized.

Fix is attached.

On 15.06.2018 14:42, Alexander Korotkov wrote:
> Hi!
>
> On Fri, Jun 15, 2018 at 2:11 PM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
>
> I found a memory leak in PLySequence_ToJsonbValue():
> PyObject returned from PySequence_GetItem() is not released.
>
> A bug can be easily reproduced using function roudtrip() from regression test:
> SELECT roundtrip('[1,2,3]'::jsonb) FROM generate_series(1, 1000000);
>
> Similar code in PLyMapping_ToJsonbValue() seems to be correct because
> PyList_GetItem() and PyTuple_GetItem() return a borrowed reference.
>
> Patch with fix is attached.
> I'm going to check and commit this if everything is OK.
>
> ------
> Alexander Korotkov
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company

-- 
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Вложения

Re: Jsonb transform for pl/python

От
Peter Eisentraut
Дата:
On 27/09/2018 16:58, Nikita Glukhov wrote:
> Working on the new lazy transform for jsonb I found another memory leak in
> PLyObject_ToJsonbValue(): palloc() for output boolean JsonbValue is unnecessary,
> 'out' variable is already initialized.
> 
> Fix is attached.

Committed, thanks.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services