Re: [PATCH] Fix conversion for Decimal arguments in plpython functions

Поиск
Список
Период
Сортировка
От Szymon Guz
Тема Re: [PATCH] Fix conversion for Decimal arguments in plpython functions
Дата
Msg-id CAFjNrYuZL74uRAey2_7Dk3CYNfhO7mzS3a_8cPYntds3=V1HrA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [PATCH] Fix conversion for Decimal arguments in plpython functions  (Steve Singer <steve@ssinger.info>)
Список pgsql-hackers
On 25 June 2013 05:16, Steve Singer <steve@ssinger.info> wrote:

One concern I have is that this patch makes pl/python functions involving numerics more than 3 times as slow as before.


create temp table b(a numeric);
insert into b select generate_series(1,10000);

create or replace function x(a numeric,b numeric) returns numeric as $$
if a==None:
  return b
return a+b
$$ language plpythonu;
create aggregate sm(basetype=numeric, sfunc=x,stype=numeric);


test=# select sm(a) from b;
    sm
----------
 50005000
(1 row)

Time: 565.650 ms

versus before the patch this was taking in the range of 80ms.

Would it be faster to call numeric_send instead of numeric_out and then convert the sequence of Int16's to a tuple of digits that can be passed into the Decimal constructor? I think this is worth trying and testing,


Hi,
thanks for all the remarks.

I think I cannot do anything about speeding up the code. What I've found so far is:

I cannot use simple fields from NumericVar in my code, so to not waste time on something not sensible, I've tried to found out if using the tuple constructor for decimal.Decimal will be faster. I've changed the function to something like this:

static PyObject *
PLyDecimal_FromNumeric(PLyDatumToOb *arg, Datum d)
{       
        PyObject *digits = PyTuple_New(4);
        PyTuple_SetItem(digits, 0, PyInt_FromLong(1));
        PyTuple_SetItem(digits, 1, PyInt_FromLong(4));
        PyTuple_SetItem(digits, 2, PyInt_FromLong(1));
        PyTuple_SetItem(digits, 3, PyInt_FromLong(4));

        PyObject *tuple = PyTuple_New(3);
        PyTuple_SetItem(tuple, 0, PyInt_FromLong(1));
        PyTuple_SetItem(tuple, 1, digits);
        PyTuple_SetItem(tuple, 2, PyInt_FromLong(-3));
       
        value = PyObject_CallFunctionObjArgs(PLy_decimal_ctor_global, tuple, NULL);

        return value;
}

Yes, it returns the same value regardless the params. The idea is to call Python code like:

Decimal((0, (1,  4, 1, 4), -3))

which is simply:

Decimal('1.414')

Unfortunately this is not faster. It is as slow as it was with string constructor.

I've checked the speed of decimal.Decimal using pure python. For this I used a simple function, similar to yours:


def x(a, b):
    if a is None:
        return b
    return a + b

I've run the tests using simple ints:


def test():
    a = 0
    for i in xrange(0, 10000):
        a += x(a, i)


for a in xrange(1, 100):
    test()


And later I've run the same function, but with converting the arguments to Decimals:

from decimal import Decimal


def x(a, b):
    if a is None:
        return b
    return a + b


def test():
    a = 0
    for i in xrange(0, 10000):
        a += x(Decimal(a), Decimal(i))


for a in xrange(1, 100):
    test()

It was run 100 times for decreasing the impact of test initialization.

The results for both files are:
int:  0.697s
decimal: 38.859s

What gives average time for one function call of:
int: 69ms
decimal: 380ms


For me the problem is with slow code at Python's side, the Decimal constructors are pretty slow, and there is nothing I can do with that at the Postgres' side.


I will send patch with fixes later.


thanks,
Szymon

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: PostgreSQL 9.3 latest dev snapshot
Следующее
От: Pavel Stehule
Дата:
Сообщение: Re: proposal: enable new error fields in plpgsql (9.4)