Обсуждение: Why overhead of SPI is so large?

Поиск
Список
Период
Сортировка

Why overhead of SPI is so large?

От
Konstantin Knizhnik
Дата:
Hi, hackers.

One of our customers complains about slow execution of PL/pgSQL 
functions comparing with Oracle.
So he wants to compile PL/pgSQL functions (most likely just-in-time 
compilation).
Certainly interpreter adds quite large overhead comparing with native 
code (~10 times) but
most of PL/pgSQL functions are just running some SQL queues and 
iterating through results.

I can not believe that JIT can significantly speed-up such functions.
So I decided to make simple experiment:  I created large enough table 
and implemented functions
which calculates norm of one column in different languages.

Results are frustrating (at least for me):

PL/pgSQL:   29044.361 ms
C/SPI:          22785.597 ms
С/coreAPI:     2873.072 ms
PL/Lua:        33445.520 ms
SQL:              7397.639 ms (with parallel execution disabled)

The fact that difference between PL/pgSQL and function implemented in C 
using SPI is not so large  was expected by me.
But why it is more than 3 time slower than correspondent SQL query?
The answer seems to be in the third result: the same function in C 
implemented without SPI (usign table_beginscan/heap_getnext)
Looks like SPI adds quite significant overhead.
And as far as almost all PL languages are using SPI, them all suffer 
from it.

Below is profile of SPI function execution:

   9.47%  postgres  libc-2.23.so       [.] __memcpy_avx_unaligned
    9.19%  postgres  spitest.so         [.] spi_norm
    8.09%  postgres  postgres           [.] AllocSetAlloc
    4.50%  postgres  postgres           [.] tts_buffer_heap_getsomeattrs
    4.36%  postgres  postgres           [.] heap_form_tuple
    3.41%  postgres  postgres           [.] standard_ExecutorRun
    3.35%  postgres  postgres           [.] ExecScan
    3.31%  postgres  postgres           [.] palloc0
    2.41%  postgres  postgres           [.] heapgettup_pagemode
    2.40%  postgres  postgres           [.] AllocSetReset
    2.25%  postgres  postgres           [.] PopActiveSnapshot
    2.17%  postgres  postgres           [.] PortalRunFetch
    2.16%  postgres  postgres           [.] HeapTupleSatisfiesVisibility
    1.97%  postgres  libc-2.23.so       [.] __sigsetjmp
    1.90%  postgres  postgres           [.] _SPI_cursor_operation
    1.87%  postgres  postgres           [.] AllocSetFree
    1.86%  postgres  postgres           [.] PortalRunSelect
    1.79%  postgres  postgres           [.] heap_getnextslot
    1.75%  postgres  postgres           [.] heap_fill_tuple
    1.70%  postgres  postgres           [.] spi_dest_startup
    1.50%  postgres  postgres           [.] spi_printtup
    1.49%  postgres  postgres           [.] nocachegetattr
    1.45%  postgres  postgres           [.] MemoryContextDelete
    1.44%  postgres  postgres           [.] ExecJustAssignScanVar
    1.38%  postgres  postgres           [.] CreateTupleDescCopy
    1.33%  postgres  postgres           [.] SPI_getbinval
    1.30%  postgres  postgres           [.] PushActiveSnapshot
    1.30%  postgres  postgres           [.] AllocSetContextCreateInternal
    1.22%  postgres  postgres           [.] heap_compute_data_size
    1.22%  postgres  postgres           [.] MemoryContextCreate
    1.14%  postgres  postgres           [.] heapgetpage
    1.05%  postgres  postgres           [.] palloc
    1.03%  postgres  postgres           [.] SeqNext

As you can see, most of the time is spent in allocation and copying memory.
I wonder if somebody tried to address this problem and are there some 
plans for improving speed of PL/pgSQL and other
stored languages?

I attached to this mail sources of spi_test extension with my experiments.
Please build it and run norm.sql file.

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Вложения

RE: Why overhead of SPI is so large?

От
"Tsunakawa, Takayuki"
Дата:
From: Konstantin Knizhnik [mailto:k.knizhnik@postgrespro.ru]
> PL/pgSQL:   29044.361 ms
> C/SPI:          22785.597 ms
> 
> The fact that difference between PL/pgSQL and function implemented in C
> using SPI is not so large  was expected by me.

This PL/pgSQL overhead is not so significant compared with the three times, but makes me desire some feature like
Oracle'sALTER PROCEDURE ... COMPILE; that compiles the PL/SQL logic to native code.  I've seen a few dozen percent
speedup.
 


Regards
Takayuki Tsunakawa


Re: Why overhead of SPI is so large?

От
Kyotaro Horiguchi
Дата:
Hello.

At Wed, 21 Aug 2019 19:41:08 +0300, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote in
<ed9da20e-01aa-d04b-d085-e6c16b14b9d7@postgrespro.ru>
> Hi, hackers.
> 
> One of our customers complains about slow execution of PL/pgSQL
> functions comparing with Oracle.
> So he wants to compile PL/pgSQL functions (most likely just-in-time
> compilation).
> Certainly interpreter adds quite large overhead comparing with native
> code (~10 times) but
> most of PL/pgSQL functions are just running some SQL queues and
> iterating through results.
> 
> I can not believe that JIT can significantly speed-up such functions.
> So I decided to make simple experiment:  I created large enough table
> and implemented functions
> which calculates norm of one column in different languages.
> 
> Results are frustrating (at least for me):
> 
> PL/pgSQL:   29044.361 ms
> C/SPI:          22785.597 ms
> С/coreAPI:     2873.072 ms
> PL/Lua:        33445.520 ms
> SQL:              7397.639 ms (with parallel execution disabled)
> 
> The fact that difference between PL/pgSQL and function implemented in
> C using SPI is not so large  was expected by me.
> But why it is more than 3 time slower than correspondent SQL query?
> The answer seems to be in the third result: the same function in C
> implemented without SPI (usign table_beginscan/heap_getnext)
> Looks like SPI adds quite significant overhead.
> And as far as almost all PL languages are using SPI, them all suffer
> from it.

As far as looking the attached spitest.c, it seems that the
overhead comes from cursor operation, not from SPI. As far as
spitest.c goes, cursor is useless.  "SQL" and C/coreAPI seem to
be scanning over the result from *a single* query. If that's
correct, why don't you use SPI_execute() then scan over
SPI_tuptable?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: Why overhead of SPI is so large?

От
Konstantin Knizhnik
Дата:

On 22.08.2019 5:40, Kyotaro Horiguchi wrote:
> Hello.
>
> At Wed, 21 Aug 2019 19:41:08 +0300, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote in
<ed9da20e-01aa-d04b-d085-e6c16b14b9d7@postgrespro.ru>
>> Hi, hackers.
>>
>> One of our customers complains about slow execution of PL/pgSQL
>> functions comparing with Oracle.
>> So he wants to compile PL/pgSQL functions (most likely just-in-time
>> compilation).
>> Certainly interpreter adds quite large overhead comparing with native
>> code (~10 times) but
>> most of PL/pgSQL functions are just running some SQL queues and
>> iterating through results.
>>
>> I can not believe that JIT can significantly speed-up such functions.
>> So I decided to make simple experiment:  I created large enough table
>> and implemented functions
>> which calculates norm of one column in different languages.
>>
>> Results are frustrating (at least for me):
>>
>> PL/pgSQL:   29044.361 ms
>> C/SPI:          22785.597 ms
>> С/coreAPI:     2873.072 ms
>> PL/Lua:        33445.520 ms
>> SQL:              7397.639 ms (with parallel execution disabled)
>>
>> The fact that difference between PL/pgSQL and function implemented in
>> C using SPI is not so large  was expected by me.
>> But why it is more than 3 time slower than correspondent SQL query?
>> The answer seems to be in the third result: the same function in C
>> implemented without SPI (usign table_beginscan/heap_getnext)
>> Looks like SPI adds quite significant overhead.
>> And as far as almost all PL languages are using SPI, them all suffer
>> from it.
> As far as looking the attached spitest.c, it seems that the
> overhead comes from cursor operation, not from SPI. As far as
> spitest.c goes, cursor is useless.  "SQL" and C/coreAPI seem to
> be scanning over the result from *a single* query. If that's
> correct, why don't you use SPI_execute() then scan over
> SPI_tuptable?

Scanned table is very large and doesn't fir in memory.
This is why I am using SPI cursors.
Please let me know if there is more efficient way to traverse larger 
table using SPI.


>
> regards.
>

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: Why overhead of SPI is so large?

От
Konstantin Knizhnik
Дата:


On 22.08.2019 3:27, Tsunakawa, Takayuki wrote:
From: Konstantin Knizhnik [mailto:k.knizhnik@postgrespro.ru]
PL/pgSQL:   29044.361 ms
C/SPI:          22785.597 ms

The fact that difference between PL/pgSQL and function implemented in C
using SPI is not so large  was expected by me.
This PL/pgSQL overhead is not so significant compared with the three times, but makes me desire some feature like Oracle's ALTER PROCEDURE ... COMPILE; that compiles the PL/SQL logic to native code.  I've seen a few dozen percent speed up.


Actually my implementation of C/SPI version is not optimal: it is better to fetch several records:

    while (true)
    {
        SPI_cursor_fetch(portal, true, 100);
        if (SPI_processed) {
            for (i = 0; i < SPI_processed; i++) {
                HeapTuple spi_tuple = SPI_tuptable->vals[i];
                Datum val = SPI_getbinval(spi_tuple, SPI_tuptable->tupdesc, 1, &is_null);
                double x = DatumGetFloat8(val);
                result += x*x;
                SPI_freetuple(spi_tuple);
            }
            SPI_freetuptable(SPI_tuptable);
        } else
            break;
    }

This version shows result 9405.694 ms which is comparable with result of SQL query.
Unfortunately (or fortunately) PL/pgSQL is already using prefetch. If it is disables (when iterate through explicitly created cursor), time of query execution is increased almost twice (46552.935 ms)

So PL/SPI ratio is more than three times.

Updatede results are the following:


Impementation
time (ms)
PL/Lua32220
PL/pgSQL29044
PL/pgSQL (JIT)
27594
C/SPI  9406
SQL  7399
SQL (JIT) 
  5532
С/coreAPI  2873
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: Why overhead of SPI is so large?

От
Konstantin Knizhnik
Дата:
Some more information...
First of all I found out that marking PL/pgSQL function as immutable significantly increase speed of its execution:
19808 ms vs. 27594. It happens because exec_eval_simple_expr is taken snapshot if function is volatile (default).
I wonder if PL/pgSQL compiler can detect that evaluated expression itself is actually immutable  and there is no need to take snapshot
for each invocation of this function. Also I have tried yet another PL language - JavaScript, which is now new outsider, despite to the fact that
v8 JIT compiler is very good.

Implementation
time (ms)
PL/v8
41550
PL/Lua32220
PL/pgSQL19808
C/SPI  9406
SQL  7399
SQL (JIT) 
  5532
С/coreAPI  2873
-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: Why overhead of SPI is so large?

От
Pavel Stehule
Дата:


čt 22. 8. 2019 v 17:51 odesílatel Konstantin Knizhnik <k.knizhnik@postgrespro.ru> napsal:
Some more information...
First of all I found out that marking PL/pgSQL function as immutable significantly increase speed of its execution:
19808 ms vs. 27594. It happens because exec_eval_simple_expr is taken snapshot if function is volatile (default).
I wonder if PL/pgSQL compiler can detect that evaluated expression itself is actually immutable  and there is no need to take snapshot
for each invocation of this function. Also I have tried yet another PL language - JavaScript, which is now new outsider, despite to the fact that
v8 JIT compiler is very good.

I have a plan to do some work in this direction. Snapshot is not necessary for almost buildin functions. If expr calls only buildin functions, then probably can be called without snapshot and without any work with plan cache.

Pavel

Implementation
time (ms)
PL/v8
41550
PL/Lua32220
PL/pgSQL19808
C/SPI  9406
SQL  7399
SQL (JIT) 
  5532
С/coreAPI  2873
-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: Why overhead of SPI is so large?

От
Konstantin Knizhnik
Дата:


On 22.08.2019 18:56, Pavel Stehule wrote:


čt 22. 8. 2019 v 17:51 odesílatel Konstantin Knizhnik <k.knizhnik@postgrespro.ru> napsal:
Some more information...
First of all I found out that marking PL/pgSQL function as immutable significantly increase speed of its execution:
19808 ms vs. 27594. It happens because exec_eval_simple_expr is taken snapshot if function is volatile (default).
I wonder if PL/pgSQL compiler can detect that evaluated expression itself is actually immutable  and there is no need to take snapshot
for each invocation of this function. Also I have tried yet another PL language - JavaScript, which is now new outsider, despite to the fact that
v8 JIT compiler is very good.

I have a plan to do some work in this direction. Snapshot is not necessary for almost buildin functions. If expr calls only buildin functions, then probably can be called without snapshot and without any work with plan cache.


I wonder if the following simple patch is correct?


-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 
Вложения

Re: Why overhead of SPI is so large?

От
Pavel Stehule
Дата:


pá 23. 8. 2019 v 11:05 odesílatel Konstantin Knizhnik <k.knizhnik@postgrespro.ru> napsal:


On 22.08.2019 18:56, Pavel Stehule wrote:


čt 22. 8. 2019 v 17:51 odesílatel Konstantin Knizhnik <k.knizhnik@postgrespro.ru> napsal:
Some more information...
First of all I found out that marking PL/pgSQL function as immutable significantly increase speed of its execution:
19808 ms vs. 27594. It happens because exec_eval_simple_expr is taken snapshot if function is volatile (default).
I wonder if PL/pgSQL compiler can detect that evaluated expression itself is actually immutable  and there is no need to take snapshot
for each invocation of this function. Also I have tried yet another PL language - JavaScript, which is now new outsider, despite to the fact that
v8 JIT compiler is very good.

I have a plan to do some work in this direction. Snapshot is not necessary for almost buildin functions. If expr calls only buildin functions, then probably can be called without snapshot and without any work with plan cache.


I wonder if the following simple patch is correct?

You cannot to believe to user defined functions so immutable flag is correct. Only buildin functions are 100% correct.

CREATE OR REPLACE FUNCTION foo()
RETURNS int AS $$
SELECT count(*) FROM pg_class;
$$ LANGUAGE sql IMMUTABLE;

is working.

But your patch is good enough for benchmarking.

Pavel




-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: Why overhead of SPI is so large?

От
Konstantin Knizhnik
Дата:


On 23.08.2019 12:10, Pavel Stehule wrote:


pá 23. 8. 2019 v 11:05 odesílatel Konstantin Knizhnik <k.knizhnik@postgrespro.ru> napsal:


On 22.08.2019 18:56, Pavel Stehule wrote:


čt 22. 8. 2019 v 17:51 odesílatel Konstantin Knizhnik <k.knizhnik@postgrespro.ru> napsal:
Some more information...
First of all I found out that marking PL/pgSQL function as immutable significantly increase speed of its execution:
19808 ms vs. 27594. It happens because exec_eval_simple_expr is taken snapshot if function is volatile (default).
I wonder if PL/pgSQL compiler can detect that evaluated expression itself is actually immutable  and there is no need to take snapshot
for each invocation of this function. Also I have tried yet another PL language - JavaScript, which is now new outsider, despite to the fact that
v8 JIT compiler is very good.

I have a plan to do some work in this direction. Snapshot is not necessary for almost buildin functions. If expr calls only buildin functions, then probably can be called without snapshot and without any work with plan cache.


I wonder if the following simple patch is correct?

You cannot to believe to user defined functions so immutable flag is correct. Only buildin functions are 100% correct.

CREATE OR REPLACE FUNCTION foo()
RETURNS int AS $$
SELECT count(*) FROM pg_class;
$$ LANGUAGE sql IMMUTABLE;

is working.

But such definition of the function contradicts IMMUTABLE contract, doesn't it?
If creator of the function incorrectly classify it, then usage of such function can cause incorrect behavior.
For example, if function is marked as "parallel safe" but actually it is not parallel safe, then using it in parallel plan may cause incorrect results.
But it is a reason for disabling parallel plans for all user defined functions, isn't it?

Also nothing terrible will happen in any case. If expression is calling function which is marked is immutable but actually is not,  then we can just get old (deteriorated)
result of expression. Right now, if caller function (one containing evaluated expression) is marked as non-volatile, then snapshot is also not taken.
So if such function expression is calling foo() function as declared above, then results will be also incorrect.
So I do not think some principle difference here and do not understand why we should not believe user (function creator) only in this case.


-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: Why overhead of SPI is so large?

От
Pavel Stehule
Дата:


pá 23. 8. 2019 v 13:21 odesílatel Konstantin Knizhnik <k.knizhnik@postgrespro.ru> napsal:


On 23.08.2019 12:10, Pavel Stehule wrote:


pá 23. 8. 2019 v 11:05 odesílatel Konstantin Knizhnik <k.knizhnik@postgrespro.ru> napsal:


On 22.08.2019 18:56, Pavel Stehule wrote:


čt 22. 8. 2019 v 17:51 odesílatel Konstantin Knizhnik <k.knizhnik@postgrespro.ru> napsal:
Some more information...
First of all I found out that marking PL/pgSQL function as immutable significantly increase speed of its execution:
19808 ms vs. 27594. It happens because exec_eval_simple_expr is taken snapshot if function is volatile (default).
I wonder if PL/pgSQL compiler can detect that evaluated expression itself is actually immutable  and there is no need to take snapshot
for each invocation of this function. Also I have tried yet another PL language - JavaScript, which is now new outsider, despite to the fact that
v8 JIT compiler is very good.

I have a plan to do some work in this direction. Snapshot is not necessary for almost buildin functions. If expr calls only buildin functions, then probably can be called without snapshot and without any work with plan cache.


I wonder if the following simple patch is correct?

You cannot to believe to user defined functions so immutable flag is correct. Only buildin functions are 100% correct.

CREATE OR REPLACE FUNCTION foo()
RETURNS int AS $$
SELECT count(*) FROM pg_class;
$$ LANGUAGE sql IMMUTABLE;

is working.

But such definition of the function contradicts IMMUTABLE contract, doesn't it?
If creator of the function incorrectly classify it, then usage of such function can cause incorrect behavior.
For example, if function is marked as "parallel safe" but actually it is not parallel safe, then using it in parallel plan may cause incorrect results.
But it is a reason for disabling parallel plans for all user defined functions, isn't it?

In reality it is not IMMUTABLE function. On second hand, there are lot of application that depends on this behave.

It is well know trick how to reduce estimation errors related to JOINs. When immutable function has constant parameters, then it is evaluated in planning time.

So sometimes was necessary to use

SELECT ... FROM tab WHERE foreign_key = immutable_function('constant parameter')

instead JOIN.

It is ugly, but it is working perfectly. I think so until we will have multi table statistics, this behave should be available in Postgres.

Sure, this function should not be used for functional indexes.
 

Also nothing terrible will happen in any case. If expression is calling function which is marked is immutable but actually is not,  then we can just get old (deteriorated)
result of expression. Right now, if caller function (one containing evaluated expression) is marked as non-volatile, then snapshot is also not taken.
So if such function expression is calling foo() function as declared above, then results will be also incorrect.
So I do not think some principle difference here and do not understand why we should not believe user (function creator) only in this case.


 


-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: Why overhead of SPI is so large?

От
Konstantin Knizhnik
Дата:


On 23.08.2019 14:42, Pavel Stehule wrote:

In reality it is not IMMUTABLE function. On second hand, there are lot of application that depends on this behave.

It is well know trick how to reduce estimation errors related to JOINs. When immutable function has constant parameters, then it is evaluated in planning time.

So sometimes was necessary to use

SELECT ... FROM tab WHERE foreign_key = immutable_function('constant parameter')

instead JOIN.

It is ugly, but it is working perfectly. I think so until we will have multi table statistics, this behave should be available in Postgres.

Sure, this function should not be used for functional indexes.
 

What about the following version of the patch?



-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 
Вложения

Re: Why overhead of SPI is so large?

От
David Fetter
Дата:
On Fri, Aug 23, 2019 at 11:10:28AM +0200, Pavel Stehule wrote:
> pá 23. 8. 2019 v 11:05 odesílatel Konstantin Knizhnik <
> k.knizhnik@postgrespro.ru> napsal:
> 
> >
> >
> > On 22.08.2019 18:56, Pavel Stehule wrote:
> >
> >
> >
> > čt 22. 8. 2019 v 17:51 odesílatel Konstantin Knizhnik <
> > k.knizhnik@postgrespro.ru> napsal:
> >
> >> Some more information...
> >> First of all I found out that marking PL/pgSQL function as immutable
> >> significantly increase speed of its execution:
> >> 19808 ms vs. 27594. It happens because exec_eval_simple_expr is taken
> >> snapshot if function is volatile (default).
> >> I wonder if PL/pgSQL compiler can detect that evaluated expression itself
> >> is actually immutable  and there is no need to take snapshot
> >> for each invocation of this function. Also I have tried yet another PL
> >> language - JavaScript, which is now new outsider, despite to the fact that
> >> v8 JIT compiler is very good.
> >>
> >
> > I have a plan to do some work in this direction. Snapshot is not necessary
> > for almost buildin functions. If expr calls only buildin functions, then
> > probably can be called without snapshot and without any work with plan
> > cache.
> >
> >
> > I wonder if the following simple patch is correct?
> >
> 
> You cannot to believe to user defined functions so immutable flag is
> correct. Only buildin functions are 100% correct.
> 
> CREATE OR REPLACE FUNCTION foo()
> RETURNS int AS $$
> SELECT count(*) FROM pg_class;
> $$ LANGUAGE sql IMMUTABLE;
> 
> is working.

No, it's lying to the RDBMS, so it's pilot error. The problem of
determining from the function itself whether it is in fact immutable
is, in general, equivalent to the Halting Problem, so no, we can't
figure it out. We do need to trust our users not to lie to us, and we
do not need to protect them from the consequences when they do.

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: Why overhead of SPI is so large?

От
Pavel Stehule
Дата:


so 24. 8. 2019 v 18:01 odesílatel David Fetter <david@fetter.org> napsal:
On Fri, Aug 23, 2019 at 11:10:28AM +0200, Pavel Stehule wrote:
> pá 23. 8. 2019 v 11:05 odesílatel Konstantin Knizhnik <
> k.knizhnik@postgrespro.ru> napsal:
>
> >
> >
> > On 22.08.2019 18:56, Pavel Stehule wrote:
> >
> >
> >
> > čt 22. 8. 2019 v 17:51 odesílatel Konstantin Knizhnik <
> > k.knizhnik@postgrespro.ru> napsal:
> >
> >> Some more information...
> >> First of all I found out that marking PL/pgSQL function as immutable
> >> significantly increase speed of its execution:
> >> 19808 ms vs. 27594. It happens because exec_eval_simple_expr is taken
> >> snapshot if function is volatile (default).
> >> I wonder if PL/pgSQL compiler can detect that evaluated expression itself
> >> is actually immutable  and there is no need to take snapshot
> >> for each invocation of this function. Also I have tried yet another PL
> >> language - JavaScript, which is now new outsider, despite to the fact that
> >> v8 JIT compiler is very good.
> >>
> >
> > I have a plan to do some work in this direction. Snapshot is not necessary
> > for almost buildin functions. If expr calls only buildin functions, then
> > probably can be called without snapshot and without any work with plan
> > cache.
> >
> >
> > I wonder if the following simple patch is correct?
> >
>
> You cannot to believe to user defined functions so immutable flag is
> correct. Only buildin functions are 100% correct.
>
> CREATE OR REPLACE FUNCTION foo()
> RETURNS int AS $$
> SELECT count(*) FROM pg_class;
> $$ LANGUAGE sql IMMUTABLE;
>
> is working.

No, it's lying to the RDBMS, so it's pilot error. The problem of
determining from the function itself whether it is in fact immutable
is, in general, equivalent to the Halting Problem, so no, we can't
figure it out. We do need to trust our users not to lie to us, and we
do not need to protect them from the consequences when they do.

I have not any problem with fixing this behave when there will be any alternative.

I can imagine new special flag that can be used for STABLE functions, that enforce one shot plans and can be optimized similar like IMMUTABLE functions now - using result in planning time.

The users lie because they must - there is not a alternative. There is not any other solution - and estimation errors related to a joins are fundamental issue.

Regards

Pavel





Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: Why overhead of SPI is so large?

От
Robert Haas
Дата:
On Sat, Aug 24, 2019 at 12:01 PM David Fetter <david@fetter.org> wrote:
> No, it's lying to the RDBMS, so it's pilot error. The problem of
> determining from the function itself whether it is in fact immutable
> is, in general, equivalent to the Halting Problem, so no, we can't
> figure it out. We do need to trust our users not to lie to us, and we
> do not need to protect them from the consequences when they do.

Depends.  I don't mind if mislabeling a function leads to "wrong"
query results, but I don't think it's OK for it to, say, crash the
server.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Why overhead of SPI is so large?

От
Konstantin Knizhnik
Дата:


On 24.08.2019 19:13, Pavel Stehule wrote:


so 24. 8. 2019 v 18:01 odesílatel David Fetter <david@fetter.org> napsal:
On Fri, Aug 23, 2019 at 11:10:28AM +0200, Pavel Stehule wrote:
> pá 23. 8. 2019 v 11:05 odesílatel Konstantin Knizhnik <
> k.knizhnik@postgrespro.ru> napsal:
>
> >
> >
> > On 22.08.2019 18:56, Pavel Stehule wrote:
> >
> >
> >
> > čt 22. 8. 2019 v 17:51 odesílatel Konstantin Knizhnik <
> > k.knizhnik@postgrespro.ru> napsal:
> >
> >> Some more information...
> >> First of all I found out that marking PL/pgSQL function as immutable
> >> significantly increase speed of its execution:
> >> 19808 ms vs. 27594. It happens because exec_eval_simple_expr is taken
> >> snapshot if function is volatile (default).
> >> I wonder if PL/pgSQL compiler can detect that evaluated expression itself
> >> is actually immutable  and there is no need to take snapshot
> >> for each invocation of this function. Also I have tried yet another PL
> >> language - JavaScript, which is now new outsider, despite to the fact that
> >> v8 JIT compiler is very good.
> >>
> >
> > I have a plan to do some work in this direction. Snapshot is not necessary
> > for almost buildin functions. If expr calls only buildin functions, then
> > probably can be called without snapshot and without any work with plan
> > cache.
> >
> >
> > I wonder if the following simple patch is correct?
> >
>
> You cannot to believe to user defined functions so immutable flag is
> correct. Only buildin functions are 100% correct.
>
> CREATE OR REPLACE FUNCTION foo()
> RETURNS int AS $$
> SELECT count(*) FROM pg_class;
> $$ LANGUAGE sql IMMUTABLE;
>
> is working.

No, it's lying to the RDBMS, so it's pilot error. The problem of
determining from the function itself whether it is in fact immutable
is, in general, equivalent to the Halting Problem, so no, we can't
figure it out. We do need to trust our users not to lie to us, and we
do not need to protect them from the consequences when they do.

I have not any problem with fixing this behave when there will be any alternative.

I can imagine new special flag that can be used for STABLE functions, that enforce one shot plans and can be optimized similar like IMMUTABLE functions now - using result in planning time.

The users lie because they must - there is not a alternative. There is not any other solution - and estimation errors related to a joins are fundamental issue.


Pavel, I wonder if I can put my patch (with fix which performs this optimization only for built-in functions) to commitfest or you prefer to do it yourself in some other way and propose your own solution?



Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: Why overhead of SPI is so large?

От
Pavel Stehule
Дата:


pá 13. 9. 2019 v 9:09 odesílatel Konstantin Knizhnik <k.knizhnik@postgrespro.ru> napsal:


On 24.08.2019 19:13, Pavel Stehule wrote:


so 24. 8. 2019 v 18:01 odesílatel David Fetter <david@fetter.org> napsal:
On Fri, Aug 23, 2019 at 11:10:28AM +0200, Pavel Stehule wrote:
> pá 23. 8. 2019 v 11:05 odesílatel Konstantin Knizhnik <
> k.knizhnik@postgrespro.ru> napsal:
>
> >
> >
> > On 22.08.2019 18:56, Pavel Stehule wrote:
> >
> >
> >
> > čt 22. 8. 2019 v 17:51 odesílatel Konstantin Knizhnik <
> > k.knizhnik@postgrespro.ru> napsal:
> >
> >> Some more information...
> >> First of all I found out that marking PL/pgSQL function as immutable
> >> significantly increase speed of its execution:
> >> 19808 ms vs. 27594. It happens because exec_eval_simple_expr is taken
> >> snapshot if function is volatile (default).
> >> I wonder if PL/pgSQL compiler can detect that evaluated expression itself
> >> is actually immutable  and there is no need to take snapshot
> >> for each invocation of this function. Also I have tried yet another PL
> >> language - JavaScript, which is now new outsider, despite to the fact that
> >> v8 JIT compiler is very good.
> >>
> >
> > I have a plan to do some work in this direction. Snapshot is not necessary
> > for almost buildin functions. If expr calls only buildin functions, then
> > probably can be called without snapshot and without any work with plan
> > cache.
> >
> >
> > I wonder if the following simple patch is correct?
> >
>
> You cannot to believe to user defined functions so immutable flag is
> correct. Only buildin functions are 100% correct.
>
> CREATE OR REPLACE FUNCTION foo()
> RETURNS int AS $$
> SELECT count(*) FROM pg_class;
> $$ LANGUAGE sql IMMUTABLE;
>
> is working.

No, it's lying to the RDBMS, so it's pilot error. The problem of
determining from the function itself whether it is in fact immutable
is, in general, equivalent to the Halting Problem, so no, we can't
figure it out. We do need to trust our users not to lie to us, and we
do not need to protect them from the consequences when they do.

I have not any problem with fixing this behave when there will be any alternative.

I can imagine new special flag that can be used for STABLE functions, that enforce one shot plans and can be optimized similar like IMMUTABLE functions now - using result in planning time.

The users lie because they must - there is not a alternative. There is not any other solution - and estimation errors related to a joins are fundamental issue.


Pavel, I wonder if I can put my patch (with fix which performs this optimization only for built-in functions) to commitfest or you prefer to do it yourself in some other way and propose your own solution?

I think so your patch is good enough for commitfest.

It doesn't remove all overhead - I think so there is lot of overhead related to plan cache, but it in good direction.

Probably for these expressions is our final target using a cached JIT - but nobody knows when it will be. I'll not have to time for my experiments before October.





Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: Why overhead of SPI is so large?

От
Konstantin Knizhnik
Дата:


On 13.09.2019 10:16, Pavel Stehule wrote:


pá 13. 9. 2019 v 9:09 odesílatel Konstantin Knizhnik <k.knizhnik@postgrespro.ru> napsal:


On 24.08.2019 19:13, Pavel Stehule wrote:


so 24. 8. 2019 v 18:01 odesílatel David Fetter <david@fetter.org> napsal:
On Fri, Aug 23, 2019 at 11:10:28AM +0200, Pavel Stehule wrote:
> pá 23. 8. 2019 v 11:05 odesílatel Konstantin Knizhnik <
> k.knizhnik@postgrespro.ru> napsal:
>
> >
> >
> > On 22.08.2019 18:56, Pavel Stehule wrote:
> >
> >
> >
> > čt 22. 8. 2019 v 17:51 odesílatel Konstantin Knizhnik <
> > k.knizhnik@postgrespro.ru> napsal:
> >
> >> Some more information...
> >> First of all I found out that marking PL/pgSQL function as immutable
> >> significantly increase speed of its execution:
> >> 19808 ms vs. 27594. It happens because exec_eval_simple_expr is taken
> >> snapshot if function is volatile (default).
> >> I wonder if PL/pgSQL compiler can detect that evaluated expression itself
> >> is actually immutable  and there is no need to take snapshot
> >> for each invocation of this function. Also I have tried yet another PL
> >> language - JavaScript, which is now new outsider, despite to the fact that
> >> v8 JIT compiler is very good.
> >>
> >
> > I have a plan to do some work in this direction. Snapshot is not necessary
> > for almost buildin functions. If expr calls only buildin functions, then
> > probably can be called without snapshot and without any work with plan
> > cache.
> >
> >
> > I wonder if the following simple patch is correct?
> >
>
> You cannot to believe to user defined functions so immutable flag is
> correct. Only buildin functions are 100% correct.
>
> CREATE OR REPLACE FUNCTION foo()
> RETURNS int AS $$
> SELECT count(*) FROM pg_class;
> $$ LANGUAGE sql IMMUTABLE;
>
> is working.

No, it's lying to the RDBMS, so it's pilot error. The problem of
determining from the function itself whether it is in fact immutable
is, in general, equivalent to the Halting Problem, so no, we can't
figure it out. We do need to trust our users not to lie to us, and we
do not need to protect them from the consequences when they do.

I have not any problem with fixing this behave when there will be any alternative.

I can imagine new special flag that can be used for STABLE functions, that enforce one shot plans and can be optimized similar like IMMUTABLE functions now - using result in planning time.

The users lie because they must - there is not a alternative. There is not any other solution - and estimation errors related to a joins are fundamental issue.


Pavel, I wonder if I can put my patch (with fix which performs this optimization only for built-in functions) to commitfest or you prefer to do it yourself in some other way and propose your own solution?

I think so your patch is good enough for commitfest.

It doesn't remove all overhead - I think so there is lot of overhead related to plan cache, but it in good direction.

Probably for these expressions is our final target using a cached JIT - but nobody knows when it will be. I'll not have to time for my experiments before October.


This is profile of execution of PL/pgSQL function with my patch:

   5.39%  postgres  plpgsql.so         [.] exec_assign_value
   5.10%  postgres  postgres           [.] ExecInterpExpr
   4.70%  postgres  postgres           [.] tts_buffer_heap_getsomeattrs
   4.56%  postgres  plpgsql.so         [.] exec_move_row_from_fields
   3.87%  postgres  postgres           [.] ExecScan
   3.74%  postgres  plpgsql.so         [.] exec_eval_expr
   3.64%  postgres  postgres           [.] heap_form_tuple
   3.13%  postgres  postgres           [.] heap_fill_tuple
   3.07%  postgres  postgres           [.] heapgettup_pagemode
   2.95%  postgres  postgres           [.] heap_deform_tuple
   2.92%  postgres  plpgsql.so         [.] plpgsql_param_eval_var
   2.64%  postgres  postgres           [.] HeapTupleSatisfiesVisibility
   2.61%  postgres  postgres           [.] AcquirePlannerLocks
   2.58%  postgres  postgres           [.] AcquireExecutorLocks
   2.43%  postgres  postgres           [.] GetCachedPlan
   2.26%  postgres  plpgsql.so         [.] exec_stmt
   2.23%  postgres  plpgsql.so         [.] exec_cast_value
   1.89%  postgres  postgres           [.] AllocSetAlloc
   1.75%  postgres  postgres           [.] palloc0
   1.73%  postgres  plpgsql.so         [.] exec_move_row
   1.73%  postgres  postgres           [.] OverrideSearchPathMatchesCurrent
   1.69%  postgres  plpgsql.so         [.] assign_simple_var
   1.63%  postgres  postgres           [.] heap_getnextslot
   1.60%  postgres  postgres           [.] SPI_plan_get_cached_plan
   1.55%  postgres  postgres           [.] heapgetpage
   1.47%  postgres  postgres           [.] heap_compute_data_size
   1.46%  postgres  postgres           [.] spi_printtup
   1.43%  postgres  postgres           [.] float8mul
   1.37%  postgres  postgres           [.] RevalidateCachedQuery
   1.36%  postgres  postgres           [.] standard_ExecutorRun
   1.35%  postgres  postgres           [.] recomputeNamespacePath
   1.28%  postgres  postgres           [.] ExecStoreBufferHeapTuple
   1.25%  postgres  postgres           [.] MemoryContextReset
   1.22%  postgres  plpgsql.so         [.] exec_eval_cleanup.isra.18
   1.20%  postgres  plpgsql.so         [.] exec_assign_expr
   1.05%  postgres  postgres           [.] SeqNext
   1.04%  postgres  postgres           [.] ResourceArrayRemove
   1.00%  postgres  postgres           [.] ScanQueryForLocks


Based on this profile it seems to me that plan cache overhead is relatively small:

2.43%+1.60%+1.37% < 6%

But from the other side ExecInterpExpr itself takes also about 5%.
I do not completely understand why JIT is not currently used for evaluation of SPI expressions
(why we call ExecInterpExpr and do not try  to compile this expression even if JIT is enabled).
But event if we do it and improve speed of expression evaluation 10 or more time, looks like
that effect on total query execution time will be also negligible (5%).

Most of the time is spent in pl_exec code, heap traversal , unpacking and copying tuple data.
Looks like it can not be easily optimized and requires serious rewriting of PL/pgSQL stuff.
 
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: Why overhead of SPI is so large?

От
Pavel Stehule
Дата:


Hi

I testing very simple function

create or replace function f1(int) returns int as $$ declare i int = 0; begin while i < $1 loop i = i + 1; end loop; return i; end $$ language plpgsql immutable;

profile - when function is marked as immutable

   8,65%  postgres    [.] ExecInterpExpr                                                                                              ▒
   8,59%  postgres    [.] AcquireExecutorLocks                                                                                        ▒
   6,95%  postgres    [.] OverrideSearchPathMatchesCurrent                                                                            ▒
   5,72%  plpgsql.so  [.] plpgsql_param_eval_var                                                                                      ▒
   5,15%  postgres    [.] AcquirePlannerLocks                                                                                         ▒
   4,54%  postgres    [.] RevalidateCachedQuery                                                                                       ▒
   4,52%  postgres    [.] GetCachedPlan                                                                                               ▒
   3,82%  postgres    [.] ResourceArrayRemove                                                                                         ▒
   2,87%  postgres    [.] SPI_plan_get_cached_plan                                                                                    ▒
   2,80%  plpgsql.so  [.] exec_eval_expr                                                                                              ▒
   2,70%  plpgsql.so  [.] exec_assign_value                                                                                           ▒
   2,55%  plpgsql.so  [.] exec_stmt                                                                                                   ▒
   2,53%  postgres    [.] recomputeNamespacePath                                                                                      ▒
   2,39%  plpgsql.so  [.] exec_cast_value                                                                                             ▒
   2,19%  postgres    [.] int4pl                                                                                                      ▒
   2,13%  postgres    [.] int4lt                                                                                                      ▒
   1,98%  postgres    [.] CheckCachedPlan    

volatile

   7,21%  postgres      [.] GetSnapshotData
   6,92%  plpgsql.so    [.] exec_eval_simple_expr
   5,79%  postgres      [.] AcquireExecutorLocks
   5,57%  postgres      [.] ExecInterpExpr
   4,12%  postgres      [.] LWLockRelease
   3,68%  postgres      [.] OverrideSearchPathMatchesCurrent
   3,64%  postgres      [.] PopActiveSnapshot
   3,36%  plpgsql.so    [.] plpgsql_param_eval_var
   3,31%  postgres      [.] LWLockAttemptLock
   3,13%  postgres      [.] AllocSetAlloc
   2,91%  postgres      [.] GetCachedPlan
   2,79%  postgres      [.] MemoryContextAlloc
   2,76%  postgres      [.] AcquirePlannerLocks
   2,70%  postgres      [.] ResourceArrayRemove
   2,45%  postgres      [.] PushActiveSnapshot
   2,44%  postgres      [.] RevalidateCachedQuery
   2,29%  postgres      [.] SPI_plan_get_cached_plan
   2,18%  postgres      [.] CopySnapshot
   1,95%  postgres      [.] AllocSetFree
   1,81%  postgres      [.] LWLockAcquire
   1,71%  plpgsql.so    [.] exec_assign_value
   1,61%  plpgsql.so    [.] exec_stmt
   1,59%  plpgsql.so    [.] exec_eval_expr
   1,48%  postgres      [.] int4pl
   1,48%  postgres      [.] CheckCachedPlan
   1,40%  plpgsql.so    [.] exec_cast_value
   1,39%  postgres      [.] int4lt
   1,38%  postgres      [.] recomputeNamespacePath
   1,25%  plpgsql.so    [.] exec_eval_cleanup
   1,08%  postgres      [.] ScanQueryForLocks
   1,01%  plpgsql.so    [.] exec_eval_boolean
   1,00%  postgres      [.] pfree

For tested function almost all CPU should be used for int4pl and int4lt functions - but there are used only 4% together. I think so almost all of

   8,59%  postgres    [.] AcquireExecutorLocks                                                                                        ▒
   6,95%  postgres    [.] OverrideSearchPathMatchesCurrent                                                                            ▒
   5,72%  plpgsql.so  [.] plpgsql_param_eval_var                                                                                      ▒
   5,15%  postgres    [.] AcquirePlannerLocks                                                                                         ▒
   4,54%  postgres    [.] RevalidateCachedQuery                                                                                       ▒
   4,52%  postgres    [.] GetCachedPlan                                                                                               ▒
   3,82%  postgres    [.] ResourceArrayRemove                                                                                         ▒
   2,87%  postgres    [.] SPI_plan_get_cached_plan                                                                                    ▒
   2,53%  postgres    [.] recomputeNamespacePath                                                                                      ▒

can be reduced if we know so we should to call just builtin immutable V1 functions.

My example is a extrem - when you use any embedded SQL, then the profile will be significantly changed. But for some cases there can be nice some significant speedup of expressions only functions (like PostGIS)

Regards

Pavel

Re: Why overhead of SPI is so large?

От
Pavel Stehule
Дата:
Hi

pá 23. 8. 2019 v 16:32 odesílatel Konstantin Knizhnik <k.knizhnik@postgrespro.ru> napsal:


On 23.08.2019 14:42, Pavel Stehule wrote:

In reality it is not IMMUTABLE function. On second hand, there are lot of application that depends on this behave.

It is well know trick how to reduce estimation errors related to JOINs. When immutable function has constant parameters, then it is evaluated in planning time.

So sometimes was necessary to use

SELECT ... FROM tab WHERE foreign_key = immutable_function('constant parameter')

instead JOIN.

It is ugly, but it is working perfectly. I think so until we will have multi table statistics, this behave should be available in Postgres.

Sure, this function should not be used for functional indexes.
 

What about the following version of the patch?

I am sending review of this small patch.

This small patch reduce a overhead of usage buildin immutable functions in volatile functions with simple trick. Starts snapshot only when it is necessary.

In decrease runtime time about 25 % on this small example.

do $$
declare i int;
begin
  i := 0;
  while i < 10000000
  loop
    i := i + 1;
  end loop;
end;
$$;

If there are more expressions, then speedup can be more interesting. If there are other bottlenecks, then the speedup will be less. 25% is not bad, so we want to this feature.

I believe so similar method can be used more aggressively with more significant performance benefit, but this is low hanging fruit and isn't reason to wait for future.

This patch doesn't introduce any new feature, so new tests and new doc is not necessary.
The patch is readable, well  formatted, only comments are too long. I fixed it.
All tests passed
I fixed few warnings, and I reformated little bit function expr_needs_snapshot to use if instead case, what is more usual in these cases.

I think so this code can be marked as ready for commit

Regards

Pavel




-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 
Вложения

Re: Why overhead of SPI is so large?

От
Kyotaro Horiguchi
Дата:
Hello.

At Tue, 5 Nov 2019 22:14:40 +0100, Pavel Stehule <pavel.stehule@gmail.com> wrote in
> Hi
>
> pá 23. 8. 2019 v 16:32 odesílatel Konstantin Knizhnik <
> k.knizhnik@postgrespro.ru> napsal:
>
> >
> >
> > On 23.08.2019 14:42, Pavel Stehule wrote:
> >
> >
> > In reality it is not IMMUTABLE function. On second hand, there are lot of
> > application that depends on this behave.
> >
> > It is well know trick how to reduce estimation errors related to JOINs.
> > When immutable function has constant parameters, then it is evaluated in
> > planning time.
> >
> > So sometimes was necessary to use
> >
> > SELECT ... FROM tab WHERE foreign_key = immutable_function('constant
> > parameter')
> >
> > instead JOIN.
> >
> > It is ugly, but it is working perfectly. I think so until we will have
> > multi table statistics, this behave should be available in Postgres.
> >
> > Sure, this function should not be used for functional indexes.
> >
> >
> >
> > What about the following version of the patch?
> >
>
> I am sending review of this small patch.
>
> This small patch reduce a overhead of usage buildin immutable functions in
> volatile functions with simple trick. Starts snapshot only when it is
> necessary.
>
> In decrease runtime time about 25 % on this small example.
>
> do $$
> declare i int;
> begin
>   i := 0;
>   while i < 10000000
>   loop
>     i := i + 1;
>   end loop;
> end;
> $$;
>
> If there are more expressions, then speedup can be more interesting. If
> there are other bottlenecks, then the speedup will be less. 25% is not bad,
> so we want to this feature.
>
> I believe so similar method can be used more aggressively with more
> significant performance benefit, but this is low hanging fruit and isn't
> reason to wait for future.
>
> This patch doesn't introduce any new feature, so new tests and new doc is
> not necessary.
> The patch is readable, well  formatted, only comments are too long. I fixed
> it.
> All tests passed
> I fixed few warnings, and I reformated little bit function
> expr_needs_snapshot to use if instead case, what is more usual in these
> cases.
>
> I think so this code can be marked as ready for commit

I have some comments on this.

expr_needs_snapshot checks out some of the node already checked out in
exec_simple_check_plan but not all. However I don't see the criteria
of the choice.

I might be too worrying, but maybe we should write the function in
white-listed way, that is, expr_needs_snapshot returns false only if
the whole tree consists of only the node types known to the
function. And any unknown node makes the function return true
immediately.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Why overhead of SPI is so large?

От
Pavel Stehule
Дата:


čt 7. 11. 2019 v 13:03 odesílatel Kyotaro Horiguchi <horikyota.ntt@gmail.com> napsal:
Hello.

At Tue, 5 Nov 2019 22:14:40 +0100, Pavel Stehule <pavel.stehule@gmail.com> wrote in
> Hi
>
> pá 23. 8. 2019 v 16:32 odesílatel Konstantin Knizhnik <
> k.knizhnik@postgrespro.ru> napsal:
>
> >
> >
> > On 23.08.2019 14:42, Pavel Stehule wrote:
> >
> >
> > In reality it is not IMMUTABLE function. On second hand, there are lot of
> > application that depends on this behave.
> >
> > It is well know trick how to reduce estimation errors related to JOINs.
> > When immutable function has constant parameters, then it is evaluated in
> > planning time.
> >
> > So sometimes was necessary to use
> >
> > SELECT ... FROM tab WHERE foreign_key = immutable_function('constant
> > parameter')
> >
> > instead JOIN.
> >
> > It is ugly, but it is working perfectly. I think so until we will have
> > multi table statistics, this behave should be available in Postgres.
> >
> > Sure, this function should not be used for functional indexes.
> >
> >
> >
> > What about the following version of the patch?
> >
>
> I am sending review of this small patch.
>
> This small patch reduce a overhead of usage buildin immutable functions in
> volatile functions with simple trick. Starts snapshot only when it is
> necessary.
>
> In decrease runtime time about 25 % on this small example.
>
> do $$
> declare i int;
> begin
>   i := 0;
>   while i < 10000000
>   loop
>     i := i + 1;
>   end loop;
> end;
> $$;
>
> If there are more expressions, then speedup can be more interesting. If
> there are other bottlenecks, then the speedup will be less. 25% is not bad,
> so we want to this feature.
>
> I believe so similar method can be used more aggressively with more
> significant performance benefit, but this is low hanging fruit and isn't
> reason to wait for future.
>
> This patch doesn't introduce any new feature, so new tests and new doc is
> not necessary.
> The patch is readable, well  formatted, only comments are too long. I fixed
> it.
> All tests passed
> I fixed few warnings, and I reformated little bit function
> expr_needs_snapshot to use if instead case, what is more usual in these
> cases.
>
> I think so this code can be marked as ready for commit

I have some comments on this.

expr_needs_snapshot checks out some of the node already checked out in
exec_simple_check_plan but not all. However I don't see the criteria
of the choice.

I might be too worrying, but maybe we should write the function in
white-listed way, that is, expr_needs_snapshot returns false only if
the whole tree consists of only the node types known to the
function. And any unknown node makes the function return true
immediately.

has sense

Pavel


regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Re: Why overhead of SPI is so large?

От
Konstantin Knizhnik
Дата:


On 07.11.2019 15:09, Pavel Stehule wrote:


čt 7. 11. 2019 v 13:03 odesílatel Kyotaro Horiguchi <horikyota.ntt@gmail.com> napsal:
Hello.

At Tue, 5 Nov 2019 22:14:40 +0100, Pavel Stehule <pavel.stehule@gmail.com> wrote in
> Hi
>
> pá 23. 8. 2019 v 16:32 odesílatel Konstantin Knizhnik <
> k.knizhnik@postgrespro.ru> napsal:
>
> >
> >
> > On 23.08.2019 14:42, Pavel Stehule wrote:
> >
> >
> > In reality it is not IMMUTABLE function. On second hand, there are lot of
> > application that depends on this behave.
> >
> > It is well know trick how to reduce estimation errors related to JOINs.
> > When immutable function has constant parameters, then it is evaluated in
> > planning time.
> >
> > So sometimes was necessary to use
> >
> > SELECT ... FROM tab WHERE foreign_key = immutable_function('constant
> > parameter')
> >
> > instead JOIN.
> >
> > It is ugly, but it is working perfectly. I think so until we will have
> > multi table statistics, this behave should be available in Postgres.
> >
> > Sure, this function should not be used for functional indexes.
> >
> >
> >
> > What about the following version of the patch?
> >
>
> I am sending review of this small patch.
>
> This small patch reduce a overhead of usage buildin immutable functions in
> volatile functions with simple trick. Starts snapshot only when it is
> necessary.
>
> In decrease runtime time about 25 % on this small example.
>
> do $$
> declare i int;
> begin
>   i := 0;
>   while i < 10000000
>   loop
>     i := i + 1;
>   end loop;
> end;
> $$;
>
> If there are more expressions, then speedup can be more interesting. If
> there are other bottlenecks, then the speedup will be less. 25% is not bad,
> so we want to this feature.
>
> I believe so similar method can be used more aggressively with more
> significant performance benefit, but this is low hanging fruit and isn't
> reason to wait for future.
>
> This patch doesn't introduce any new feature, so new tests and new doc is
> not necessary.
> The patch is readable, well  formatted, only comments are too long. I fixed
> it.
> All tests passed
> I fixed few warnings, and I reformated little bit function
> expr_needs_snapshot to use if instead case, what is more usual in these
> cases.
>
> I think so this code can be marked as ready for commit

I have some comments on this.

expr_needs_snapshot checks out some of the node already checked out in
exec_simple_check_plan but not all. However I don't see the criteria
of the choice.

I might be too worrying, but maybe we should write the function in
white-listed way, that is, expr_needs_snapshot returns false only if
the whole tree consists of only the node types known to the
function. And any unknown node makes the function return true
immediately.

has sense

There are  62 cases handled by expression_tree_walker.
I have inspected all of them and considered only these 6 to be "dangerous": intentionally require snapshot.
FuncExpr, OpExpr, Query, SubPlan, AlternativeSubPlan, SubLink.

So if I use "white-list" approach, I have to enumerate all other 62-6=56 cases in expr_needs_snapshot function.
Frankly speaking I do not this that it is good idea. New nodes are rarely added to the Postgres and expression_tree_walker
in any case has to be changed to handle this new nodes. There are many other places where tree walker is used to check presence (or absence)
of some properties in the tree. And none is this places assume that some newly introduced node may require special handling of this property check.

In any case, if you think that explicit enumerations of this 56 cases (or some subset of them) is good idea, then I will do it.
If you have concerns about some particular nodes, I can add them to  "black list".


-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: Why overhead of SPI is so large?

От
Pavel Stehule
Дата:


pá 8. 11. 2019 v 14:31 odesílatel Konstantin Knizhnik <k.knizhnik@postgrespro.ru> napsal:


On 07.11.2019 15:09, Pavel Stehule wrote:


čt 7. 11. 2019 v 13:03 odesílatel Kyotaro Horiguchi <horikyota.ntt@gmail.com> napsal:
Hello.

At Tue, 5 Nov 2019 22:14:40 +0100, Pavel Stehule <pavel.stehule@gmail.com> wrote in
> Hi
>
> pá 23. 8. 2019 v 16:32 odesílatel Konstantin Knizhnik <
> k.knizhnik@postgrespro.ru> napsal:
>
> >
> >
> > On 23.08.2019 14:42, Pavel Stehule wrote:
> >
> >
> > In reality it is not IMMUTABLE function. On second hand, there are lot of
> > application that depends on this behave.
> >
> > It is well know trick how to reduce estimation errors related to JOINs.
> > When immutable function has constant parameters, then it is evaluated in
> > planning time.
> >
> > So sometimes was necessary to use
> >
> > SELECT ... FROM tab WHERE foreign_key = immutable_function('constant
> > parameter')
> >
> > instead JOIN.
> >
> > It is ugly, but it is working perfectly. I think so until we will have
> > multi table statistics, this behave should be available in Postgres.
> >
> > Sure, this function should not be used for functional indexes.
> >
> >
> >
> > What about the following version of the patch?
> >
>
> I am sending review of this small patch.
>
> This small patch reduce a overhead of usage buildin immutable functions in
> volatile functions with simple trick. Starts snapshot only when it is
> necessary.
>
> In decrease runtime time about 25 % on this small example.
>
> do $$
> declare i int;
> begin
>   i := 0;
>   while i < 10000000
>   loop
>     i := i + 1;
>   end loop;
> end;
> $$;
>
> If there are more expressions, then speedup can be more interesting. If
> there are other bottlenecks, then the speedup will be less. 25% is not bad,
> so we want to this feature.
>
> I believe so similar method can be used more aggressively with more
> significant performance benefit, but this is low hanging fruit and isn't
> reason to wait for future.
>
> This patch doesn't introduce any new feature, so new tests and new doc is
> not necessary.
> The patch is readable, well  formatted, only comments are too long. I fixed
> it.
> All tests passed
> I fixed few warnings, and I reformated little bit function
> expr_needs_snapshot to use if instead case, what is more usual in these
> cases.
>
> I think so this code can be marked as ready for commit

I have some comments on this.

expr_needs_snapshot checks out some of the node already checked out in
exec_simple_check_plan but not all. However I don't see the criteria
of the choice.

I might be too worrying, but maybe we should write the function in
white-listed way, that is, expr_needs_snapshot returns false only if
the whole tree consists of only the node types known to the
function. And any unknown node makes the function return true
immediately.

has sense

There are  62 cases handled by expression_tree_walker.
I have inspected all of them and considered only these 6 to be "dangerous": intentionally require snapshot.
FuncExpr, OpExpr, Query, SubPlan, AlternativeSubPlan, SubLink.

So if I use "white-list" approach, I have to enumerate all other 62-6=56 cases in expr_needs_snapshot function.
Frankly speaking I do not this that it is good idea. New nodes are rarely added to the Postgres and expression_tree_walker
in any case has to be changed to handle this new nodes. There are many other places where tree walker is used to check presence (or absence)
of some properties in the tree. And none is this places assume that some newly introduced node may require special handling of this property check.

In any case, if you think that explicit enumerations of this 56 cases (or some subset of them) is good idea, then I will do it.
If you have concerns about some particular nodes, I can add them to  "black list".

Is question how to implement this feature safely. Isn't possible to use same check for functional indexes?




-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: Why overhead of SPI is so large?

От
Tom Lane
Дата:
Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes:
>> čt 7. 11. 2019 v 13:03 odesílatel Kyotaro Horiguchi 
>> <horikyota.ntt@gmail.com <mailto:horikyota.ntt@gmail.com>> napsal:
>>> I might be too worrying, but maybe we should write the function in
>>> white-listed way, that is, expr_needs_snapshot returns false only if
>>> the whole tree consists of only the node types known to the
>>> function. And any unknown node makes the function return true
>>> immediately.

> There are  62 cases handled by expression_tree_walker.
> I have inspected all of them and considered only these 6 to be 
> "dangerous": intentionally require snapshot.
> FuncExpr, OpExpr, Query, SubPlan, AlternativeSubPlan, SubLink.

This is false on its face.  For example, considering OpExpr but not
its aliases DistinctExpr or NullIfExpr is just obviously wrong.
ScalarArrayOpExpr is another node that might contain a user-defined
function.  CoerceViaIO calls I/O functions that might easily not be
immutable.  RowCompareExpr calls comparison functions that might
not be either (they are btree comparison functions, but they could
be cross-type comparisons, and we only require those to be stable).
CoerceToDomain could invoke just about anything at all.  Need I
go on?

> Frankly speaking I do not this that it is good idea. New nodes are 
> rarely added to the Postgres and expression_tree_walker
> in any case has to be changed to handle this new nodes. There are many 
> other places where tree walker is used to check presence (or absence)
> of some properties in the tree. And none is this places assume that some 
> newly introduced node may require special handling of this property check.

None of those statements are true, in my experience.

In general, this patch seems like it's learned nothing from our
experiences with the late and unlamented exec_simple_check_node()
(cf commit 00418c612).  Having a piece of plpgsql that has to know
about all possible "simple expression" node types is just a major
maintenance headache; but there is no short-cut solution that is
really going to be acceptable.  Either you're unsafe, or you fail
to optimize cases that users will complain about, or you write
and maintain a lot of code.

I'm also pretty displeased with the is-it-in-the-pg_catalog-schema
tests.  Those are expensive, requiring additional catalog lookups,
and they prove just about nothing.  There are extensions that shove
stuff into pg_catalog (look no further than contrib/adminpack), or
users could do it, so you really still are relying on the whole world
to get immutability right.  If we're going to not trust immutability
markings on user-defined objects, I'd be inclined to do it by
checking that the object's OID is less than FirstGenbkiObjectId.

But maybe we should just forget that bit of paranoia and rely solely
on contain_mutable_functions().  That gets rid of the new maintenance
requirement, and I think arguably it's OK.  An "immutable" function
whose result depends on changes that were just made by the calling
function is pretty obviously mislabeled, so people won't have much of
a leg to stand on if they complain about that.  Pavel's argument
upthread that people sometimes intentionally mislabel functions as
immutable isn't really relevant: in his example, at least, the user
is intentionally trying to get the function to be evaluated early.
So whether it sees the effects of changes just made by the calling
function doesn't seem like it would matter to such a user.

            regards, tom lane



Re: Why overhead of SPI is so large?

От
Konstantin Knizhnik
Дата:

On 11.11.2019 20:22, Tom Lane wrote:
>
> None of those statements are true, in my experience.
>
> In general, this patch seems like it's learned nothing from our
> experiences with the late and unlamented exec_simple_check_node()
> (cf commit 00418c612).  Having a piece of plpgsql that has to know
> about all possible "simple expression" node types is just a major
> maintenance headache; but there is no short-cut solution that is
> really going to be acceptable.  Either you're unsafe, or you fail
> to optimize cases that users will complain about, or you write
> and maintain a lot of code.
>
> I'm also pretty displeased with the is-it-in-the-pg_catalog-schema
> tests.  Those are expensive, requiring additional catalog lookups,
> and they prove just about nothing.  There are extensions that shove
> stuff into pg_catalog (look no further than contrib/adminpack), or
> users could do it, so you really still are relying on the whole world
> to get immutability right.  If we're going to not trust immutability
> markings on user-defined objects, I'd be inclined to do it by
> checking that the object's OID is less than FirstGenbkiObjectId.
>
> But maybe we should just forget that bit of paranoia and rely solely
> on contain_mutable_functions().  That gets rid of the new maintenance
> requirement, and I think arguably it's OK.  An "immutable" function
> whose result depends on changes that were just made by the calling
> function is pretty obviously mislabeled, so people won't have much of
> a leg to stand on if they complain about that.  Pavel's argument
> upthread that people sometimes intentionally mislabel functions as
> immutable isn't really relevant: in his example, at least, the user
> is intentionally trying to get the function to be evaluated early.
> So whether it sees the effects of changes just made by the calling
> function doesn't seem like it would matter to such a user.
>
>             regards, tom lane

In my opinion contain_mutable_functions() is the best solution.
But if it is not acceptable, I will rewrite the patch in white-list fashion.

I do not understand the argument about expensive 
is-it-in-the-pg_catalog-schema test.
IsCatalogNameaspace is doing just simple comparison without any catalog 
lookups:

bool
IsCatalogNamespace(Oid namespaceId)
{
     return namespaceId == PG_CATALOG_NAMESPACE;
}

I may agree that it "proves nothing" if extension can put stuff in 
pg_catalog.
I can replace it with comparison with FirstGenbkiObjectId.
But all this makes sense only if using contain_mutable_function() is not 
acceptable.

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: Why overhead of SPI is so large?

От
Kyotaro Horiguchi
Дата:
At Tue, 12 Nov 2019 11:27:24 +0300, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote in
>
>
> On 11.11.2019 20:22, Tom Lane wrote:
> >
> > None of those statements are true, in my experience.
> >
> > In general, this patch seems like it's learned nothing from our
> > experiences with the late and unlamented exec_simple_check_node()
> > (cf commit 00418c612).  Having a piece of plpgsql that has to know
> > about all possible "simple expression" node types is just a major
> > maintenance headache; but there is no short-cut solution that is
> > really going to be acceptable.  Either you're unsafe, or you fail
> > to optimize cases that users will complain about, or you write
> > and maintain a lot of code.
> >
> > I'm also pretty displeased with the is-it-in-the-pg_catalog-schema
> > tests.  Those are expensive, requiring additional catalog lookups,
> > and they prove just about nothing.  There are extensions that shove
> > stuff into pg_catalog (look no further than contrib/adminpack), or
> > users could do it, so you really still are relying on the whole world
> > to get immutability right.  If we're going to not trust immutability
> > markings on user-defined objects, I'd be inclined to do it by
> > checking that the object's OID is less than FirstGenbkiObjectId.
> >
> > But maybe we should just forget that bit of paranoia and rely solely
> > on contain_mutable_functions().  That gets rid of the new maintenance
> > requirement, and I think arguably it's OK.  An "immutable" function
> > whose result depends on changes that were just made by the calling
> > function is pretty obviously mislabeled, so people won't have much of
> > a leg to stand on if they complain about that.  Pavel's argument
> > upthread that people sometimes intentionally mislabel functions as
> > immutable isn't really relevant: in his example, at least, the user
> > is intentionally trying to get the function to be evaluated early.
> > So whether it sees the effects of changes just made by the calling
> > function doesn't seem like it would matter to such a user.
> >
> >             regards, tom lane
>
> In my opinion contain_mutable_functions() is the best solution.
> But if it is not acceptable, I will rewrite the patch in white-list
> fashion.

I agree for just relying on contain_mutable_functions for the same
reasons to Tom.

> I do not understand the argument about expensive
> is-it-in-the-pg_catalog-schema test.
> IsCatalogNameaspace is doing just simple comparison without any
> catalog lookups:
>
> bool
> IsCatalogNamespace(Oid namespaceId)
> {
>     return namespaceId == PG_CATALOG_NAMESPACE;
> }
>
> I may agree that it "proves nothing" if extension can put stuff in
> pg_catalog.
> I can replace it with comparison with FirstGenbkiObjectId.
> But all this makes sense only if using contain_mutable_function() is
> not acceptable.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Why overhead of SPI is so large?

От
Tom Lane
Дата:
Kyotaro Horiguchi <horikyota.ntt@gmail.com> writes:
> At Tue, 12 Nov 2019 11:27:24 +0300, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote in
>> In my opinion contain_mutable_functions() is the best solution.
>> But if it is not acceptable, I will rewrite the patch in white-list
>> fashion.

> I agree for just relying on contain_mutable_functions for the same
> reasons to Tom.

I've set the CF entry to "Waiting on Author" pending a new patch
that does it like that.

>> I do not understand the argument about expensive
>> is-it-in-the-pg_catalog-schema test.
>> IsCatalogNameaspace is doing just simple comparison without any
>> catalog lookups:

As far as that goes, get_func_namespace() is the expensive part,
not IsCatalogNamespace().  If we were going to go down this path,
it'd perhaps be worthwhile to expand that and the adjacent
func_volatile() test into bulkier code that just does one syscache
fetch of the pg_proc row.  But we're not, so it's moot.

            regards, tom lane



Re: Why overhead of SPI is so large?

От
Konstantin Knizhnik
Дата:
> I've set the CF entry to "Waiting on Author" pending a new patch
> that does it like that.

With contain_mutable_functions the patch becomes trivial.

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Вложения

Re: Why overhead of SPI is so large?

От
Pavel Stehule
Дата:


čt 21. 11. 2019 v 10:31 odesílatel Konstantin Knizhnik <k.knizhnik@postgrespro.ru> napsal:

> I've set the CF entry to "Waiting on Author" pending a new patch
> that does it like that.

With contain_mutable_functions the patch becomes trivial.

Stable functions doesn't need own snapshot too, so it is not fully correct, but it is on safe side.

Pavel


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: Why overhead of SPI is so large?

От
Tom Lane
Дата:
Pavel Stehule <pavel.stehule@gmail.com> writes:
> čt 21. 11. 2019 v 10:31 odesílatel Konstantin Knizhnik <
> k.knizhnik@postgrespro.ru> napsal:
>> With contain_mutable_functions the patch becomes trivial.

> Stable functions doesn't need own snapshot too, so it is not fully correct,
> but it is on safe side.

No, I doubt that.  A stable function is allowed to inspect database state,
and if it's being called by a volatile function, it has every right to
expect that it'd see updates-so-far made by the volatile function.

            regards, tom lane



Re: Why overhead of SPI is so large?

От
Pavel Stehule
Дата:


čt 21. 11. 2019 v 20:44 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
Pavel Stehule <pavel.stehule@gmail.com> writes:
> čt 21. 11. 2019 v 10:31 odesílatel Konstantin Knizhnik <
> k.knizhnik@postgrespro.ru> napsal:
>> With contain_mutable_functions the patch becomes trivial.

> Stable functions doesn't need own snapshot too, so it is not fully correct,
> but it is on safe side.

No, I doubt that.  A stable function is allowed to inspect database state,
and if it's being called by a volatile function, it has every right to
expect that it'd see updates-so-far made by the volatile function.

for this I need new snapshot?


                        regards, tom lane

Re: Why overhead of SPI is so large?

От
Kyotaro Horiguchi
Дата:
At Fri, 22 Nov 2019 06:15:25 +0100, Pavel Stehule <pavel.stehule@gmail.com> wrote in
> čt 21. 11. 2019 v 20:44 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
>
> > Pavel Stehule <pavel.stehule@gmail.com> writes:
> > > čt 21. 11. 2019 v 10:31 odesílatel Konstantin Knizhnik <
> > > k.knizhnik@postgrespro.ru> napsal:
> > >> With contain_mutable_functions the patch becomes trivial.
> >
> > > Stable functions doesn't need own snapshot too, so it is not fully
> > correct,
> > > but it is on safe side.
> >
> > No, I doubt that.  A stable function is allowed to inspect database state,
> > and if it's being called by a volatile function, it has every right to
> > expect that it'd see updates-so-far made by the volatile function.
>
> for this I need new snapshot?

It depends on what we regard as "query" or "command" here. It seems to
me that every line in a plpgsql function is regarded as a "query" for
stable function, even if the function is called in another "query".

In short, we need it, I think.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Why overhead of SPI is so large?

От
Pavel Stehule
Дата:


pá 22. 11. 2019 v 7:33 odesílatel Kyotaro Horiguchi <horikyota.ntt@gmail.com> napsal:
At Fri, 22 Nov 2019 06:15:25 +0100, Pavel Stehule <pavel.stehule@gmail.com> wrote in
> čt 21. 11. 2019 v 20:44 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
>
> > Pavel Stehule <pavel.stehule@gmail.com> writes:
> > > čt 21. 11. 2019 v 10:31 odesílatel Konstantin Knizhnik <
> > > k.knizhnik@postgrespro.ru> napsal:
> > >> With contain_mutable_functions the patch becomes trivial.
> >
> > > Stable functions doesn't need own snapshot too, so it is not fully
> > correct,
> > > but it is on safe side.
> >
> > No, I doubt that.  A stable function is allowed to inspect database state,
> > and if it's being called by a volatile function, it has every right to
> > expect that it'd see updates-so-far made by the volatile function.
>
> for this I need new snapshot?

It depends on what we regard as "query" or "command" here. It seems to
me that every line in a plpgsql function is regarded as a "query" for
stable function, even if the function is called in another "query".

In short, we need it, I think.

ok, then the limits just to only immutable functions is wrong

I test it, and there is a problem already. We doesn't raise a exception, but the result is wrong


create table foo(a int);

create or replace function f1(int)
returns void as $$
begin
  insert into foo values($1);
end;
$$ language plpgsql;

create or replace function f2(int)
returns void as $$declare r record;
begin
  perform f1(); for r in select * from foo loop raise notice '%', r; end loop;
end;
$$ language plpgsql immutable; -- or stable with same behave

So current state is:

a) we allow to call volatile functions from nonvolatile (stable, immutable) that really does write
b) but this change is not visible in parent nonvolatile functions. Is visible only in volatile functions.

Is it expected behave?

So now, there are described issues already. And the limit to just immutable function is not enough - these functions should be immutable buildin.

The core of these problems is based on function's flags related to planner optimizations.

Maybe other flags WRITE | READ | PURE can help.

Now we don't know if volatile function writes to database or not - surely random function doesn't do this. We can implement new set of flags, that can reduce a overhead with snapshots.

The function random() can be VOLATILE PURE - and we will know, so  result of this function is not stable, but this function doesn't touch data engine.

When we don't have these flags, then the current logic is used, when we have these flags, then it will be used. These flags can be more strict

we should not to allow any WRITE from READ function, or we should not allow READ from PURE functions.

Notes, comments?

Pavel


This test should

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Re: Why overhead of SPI is so large?

От
Konstantin Knizhnik
Дата:


On 22.11.2019 10:08, Pavel Stehule wrote:

I test it, and there is a problem already. We doesn't raise a exception, but the result is wrong


create table foo(a int);

create or replace function f1(int)
returns void as $$
begin
  insert into foo values($1);
end;
$$ language plpgsql;

create or replace function f2(int)
returns void as $$declare r record;
begin
  perform f1(); for r in select * from foo loop raise notice '%', r; end loop;
end;
$$ language plpgsql immutable; -- or stable with same behave

So current state is:

a) we allow to call volatile functions from nonvolatile (stable, immutable) that really does write
b) but this change is not visible in parent nonvolatile functions. Is visible only in volatile functions.

Is it expected behave?

I think that in theory it is definitely not correct to call volatile function from non-volatile.
But there are two questions:
1. Are we able to check it? Please taken in account that:
 - at the moment of "create function f2()"  called function f1() may not yet be defined
 - instead of perform f1() it can do "execute 'select f1()'" and it is not possible to check it at compile time. 
2. Is it responsibility of programmer to correctly specify function properties or it should be done by compiler?
  If we follow YWIYGI rule, then your definition of f2() is not correct and that it is why you will get wrong result in this case.
  If we what to completely rely on compiler, then... we do not not volatile/immutable/stable/qualifiers at all! Compiler should deduce this information itself.
  But it will be non-trivial if ever possible, take in account 1)

In principle it is possible to add checks which will produce warning in case of calling volatile function or executing dynamic SQL from non-volatile function.
If such extra checking will be considered useful, I can propose patch doing it.
But IMHO optimizer should rely on function qualifier provided by programmer and it is acceptable to produce wrong result if this information is not correct.


So now, there are described issues already. And the limit to just immutable function is not enough - these functions should be immutable buildin.

The core of these problems is based on function's flags related to planner optimizations.

Maybe other flags WRITE | READ | PURE can help.

Now we don't know if volatile function writes to database or not - surely random function doesn't do this. We can implement new set of flags, that can reduce a overhead with snapshots.

The function random() can be VOLATILE PURE - and we will know, so  result of this function is not stable, but this function doesn't touch data engine.

When we don't have these flags, then the current logic is used, when we have these flags, then it will be used. These flags can be more strict

we should not to allow any WRITE from READ function, or we should not allow READ from PURE functions.

Notes, comments?
I think that even current model with "volatile", "immutable" and "stable" is complex enough.
Adding more qualifiers will make it even more obscure and error-prone.

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: Why overhead of SPI is so large?

От
Pavel Stehule
Дата:


pá 22. 11. 2019 v 8:32 odesílatel Konstantin Knizhnik <k.knizhnik@postgrespro.ru> napsal:


On 22.11.2019 10:08, Pavel Stehule wrote:

I test it, and there is a problem already. We doesn't raise a exception, but the result is wrong


create table foo(a int);

create or replace function f1(int)
returns void as $$
begin
  insert into foo values($1);
end;
$$ language plpgsql;

create or replace function f2(int)
returns void as $$declare r record;
begin
  perform f1(); for r in select * from foo loop raise notice '%', r; end loop;
end;
$$ language plpgsql immutable; -- or stable with same behave

So current state is:

a) we allow to call volatile functions from nonvolatile (stable, immutable) that really does write
b) but this change is not visible in parent nonvolatile functions. Is visible only in volatile functions.

Is it expected behave?

I think that in theory it is definitely not correct to call volatile function from non-volatile.
But there are two questions:
1. Are we able to check it? Please taken in account that:
 - at the moment of "create function f2()"  called function f1() may not yet be defined
 - instead of perform f1() it can do "execute 'select f1()'" and it is not possible to check it at compile time. 

It's not possible to check it compile time now.

2. Is it responsibility of programmer to correctly specify function properties or it should be done by compiler?
  If we follow YWIYGI rule, then your definition of f2() is not correct and that it is why you will get wrong result in this case.

maybe - a) but it is not documented, b) is not a postgresql's philosophy return bad result - and c) it is not user friendly. There should be raised error or result should be correct.

Theoretically this feature can be used for logging - you can write to log table from immutable or stable function - so it can has some use case. Probably if we implement autonomous transactions, then this behave should be correct.
 
  If we what to completely rely on compiler, then... we do not not volatile/immutable/stable/qualifiers at all! Compiler should deduce this information itself.
  But it will be non-trivial if ever possible, take in account 1)

I am able to do it in plpgsql_check for plpgsql. But probably it is not possible do it for PLPerl, PLPythonu, ..

In principle it is possible to add checks which will produce warning in case of calling volatile function or executing dynamic SQL from non-volatile function.
If such extra checking will be considered useful, I can propose patch doing it.
But IMHO optimizer should rely on function qualifier provided by programmer and it is acceptable to produce wrong result if this information is not correct.

We should to distinguish  between bad result and not well optimized plan.




So now, there are described issues already. And the limit to just immutable function is not enough - these functions should be immutable buildin.

The core of these problems is based on function's flags related to planner optimizations.

Maybe other flags WRITE | READ | PURE can help.

Now we don't know if volatile function writes to database or not - surely random function doesn't do this. We can implement new set of flags, that can reduce a overhead with snapshots.

The function random() can be VOLATILE PURE - and we will know, so  result of this function is not stable, but this function doesn't touch data engine.

When we don't have these flags, then the current logic is used, when we have these flags, then it will be used. These flags can be more strict

we should not to allow any WRITE from READ function, or we should not allow READ from PURE functions.

Notes, comments?
I think that even current model with "volatile", "immutable" and "stable" is complex enough.
Adding more qualifiers will make it even more obscure and error-prone.

I don't think - the classic example is random() function. It's volatile, but you don't need special snapshot for calling this function.

Still any change of behave can breaks lot of applications, because how we can see, Postgres is very tolerant now.


-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: Why overhead of SPI is so large?

От
Konstantin Knizhnik
Дата:


On 22.11.2019 11:05, Pavel Stehule wrote:


We should to distinguish  between bad result and not well optimized plan.

If it is not possible to implement runtime check tha timmutable function is not making any changes in database.
Please notice, that even right now without any get snapshot optimization Postgres can produce incorrect result in case of incorrectly specidied immutable or stable qualifiers.


-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: Why overhead of SPI is so large?

От
Kyotaro Horiguchi
Дата:
At Fri, 22 Nov 2019 08:08:24 +0100, Pavel Stehule <pavel.stehule@gmail.com> wrote in
> pá 22. 11. 2019 v 7:33 odesílatel Kyotaro Horiguchi <horikyota.ntt@gmail.com>
> napsal:
>
> > At Fri, 22 Nov 2019 06:15:25 +0100, Pavel Stehule <pavel.stehule@gmail.com>
> > wrote in
> > > čt 21. 11. 2019 v 20:44 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
> > >
> > > > Pavel Stehule <pavel.stehule@gmail.com> writes:
> > > > > čt 21. 11. 2019 v 10:31 odesílatel Konstantin Knizhnik <
> > > > > k.knizhnik@postgrespro.ru> napsal:
> > > > >> With contain_mutable_functions the patch becomes trivial.
> > > >
> > > > > Stable functions doesn't need own snapshot too, so it is not fully
> > > > correct,
> > > > > but it is on safe side.
> > > >
> > > > No, I doubt that.  A stable function is allowed to inspect database
> > state,
> > > > and if it's being called by a volatile function, it has every right to
> > > > expect that it'd see updates-so-far made by the volatile function.
> > >
> > > for this I need new snapshot?
> >
> > It depends on what we regard as "query" or "command" here. It seems to
> > me that every line in a plpgsql function is regarded as a "query" for
> > stable function, even if the function is called in another "query".
> >
> > In short, we need it, I think.
> >
>
> ok, then the limits just to only immutable functions is wrong
>
> I test it, and there is a problem already. We doesn't raise a exception,
> but the result is wrong
>
> create table foo(a int);
>
> create or replace function f1(int)
> returns void as $$
> begin
>   insert into foo values($1);
> end;
> $$ language plpgsql;
>
> create or replace function f2(int)
> returns void as $$declare r record;
> begin
>   perform f1(); for r in select * from foo loop raise notice '%', r; end
> loop;
> end;
> $$ language plpgsql immutable; -- or stable with same behave
>
> So current state is:
>
> a) we allow to call volatile functions from nonvolatile (stable, immutable)
> that really does write
> b) but this change is not visible in parent nonvolatile functions. Is
> visible only in volatile functions.
>
> Is it expected behave?

I'm not sure, volatility is the total bahavior of the function,
regardless of whatever the function does internally. Even though I'm
not sure how to use volatile functions in non-volatile functions, I
don't see direct reason or how to inhibit that and I think we don't
even need to bother that. It's owe to the definier of the function.

> So now, there are described issues already. And the limit to just immutable
> function is not enough - these functions should be immutable buildin.
>
> The core of these problems is based on function's flags related to planner
> optimizations.
>
> Maybe other flags WRITE | READ | PURE can help.
>
> Now we don't know if volatile function writes to database or not - surely
> random function doesn't do this. We can implement new set of flags, that
> can reduce a overhead with snapshots.
>
> The function random() can be VOLATILE PURE - and we will know, so  result
> of this function is not stable, but this function doesn't touch data engine.
>
> When we don't have these flags, then the current logic is used, when we
> have these flags, then it will be used. These flags can be more strict
>
> we should not to allow any WRITE from READ function, or we should not allow
> READ from PURE functions.
>
> Notes, comments?
>
> Pavel

I think such fine-grained categorization is beyond our maintainance
capability and users still much cannot follow.  I recall of an extreme
discussion that the result of immutable system functions still can
mutate beyond major version upgrade. Volatility is mere a hint and,
perhaps as Robert said somewhere, the baseline we should do is to
avoid crash for any possible configuration.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Why overhead of SPI is so large?

От
Tom Lane
Дата:
Kyotaro Horiguchi <horikyota.ntt@gmail.com> writes:
> At Fri, 22 Nov 2019 08:08:24 +0100, Pavel Stehule <pavel.stehule@gmail.com> wrote in
>> a) we allow to call volatile functions from nonvolatile (stable, immutable)
>> that really does write
>> b) but this change is not visible in parent nonvolatile functions. Is
>> visible only in volatile functions.
>>
>> Is it expected behave?

> I'm not sure, volatility is the total bahavior of the function,
> regardless of whatever the function does internally. Even though I'm
> not sure how to use volatile functions in non-volatile functions, I
> don't see direct reason or how to inhibit that and I think we don't
> even need to bother that. It's owe to the definier of the function.

I'm pretty sure we had this discussion long ago when we implemented
the existing read-only-function restrictions in plpgsql.  Yeah, in
principle an immutable function should refuse to call non-immutable
functions, but the practical costs and benefits of doing that aren't
very attractive.  The existing rules are inexpensive to enforce and
catch most mistakes in this area.  Catching the other few percent of
mistakes would require a really significant change, not only on our
part but also users' parts --- for example, forgetting to label a
function as immutable when it could be might break your application
entirely.

I made some cosmetic fixes in the proposed patch and pushed it.

            regards, tom lane