Re: Inlining of couple of functions in pl_exec.c improves performance

From Pavel Stehule
Subject Re: Inlining of couple of functions in pl_exec.c improves performance
Date
Msg-id CAFj8pRCE_JmH3BcWDMMtXj02bmJijznfgq8rOEP1ibXARyt1OA@mail.gmail.com
In response to Inlining of couple of functions in pl_exec.c improves performance  (Amit Khandekar <amitdkhan.pg@gmail.com>)
Responses Re: Inlining of couple of functions in pl_exec.c improves performance  (Amit Khandekar <amitdkhan.pg@gmail.com>)
List pgsql-hackers
Hi

On Sat, 23 May 2020 at 19:03, Amit Khandekar <amitdkhan.pg@gmail.com> wrote:
There are a couple of function call overheads I observed in pl/pgsql
code : exec_stmt() and exec_cast_value(). Removing these overheads
resulted in some performance gains.

exec_stmt() :

plpgsql_exec_function() and other toplevel block executors currently
call exec_stmt(). But actually they don't need to do everything that
exec_stmt() does. So they can call a new function instead of
exec_stmt(), and all the exec_stmt() code can be moved to
exec_stmts(). The things that exec_stmt() does, but that are not
necessary for a toplevel block stmt, are :

1. save_estmt = estate->err_stmt; estate->err_stmt = stmt;
For top level blocks, saving the estate->err_stmt is not necessary,
because there is no statement after this block statement. Anyways,
plpgsql_exec_function() assigns estate.err_stmt just before calling
exec_stmt so there is really no point in exec_stmt() setting it again.

2. CHECK_FOR_INTERRUPTS()
This is not necessary for toplevel block callers.

3. exec_stmt_block() can be directly called rather than exec_stmt()
because func->action is a block statement. So the switch statement is
not necessary.

But this one might be necessary for a toplevel block statement:
  if (*plpgsql_plugin_ptr && (*plpgsql_plugin_ptr)->stmt_beg)
     ((*plpgsql_plugin_ptr)->stmt_beg) (estate, stmt);

There was already some repetitive code in plpgsql_exec_function() and
other functions around the exec_stmt() call. So in a separate patch
0001*.patch, I moved that code into a common function
exec_toplevel_block(). In the main patch
0002-Get-rid-of-exec_stmt-function-call.patch, I additionally called
plpgsql_plugin_ptr->stmt_beg() inside exec_toplevel_block(), and moved
the exec_stmt() code into exec_stmts().
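
To make the restructuring described above easier to picture, here is a minimal toy model in C. It is only a sketch: every name prefixed with toy_ or Toy is hypothetical, the counters stand in for CHECK_FOR_INTERRUPTS() and the plugin stmt_beg callback, and none of it is the actual pl_exec.c code or the patch itself. It shows the per-statement bookkeeping living inside the statement loop, while the toplevel entry point sets err_stmt once, fires the stmt_beg hook, and calls the block executor directly without the switch.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; these are NOT the real PostgreSQL definitions. */
typedef enum { TOY_STMT_BLOCK, TOY_STMT_ASSIGN } ToyStmtType;

typedef struct ToyStmt
{
    ToyStmtType     type;
    struct ToyStmt *body;       /* child statements, used by blocks only */
    size_t          nbody;
} ToyStmt;

typedef struct ToyState
{
    const ToyStmt  *err_stmt;           /* statement reported on error */
    int             interrupt_checks;   /* simulated CHECK_FOR_INTERRUPTS() */
    int             stmt_beg_calls;     /* simulated plugin stmt_beg hook */
    int             assigns;            /* assignments executed */
} ToyState;

static int toy_exec_stmts(ToyState *estate, const ToyStmt *stmts, size_t n);

static int
toy_exec_stmt_block(ToyState *estate, const ToyStmt *block)
{
    return toy_exec_stmts(estate, block->body, block->nbody);
}

/* Statement loop with the former per-statement work folded in. */
static int
toy_exec_stmts(ToyState *estate, const ToyStmt *stmts, size_t n)
{
    for (size_t i = 0; i < n; i++)
    {
        const ToyStmt *stmt = &stmts[i];
        const ToyStmt *save_estmt = estate->err_stmt;
        int            rc = 0;

        estate->err_stmt = stmt;
        estate->interrupt_checks++;
        estate->stmt_beg_calls++;

        switch (stmt->type)
        {
            case TOY_STMT_BLOCK:
                rc = toy_exec_stmt_block(estate, stmt);
                break;
            case TOY_STMT_ASSIGN:
                estate->assigns++;
                break;
        }

        estate->err_stmt = save_estmt;
        if (rc != 0)
            return rc;
    }
    return 0;
}

/* Toplevel entry: no save/restore, no interrupt check, no switch. */
static int
toy_exec_toplevel_block(ToyState *estate, const ToyStmt *block)
{
    assert(block->type == TOY_STMT_BLOCK);
    estate->err_stmt = block;
    estate->stmt_beg_calls++;   /* the hook is still fired for the block */
    return toy_exec_stmt_block(estate, block);
}
```

With a toplevel block holding two assignments, the model runs the interrupt check twice (once per inner statement) and the stmt_beg hook three times (once for the block, once per assignment).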



exec_cast_value() :

This function does not do the casting if it is not required. So I moved
the code that actually does the cast into a separate function, so as to
reduce the exec_cast_value() code and make it inline. Attached is
0003-Inline-exec_cast_value.patch
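
The split described here follows a common pattern: a small inline wrapper that handles the "nothing to do" case, and an out-of-line function for the real work. The sketch below illustrates that pattern only. All toy_ names, the type constants, and the placeholder conversion are hypothetical, and the "is a cast needed" test is a simplification assumed for illustration rather than the exact condition used in pl_exec.c.

```c
#include <stdint.h>

/* Hypothetical type identifiers for the toy model. */
enum { TOY_TYPE_INT = 1, TOY_TYPE_BOOL = 2 };

/* Counts how often the out-of-line slow path is entered. */
static int toy_slow_casts = 0;

/* Out-of-line slow path: only reached when a conversion is needed. */
static int64_t
toy_do_cast_value(int64_t value, int from_type, int to_type)
{
    (void) from_type;           /* unused in this placeholder conversion */
    toy_slow_casts++;

    switch (to_type)
    {
        case TOY_TYPE_BOOL:
            return value != 0;  /* placeholder rule: nonzero becomes 1 */
        default:
            return value;
    }
}

/*
 * Inline fast path: when source and target types already match, return
 * the value unchanged and avoid a function call entirely.
 */
static inline int64_t
toy_exec_cast_value(int64_t value, int from_type, int to_type)
{
    if (from_type == to_type)
        return value;
    return toy_do_cast_value(value, from_type, to_type);
}
```

In a loop of same-type assignments such as the test cases in this thread, every call would take the first branch, so the compiler can reduce the call site to a comparison.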


Testing
----------

I used two available VMs (one x86_64 and the other arm64), and the
benefit showed up on both of these machines. Attached patches 0001,
0002, 0003 are to be applied in that order. 0001 is just a preparatory
patch.

First I tried with a simple for loop with a single assignment
(attached forcounter.sql)

After inlining the two functions, I found a noticeable reduction in
execution time as shown (figures are in milliseconds, averaged over
multiple runs; taken from 'explain analyze' execution times) :
ARM VM :
   HEAD : 100 ; Patched : 88 => 13.6% improvement
x86 VM :
   HEAD :  71 ; Patched : 66 => 7.63% improvement.

Then I included many assignment statements as shown in attachment
assignmany.sql. This showed further benefit :
ARM VM :
   HEAD : 1820 ; Patched : 1549  => 17.5% improvement
x86 VM :
   HEAD : 1020 ; Patched :  869  => 17.4% improvement

Inlining just exec_stmt() showed the improvement mainly on the arm64
VM (7.4%). For x86, it was 2.7%.
But inlining exec_stmt() and exec_cast_value() together showed
benefits on both machines, as can be seen above.
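
The percentages quoted above appear to be computed as HEAD / Patched - 1. That is an inference from the numbers, not something the mail states. A small helper under that assumption (the function name is made up for illustration):

```c
/*
 * Relative improvement in percent, assuming the formula
 * (HEAD / Patched - 1) * 100. Inputs are execution times in ms.
 */
static double
improvement_percent(double head_ms, double patched_ms)
{
    return (head_ms / patched_ms - 1.0) * 100.0;
}
```

Under this formula, 100 vs 88 gives about 13.6, 1820 vs 1549 about 17.5, and 1020 vs 869 about 17.4, matching the reported figures. 71 vs 66 gives about 7.58 rather than 7.63, presumably because the displayed times are rounded averages.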

 
   FOR counter IN 1..1800000 LOOP
      id = 0; id = 0; id1 = 0;
      id2 = 0; id3 = 0; id1 = 0; id2 = 0;
      id3 = 0; id = 0; id = 0; id1 = 0;
      id2 = 0; id3 = 0; id1 = 0; id2 = 0;
      id3 = 0;
   END LOOP;

This is not very typical PL/pgSQL code. None of the expressions are parametrized, so this test is a little bit artificial.

The last strange PL/pgSQL performance benchmark calculated the value of pi. It does something real.

Regards

Pavel


--
Thanks,
-Amit Khandekar
Huawei Technologies
