Re: JIT compiling with LLVM v10.1

From: Konstantin Knizhnik
Subject: Re: JIT compiling with LLVM v10.1
Date:
Msg-id: 5a18282d-89d0-ba21-4d54-bc2259ad7f26@postgrespro.ru
In response to: Re: JIT compiling with LLVM v10.1  (Andres Freund <andres@anarazel.de>)
Responses: Re: JIT compiling with LLVM v10.1
List: pgsql-hackers


On 14.02.2018 21:17, Andres Freund wrote:
Hi,

On 2018-02-07 06:54:05 -0800, Andres Freund wrote:
I've pushed v10.0. The big (and pretty painful to make) change is that
now all the LLVM specific code lives in src/backend/jit/llvm, which is
built as a shared library which is loaded on demand.

The layout is now as follows:

src/backend/jit/jit.c:
   Part of JITing always linked into the server. Supports loading the
   LLVM using JIT library.

src/backend/jit/llvm/
Infrastructure:
llvmjit.c:
   General code generation and optimization infrastructure
llvmjit_error.cpp, llvmjit_wrap.cpp:
   Error / backward compat wrappers
llvmjit_inline.cpp:
   Cross module inlining support
Code-Gen:
llvmjit_expr.c
   Expression compilation
llvmjit_deform.c
   Deform compilation
I've pushed a revised version that hopefully should address Jeff's
wish/need of being able to experiment with this out of core. There's now
a "jit_provider" PGC_POSTMASTER GUC that's by default set to
"llvmjit". llvmjit.so is the .so implementing JIT using LLVM. It fills a
set of callbacks via
extern void _PG_jit_provider_init(JitProviderCallbacks *cb);
which can also be implemented by any other potential provider.

The other two biggest changes are that I've added a README
https://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=blob;f=src/backend/jit/README;hb=jit
and that I've revised the configure support so it does more error
checks, and moved it into config/llvm.m4.

There's a larger smattering of small changes too.

I'm pretty happy with how the separation of core / shlib looks now. I'm
planning to work on cleaning and then pushing some of the preliminary
patches (fixed tupledesc, grouping) over the next few days.

Greetings,

Andres Freund


I have made some more experiments with the efficiency of JIT-ing of tuple deforming and I want to share the results (I hope they will be interesting).
It is a well known fact that Postgres spends a large fraction of the time of sequential scan queries over warm data in deforming tuples (17% in case of TPC-H Q1).
Postgres tries to optimize access to the tuple by caching fixed-size offsets to the fields whenever possible and loading attributes on demand.
It is also a well known recommendation to put fixed-size, non-null, frequently used attributes at the beginning of a table's attribute list to make this optimization work more efficiently.
You can see in the code of heap_deform_tuple that the first NULL value switches it to "slow" mode:

for (attnum = 0; attnum < natts; attnum++)
{
    Form_pg_attribute thisatt = TupleDescAttr(tupleDesc, attnum);

    if (hasnulls && att_isnull(attnum, bp))
    {
        values[attnum] = (Datum) 0;
        isnull[attnum] = true;
        slow = true;        /* can't use attcacheoff anymore */
        continue;
    }


I tried to investigate the importance of this optimization and the actual penalty of "slow" mode.
At the same time I wanted to understand how much JIT helps to speed up tuple deforming.

I have populated three tables with data:

create table t1(id integer primary key,c1 integer,c2 integer,c3 integer,c4 integer,c5 integer,c6 integer,c7 integer,c8 integer,c9 integer);
create table t2(id integer primary key,c1 integer,c2 integer,c3 integer,c4 integer,c5 integer,c6 integer,c7 integer,c8 integer,c9 integer);
create table t3(id integer primary key,c1 integer not null,c2 integer not null,c3 integer not null,c4 integer not null,c5 integer not null,c6 integer not null,c7 integer not null,c8 integer not null,c9 integer not null);
insert into t1 (id,c1,c2,c3,c4,c5,c6,c7,c8) values (generate_series(1,10000000),0,0,0,0,0,0,0,0);
insert into t2 (id,c2,c3,c4,c5,c6,c7,c8,c9) values (generate_series(1,10000000),0,0,0,0,0,0,0,0);
insert into t3 (id,c1,c2,c3,c4,c5,c6,c7,c8,c9) values (generate_series(1,10000000),0,0,0,0,0,0,0,0,0);
vacuum analyze t1;
vacuum analyze t2;
vacuum analyze t3;

t1 contains NULL in the last column (c9), t2 in the first column (c1), and t3 has all attributes declared as not-null (so JIT can use this knowledge to generate more efficient deforming code).
The whole data set is held in memory (shared buffer size is greater than the database size) and I intentionally switched off parallel execution to make the results more deterministic.
I run two queries calculating aggregates on one/all not-null fields:

select sum(c8) from t*;
select sum(c2), sum(c3), sum(c4), sum(c5), sum(c6), sum(c7), sum(c8) from t*;

As expected, 35% of the time was spent in heap_deform_tuple.
But the results (msec) were slightly confusing and unexpected:

select sum(c8) from t*;


     w/o JIT   with JIT
t1   763       563
t2   772       570
t3   776       592

select sum(c2), sum(c3), sum(c4), sum(c5), sum(c6), sum(c7), sum(c8) from t*;


     w/o JIT   with JIT
t1   1239      742
t2   1233      747
t3   1255      803

I repeated each query 10 times and took the minimal time (I think it is more meaningful than the average time, which depends on other activity in the system).
So there is no big difference between the "slow" and "fast" ways of deforming tuples.
Moreover, sometimes the "slow" case is even faster. Although I have to say that the variance of the results is quite large: about 10%.
But in any case, I can draw two conclusions from these results:

1. Modern platforms are mostly limited by memory access time; the number of executed instructions is less critical.
This is why the extra processing needed for nullable attributes does not significantly affect performance.
2. For a large number of attributes, JIT-ing of tuple deforming can improve speed by up to two times, which is quite a good result from my point of view.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 
