Re: JIT compiling with LLVM v10.1

From: Konstantin Knizhnik
Subject: Re: JIT compiling with LLVM v10.1
Date:
Msg-id: 5a18282d-89d0-ba21-4d54-bc2259ad7f26@postgrespro.ru
In response to: Re: JIT compiling with LLVM v10.1  (Andres Freund <andres@anarazel.de>)
Responses: Re: JIT compiling with LLVM v10.1
List: pgsql-hackers


On 14.02.2018 21:17, Andres Freund wrote:
Hi,

On 2018-02-07 06:54:05 -0800, Andres Freund wrote:
I've pushed v10.0. The big (and pretty painful to make) change is that
now all the LLVM specific code lives in src/backend/jit/llvm, which is
built as a shared library which is loaded on demand.

The layout is now as follows:

src/backend/jit/jit.c:
   Part of JITing always linked into the server. Supports loading the
   LLVM using JIT library.

src/backend/jit/llvm/
Infrastructure:
llvmjit.c:
   General code generation and optimization infrastructure
llvmjit_error.cpp, llvmjit_wrap.cpp:
   Error / backward compat wrappers
llvmjit_inline.cpp:
   Cross module inlining support
Code-Gen:
llvmjit_expr.c
   Expression compilation
llvmjit_deform.c
   Deform compilation
I've pushed a revised version that hopefully should address Jeff's
wish/need of being able to experiment with this out of core. There's now
a "jit_provider" PGC_POSTMASTER GUC that's by default set to
"llvmjit". llvmjit.so is the .so implementing JIT using LLVM. It fills a
set of callbacks via
extern void _PG_jit_provider_init(JitProviderCallbacks *cb);
which can also be implemented by any other potential provider.

The other two biggest changes are that I've added a README
https://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=blob;f=src/backend/jit/README;hb=jit
and that I've revised the configure support so it does more error
checks, and moved it into config/llvm.m4.

There's a larger smattering of small changes too.

I'm pretty happy with how the separation of core / shlib looks now. I'm
planning to work on cleaning and then pushing some of the preliminary
patches (fixed tupledesc, grouping) over the next few days.

Greetings,

Andres Freund


I have made some more experiments with the efficiency of JIT-ing of tuple deforming and I want to share the results (I hope they will be interesting).
It is a well known fact that Postgres spends a large fraction of the time of sequential scan queries over warm data in deforming tuples (17% in case of TPC-H Q1).
Postgres tries to optimize access to the tuple by caching fixed-size offsets to the fields whenever possible and loading attributes on demand.
It is also a well known recommendation to put fixed-size, non-null, frequently used attributes at the beginning of a table's attribute list to make this optimization work more efficiently.
You can see in the code of heap_deform_tuple that the first NULL value switches it to "slow" mode:

for (attnum = 0; attnum < natts; attnum++)
{
    Form_pg_attribute thisatt = TupleDescAttr(tupleDesc, attnum);

    if (hasnulls && att_isnull(attnum, bp))
    {
        values[attnum] = (Datum) 0;
        isnull[attnum] = true;
        slow = true;        /* can't use attcacheoff anymore */
        continue;
    }


I tried to investigate the importance of this optimization and the actual penalty of "slow" mode.
At the same time I wanted to understand how much JIT helps to speed up tuple deforming.

I have populated three tables with data:

create table t1(id integer primary key,c1 integer,c2 integer,c3 integer,c4 integer,c5 integer,c6 integer,c7 integer,c8 integer,c9 integer);
create table t2(id integer primary key,c1 integer,c2 integer,c3 integer,c4 integer,c5 integer,c6 integer,c7 integer,c8 integer,c9 integer);
create table t3(id integer primary key,c1 integer not null,c2 integer not null,c3 integer not null,c4 integer not null,c5 integer not null,c6 integer not null,c7 integer not null,c8 integer not null,c9 integer not null);
insert into t1 (id,c1,c2,c3,c4,c5,c6,c7,c8) values (generate_series(1,10000000),0,0,0,0,0,0,0,0);
insert into t2 (id,c2,c3,c4,c5,c6,c7,c8,c9) values (generate_series(1,10000000),0,0,0,0,0,0,0,0);
insert into t3 (id,c1,c2,c3,c4,c5,c6,c7,c8,c9) values (generate_series(1,10000000),0,0,0,0,0,0,0,0,0);
vacuum analyze t1;
vacuum analyze t2;
vacuum analyze t3;

t1 contains NULL in the last column (c9), t2 in the first column (c1), and t3 has all attributes declared as not-null (so JIT can use this knowledge to generate more efficient deforming code).
The whole data set is held in memory (shared buffer size is greater than the database size) and I intentionally switched off parallel execution to make the results more deterministic.
I run two queries calculating aggregates on one/all not-null fields:

select sum(c8) from t*;
select sum(c2), sum(c3), sum(c4), sum(c5), sum(c6), sum(c7), sum(c8) from t*;

As expected, 35% of the time was spent in heap_deform_tuple.
But the results (msec) were slightly confusing and unexpected:

select sum(c8) from t*;


     w/o JIT   with JIT
t1   763       563
t2   772       570
t3   776       592

select sum(c2), sum(c3), sum(c4), sum(c5), sum(c6), sum(c7), sum(c8) from t*;


     w/o JIT   with JIT
t1   1239      742
t2   1233      747
t3   1255      803

I repeated each query 10 times and took the minimal time (I think it is more meaningful than the average time, which depends on other activity in the system).
So there is no big difference between the "slow" and "fast" ways of deforming tuples.
Moreover, sometimes the "slow" case is even faster. Although I have to say that the variance of the results is quite large: about 10%.
But in any case, I can draw two conclusions from these results:

1. Modern platforms are mostly limited by memory access time; the number of executed instructions is less critical.
This is why the extra processing needed for nullable attributes does not significantly affect performance.
2. For a large number of attributes, JIT-ing of tuple deforming can improve speed by up to two times, which is quite a good result from my point of view.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 
