Fwd: Vitesse DB call for testing

From	Feng Tian
Subject	Fwd: Vitesse DB call for testing
Date	October 17, 2014, 18:08:54
Msg-id	CAFWGqnuUGrjomoaNHxSzyvgNFxH6dYLzzmrWydkWCp_guuhkcA@mail.gmail.com
In reply to	Vitesse DB call for testing (CK Tan <cktan@vitessedata.com>)
Replies	Re: Vitesse DB call for testing
List	pgsql-hackers

Hi, Tom,

Sorry for the double post.

Feng

---------- Forwarded message ----------
From: Feng Tian <ftian@vitessedata.com>
Date: Fri, Oct 17, 2014 at 10:29 AM
Subject: Re: [HACKERS] Vitesse DB call for testing
To: Tom Lane <tgl@sss.pgh.pa.us>

Hi, Tom,

I agree that using int128 in stock postgres will speed things up too. On the other hand, that is only one part of the equation. For example, if you look at TPCH Q1, the int128 "cheating" does not kick in at all, but we are 8x faster.

I am not sure what you mean by "actual data access". Data is still in stock postgres format on disk. We did indeed JIT all data field accesses (deform tuple). To put things in perspective, I just timed select count(*) and select count(l_orderkey) from tpch1.lineitem. With the JIT on, our code is bottlenecked by memory bandwidth and the difference between the two queries is pretty much invisible.

Thanks,

Feng

ftian=# set vdb_jit = 0;
SET
Time: 0.155 ms

ftian=# select count(*) from tpch1.lineitem;
  count
---------
 6001215
(1 row)

Time: 688.658 ms

ftian=# select count(*) from tpch1.lineitem;
  count
---------
 6001215
(1 row)

Time: 690.753 ms

ftian=# select count(l_orderkey) from tpch1.lineitem;
  count
---------
 6001215
(1 row)

Time: 819.452 ms

ftian=# set vdb_jit = 1;
SET
Time: 0.167 ms

ftian=# select count(*) from tpch1.lineitem;
  count
---------
 6001215
(1 row)

Time: 203.543 ms

ftian=# select count(l_orderkey) from tpch1.lineitem;
  count
---------
 6001215
(1 row)

Time: 202.253 ms

ftian=#
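
For readers unfamiliar with what compiling the field access ("deform tuple") can buy, here is a minimal sketch in C. It is not Vitesse DB or PostgreSQL code: the row layout (fixed-width columns only), the ColumnDesc type, and both function names are hypothetical, chosen only to contrast a generic, descriptor-driven lookup with a version specialized for one known layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical column descriptor: fixed-width integer columns only. */
typedef struct {
    size_t width;               /* width in bytes: 4 or 8 */
} ColumnDesc;

/* Generic path: the offset and the width of the target column are
 * discovered at run time by walking the descriptor array. */
static int64_t generic_get(const unsigned char *row,
                           const ColumnDesc *cols, size_t target)
{
    size_t off = 0;
    for (size_t i = 0; i < target; i++)
        off += cols[i].width;

    assert(cols[target].width == 4 || cols[target].width == 8);
    if (cols[target].width == 4) {
        int32_t v;
        memcpy(&v, row + off, sizeof v);
        return v;
    } else {
        int64_t v;
        memcpy(&v, row + off, sizeof v);
        return v;
    }
}

/* Specialized path: roughly what a code generator could emit for the
 * assumed layout (int32, int64, int32) when the query reads column 1.
 * The offset (4) and the width (8) are compile-time constants, so the
 * loop and the width branch disappear. */
static int64_t specialized_get_col1(const unsigned char *row)
{
    int64_t v;
    memcpy(&v, row + 4, sizeof v);
    return v;
}
```

Both functions return the same value for the same row; the specialized one simply does less work per tuple. Real tuple deforming also has to handle nulls, alignment, and variable-length fields, which this sketch ignores.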

On Fri, Oct 17, 2014 at 10:12 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

CK Tan <cktan@vitessedata.com> writes:
> The bigint sum,avg,count case in the example you tried has some optimization. We use int128 to accumulate the bigint instead of numeric in pg. Hence the big speed up. Try the same query on int4 for the improvement where both pg and vitessedb are using int4 in the execution.

Well, that's pretty much cheating: it's too hard to disentangle what's
coming from JIT vs what's coming from using a different accumulator
datatype. If we wanted to depend on having int128 available we could
get that speedup with a couple hours' work.

But what exactly are you "compiling" here? I trust not the actual data
accesses; that seems far too complicated to try to inline.

regards, tom lane
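
The accumulator point discussed above can be illustrated with a short sketch in C. It is not taken from PostgreSQL or Vitesse DB: the function name sum_int64 and the int128 typedef are hypothetical, and __int128 is a GCC/Clang extension rather than standard C, so its availability is an assumption about the compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the compiler provides __int128 (GCC and Clang do on
 * 64-bit targets). This is exactly the dependency Tom mentions. */
typedef __int128 int128;

/* Sum 64-bit values into a 128-bit accumulator. A 64-bit accumulator
 * could overflow after only two large inputs; a 128-bit one cannot
 * overflow for any realistic row count, so no arbitrary-precision
 * (numeric-style) arithmetic is needed inside the loop. */
static int128 sum_int64(const int64_t *vals, size_t n)
{
    assert(vals != NULL || n == 0);

    int128 acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += vals[i];
    return acc;
}
```

The loop body is a plain integer add, which is why swapping the accumulator type alone can account for a large part of a sum(bigint) speedup, independently of any JIT.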

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
