Re: [HACKERS] VOPS: vectorized executor for Postgres: how to speedupOLAP queries more than 10 times without changing anything in Postgres executor

Поиск
Список
Период
Сортировка
От Thom Brown
Тема Re: [HACKERS] VOPS: vectorized executor for Postgres: how to speedupOLAP queries more than 10 times without changing anything in Postgres executor
Дата
Msg-id CAA-aLv5k0y6NzzD6j+M-jW=WbW7xbFCfXGB1DK1zdZ2-E_+iAQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] VOPS: vectorized executor for Postgres: how to speedupOLAP queries more than 10 times without changing anything in Postgresexecutor  (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>)
Список pgsql-hackers
On 16 February 2017 at 17:00, Konstantin Knizhnik
<k.knizhnik@postgrespro.ru> wrote:
> More progress in vectorized Postgres extension (VOPS). It is not required
> any more to use some special functions in queries.
> You can use vector operators in query with standard SQL and still get ten
> times improvement on some queries.
> VOPS extension now uses post parse analyze hook to transform query.
> I really impressed by flexibility and extensibility of Postgres type system.
> User defined types&operatpors&casts do most of the work.
>
> It is still responsibility of programmer or database administrator to create
> proper projections
> of original table. This projections need to use tiles types for some
> attributes (vops_float4,...).
> Then you can query this table using standard SQL. And this query will be
> executed using vector operations!
>
> Example of such TPC-H queries:
>
> Q1:
> select
>     l_returnflag,
>     l_linestatus,
>     sum(l_quantity) as sum_qty,
>     sum(l_extendedprice) as sum_base_price,
>     sum(l_extendedprice*(1-l_discount)) as sum_disc_price,
>     sum(l_extendedprice*(1-l_discount)*(1+l_tax)) as sum_charge,
>     avg(l_quantity) as avg_qty,
>     avg(l_extendedprice) as avg_price,
>     avg(l_discount) as avg_disc,
>     count(*) as count_order
> from
>     vops_lineitem_projection
> where
>     l_shipdate <= '1998-12-01'::date
> group by
>     l_returnflag,
>     l_linestatus
> order by
>     l_returnflag,
>     l_linestatus;
>
>
>
> Q6:
> select
>     sum(l_extendedprice*l_discount) as revenue
> from
>     lineitem_projection
> where
>     l_shipdate between '1996-01-01'::date and '1997-01-01'::date
>     and l_discount between 0.08 and 0.1
>     and l_quantity < 24;

>
> On 13.02.2017 17:12, Konstantin Knizhnik wrote:
>>
>> Hello hackers,
>>
>> There were many discussions concerning  possible ways of speeding-up
>> Postgres. Different approaches were suggested:
>>
>> - JIT (now we have three different prototype implementations based on
>> LLVM)
>> - Chunked (vectorized) executor
>> - Replacing pull with push
>> - Columnar store (cstore_fdw, IMCS)
>> - Optimizing and improving current executor (reducing tuple deform
>> overhead, function call overhead,...)
>>
>> Obviously the best result can be achieved in case of combining all this
>> approaches. But actually them are more or less interchangeable: vectorized
>> execution is not eliminating interpretation overhead, but it is divided by
>> vector size and becomes less critical.
>>
>> I decided to write small prototype to estimate possible speed improvement
>> of vectorized executor. I created special types representing "tile" and
>> implement standard SQL operators for them. So neither Postgres planer,
>> nether Postgres executor, nether Postgres heap manager are changed. But I
>> was able to reach more than 10 times speed improvement on TPC-H Q1/Q6
>> queries!

Impressive work!

Thom



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Robert Haas
Дата:
Сообщение: Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)
Следующее
От: Fabien COELHO
Дата:
Сообщение: Re: [HACKERS] duplicate "median" entry in doc