IMCS: In Memory Columnar Store for PostgreSQL

From	knizhnik
Subject	IMCS: In Memory Columnar Store for PostgreSQL
Date	January 3, 2014, 15:21:31
Msg-id	52C59858.9090500@garret.ru
List	pgsql-announce


I would like to announce an implementation of an In-Memory Columnar
Store (IMCS) extension for PostgreSQL.
A vertical representation of the data is stored in PostgreSQL shared
memory. Various basic and sophisticated analytic operators are provided
for manipulating time series.

       GitHub repository: https://github.com/knizhnik/imcs/
       Documentation: http://www.garret.ru/imcs/user_guide.html
       Sources: http://www.garret.ru/imcs-1.02.tar.gz

A columnar store manager stores data tables as sections of columns
rather than as rows of data.
Most traditional DBMSs store data in rows ("horizontally"): all
attributes of a record are stored together.
This approach makes it possible to load a whole record with one read
operation, which usually gives better performance for OLTP queries
(which access or update single records). OLAP queries, however, mostly
operate on individual columns, for example calculating the sum or
average of a column. In this case a vertical data representation, in
which the data for each column is stored independently, is more
efficient. There are several DBMSs on the market that are based on the
vertical model: Vertica, SciDB, ... Most mainstream commercial
databases also provide OLAP extensions based on vertical storage:
BLU Acceleration for DB2, Oracle Database In-Memory Option, Microsoft
SQL Server column store, ...
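
The difference between the two layouts can be illustrated with a small
Python sketch. It is only an illustration of the general idea, not the
IMCS API: the sample records, the column names (symbol_id, price,
volume) and both helper functions are hypothetical.

```python
from array import array

# Hypothetical sample data: (symbol_id, price, volume) records.
rows = [
    (1, 10.5, 100),
    (2, 20.0, 250),
    (1, 11.0, 300),
]

def sum_volume_rows(rows):
    """Horizontal layout: every whole record is visited,
    even though only one attribute is needed."""
    return sum(record[2] for record in rows)

# Vertical layout: one contiguous typed array per attribute.
columns = {
    "symbol_id": array("i", (r[0] for r in rows)),
    "price": array("d", (r[1] for r in rows)),
    "volume": array("q", (r[2] for r in rows)),
}

def sum_volume_columns(columns):
    """Vertical layout: only the 'volume' array is read;
    the other columns are never touched."""
    return sum(columns["volume"])

print(sum_volume_rows(rows), sum_volume_columns(columns))
```

Both functions return the same total; the point is that the columnar
version reads a single dense array instead of walking over every
attribute of every record.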

A columnar store, or vertical representation of data, achieves better
performance than the classical horizontal representation because of
three factors:
* Reduced size of fetched data: only the columns involved in the query
are accessed.
* Vector operations. Applying an operator to a set of values (a tile)
minimizes interpretation cost.
SIMD instructions of modern processors also accelerate the execution of
vector operations.
* Compression of data. Compression can of course also be applied to
whole records, but compressing each column independently can give much
better results without significant extra CPU overhead. For example,
even a simple compression algorithm such as RLE (run-length encoding)
not only reduces the space used but also reduces the number of
operations performed.
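
As a rough illustration of the last point, the following Python sketch
shows how an aggregate can be computed directly on run-length-encoded
data. It is a minimal example of the RLE technique in general, not the
way IMCS implements compression; the function names and the sample
column are hypothetical.

```python
from itertools import groupby

def rle_encode(values):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(values)]

def rle_sum(runs):
    """Sum the original column with one multiplication per run
    instead of one addition per element."""
    return sum(value * length for value, length in runs)

column = [5, 5, 5, 5, 7, 7, 3, 3, 3]
runs = rle_encode(column)

# Nine stored values shrink to three runs, and the sum needs
# three multiply-add steps rather than nine additions.
print(runs)
print(rle_sum(runs))
```

The benefit depends on the data: a column with long runs of repeated
values compresses well, while a column of mostly distinct values gains
little from RLE.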
