IMCS: In Memory Columnar Store for PostgreSQL

From: knizhnik
Subject: IMCS: In Memory Columnar Store for PostgreSQL
Msg-id: 52C59858.9090500@garret.ru
List: pgsql-announce

I would like to announce an implementation of an In-Memory Columnar
Store extension for PostgreSQL.
The vertical representation of the data is stored in PostgreSQL shared
memory. Various basic and sophisticated analytic operators are provided
for manipulating time series.

       GitHub repository: https://github.com/knizhnik/imcs/
       Documentation: http://www.garret.ru/imcs/user_guide.html
       Sources: http://www.garret.ru/imcs-1.02.tar.gz

A columnar store manager stores tables as sections of columns of data
rather than as rows of data.
Most traditional DBMSes store data in rows ("horizontally"): all
attributes of a record are stored together.
This approach makes it possible to load a whole record with a single
read operation, which usually leads to better performance for OLTP
queries (which access or update single records). OLAP queries, however,
mostly operate on individual columns, for example calculating the sum
or average of a column. In this case a vertical data representation,
where the data for each column is stored independently, is more
efficient. There are several DBMSes on the market that are based on the
vertical model: Vertica, SciDB, ... Most mainstream commercial
databases also provide OLAP extensions based on vertical storage:
BLU Acceleration for DB2, Oracle Database In-Memory Option, Microsoft
SQL Server column store, ...
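
To make the difference between the two layouts concrete, here is a
minimal sketch in Python. It is not IMCS code and does not use the IMCS
API; the table, the attribute names and the to_columns() helper are all
hypothetical and chosen only for illustration.

```python
# Hypothetical three-attribute table of quotes (illustrative data only).
rows = [
    {"symbol": "ABB", "day": 1, "close": 10.0},
    {"symbol": "ABB", "day": 2, "close": 12.0},
    {"symbol": "ABB", "day": 3, "close": 11.0},
]


def to_columns(rows):
    """Turn a list of row dicts into a dict of per-attribute lists."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# Horizontal layout: every whole record is visited to read one attribute.
avg_from_rows = sum(row["close"] for row in rows) / len(rows)

# Vertical layout: only the "close" column is touched.
avg_from_columns = sum(columns["close"]) / len(columns["close"])
```

Both expressions compute the same average; the point of the vertical
layout is that the aggregate reads one contiguous column and never
looks at "symbol" or "day".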

A columnar store, or vertical representation of data, achieves better
performance than the classical horizontal representation thanks to
three factors:
* Reduced size of fetched data: only the columns involved in a query
are accessed.
* Vector operations. Applying an operator to a set of values (a tile)
makes it possible to minimize interpretation cost.
In addition, the SIMD instructions of modern processors accelerate the
execution of vector operations.
* Compression of data. Compression can certainly also be applied to
whole records, but compressing each column independently can give much
better results without significant extra CPU overhead. For example, a
compression algorithm as simple as RLE (run-length encoding) not only
reduces the space used but also minimizes the number of operations
performed.
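
The last two factors can be sketched in a few lines of Python. This is
only an illustration under stated assumptions, not the way IMCS
implements them: the function names, the tile size of 4 and the sample
column are all hypothetical.

```python
from itertools import groupby


def tiles(values, tile_size):
    """Yield consecutive slices (tiles) of at most tile_size values."""
    for start in range(0, len(values), tile_size):
        yield values[start:start + tile_size]


def tiled_sum(values, tile_size=4):
    """Apply the operator once per tile instead of once per value."""
    return sum(sum(tile) for tile in tiles(values, tile_size))


def rle_encode(values):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, sum(1 for _ in group)) for value, group in groupby(values)]


def rle_sum(runs):
    """Sum the original column with one multiply-add per run."""
    return sum(value * length for value, length in runs)


# Hypothetical column with long runs of repeated values.
column = [5, 5, 5, 5, 7, 7, 3, 3, 3]
runs = rle_encode(column)
```

Here the nine values are stored as three runs, and rle_sum() performs
three multiply-adds instead of nine additions, which is the sense in
which RLE reduces the number of operations as well as the space used.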


