Re: [HACKERS] [WIP]Vertical Clustered Index (columnar store extension)

From: Jim Nasby
Subject: Re: [HACKERS] [WIP]Vertical Clustered Index (columnar store extension)
Msg-id fa4e46a1-d6ee-723d-c3ca-c381bb7d91e9@BlueTreble.com
In response to: [HACKERS] [WIP]Vertical Clustered Index (columnar store extension)  (Haribabu Kommi <kommi.haribabu@gmail.com>)
Responses: Re: [HACKERS] [WIP]Vertical Clustered Index (columnar store extension)  (Haribabu Kommi <kommi.haribabu@gmail.com>)
List: pgsql-hackers
On 12/29/16 9:55 PM, Haribabu Kommi wrote:
> The tuples which don't have multiple copies or frozen data will be moved
> from WOS to ROS periodically by the background worker process or autovacuum
> process. Each column's data is stored separately in its own relation file.
> No transaction information is present in ROS. The data in ROS can be
> referred to by tuple ID.

Would updates be handled via the delete mechanism you described then?

> In this approach, the column data is present in both heap and columnar
> storage.

ISTM one of the biggest reasons to prefer a column store over heap is to 
ditch the 24-byte per-tuple header overhead, so I'm not sure how much of 
a win this is.
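
To put rough numbers on that overhead, here is a back-of-the-envelope sketch in Python. It is only an estimate under stated assumptions that are not from this thread: 8-byte alignment, a 23-byte heap tuple header padded to 24 bytes, a 4-byte line pointer per tuple, no NULL bitmap, fixed-width columns only, and page headers and fill factor ignored. The function names are made up for illustration.

```python
# Rough per-row storage estimate for a PostgreSQL heap table.
# Assumptions (illustrative, not from the thread): 8-byte alignment,
# no NULL bitmap, fixed-width data, page header and fill factor ignored.
TUPLE_HEADER_BYTES = 24  # 23-byte heap tuple header padded to 8 bytes
LINE_POINTER_BYTES = 4   # per-tuple item pointer stored in the page


def align8(n):
    """Round n up to the next multiple of 8."""
    return (n + 7) // 8 * 8


def heap_bytes_per_row(data_bytes):
    """Approximate bytes one row occupies in a heap page."""
    return LINE_POINTER_BYTES + TUPLE_HEADER_BYTES + align8(data_bytes)


def overhead_fraction(data_bytes):
    """Share of the row's footprint that is not user data."""
    total = heap_bytes_per_row(data_bytes)
    return (total - data_bytes) / total


if __name__ == "__main__":
    for width in (4, 16, 64, 256):
        print(width, heap_bytes_per_row(width),
              round(overhead_fraction(width), 2))
```

Under these assumptions the fixed cost dominates for narrow rows and fades for wide ones, which is why keeping the heap copy alongside the columnar copy matters most for narrow tables.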

Another complication is that one of the big advantages of a CSTORE is 
allowing analysis to be done efficiently on a column-by-column (as 
opposed to row-by-row) basis. Does your patch by chance provide that?

Generally speaking, I do think the idea of adding support for this as an 
"index" is a really good starting point, since that part of the system 
is pluggable. It might be better to target getting only what needs to be 
in core into core to begin with, allowing the other code to remain an 
extension for now. I think there's a lot of things that will be 
discovered as we start moving into column stores, and it'd be very 
unfortunate to accidentally paint the core code into a corner somewhere.

As a side note, it's possible to get a lot of the benefits of a column 
store by using arrays. I've done some experiments with that and got an 
80-90% space reduction, and most queries saw improved performance as 
well (there were a few cases that weren't better). The biggest advantage 
to this approach is people could start using it today, on any recent 
version of Postgres. That would be a great way to gain knowledge on what 
users would want to see in a column store, something else I suspect we 
need. It would also be far less code than what you or Alvaro are 
proposing. When it comes to large changes that don't have crystal-clear 
requirements, I think that's really important.
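
As a rough illustration of the array idea, the sketch below models it in plain Python rather than in the database: consecutive rows are grouped into chunks, and each chunk holds one list per column, so a scan that needs a single column touches only that column's lists. The helper names (`pack_rows`, `column_values`), the chunk size, and the sample data are all hypothetical; this is not the layout used in the experiments mentioned above.

```python
from statistics import mean


def pack_rows(rows, columns, chunk_size):
    """Group consecutive rows into chunks; each chunk keeps one list per column."""
    chunks = []
    for start in range(0, len(rows), chunk_size):
        block = rows[start:start + chunk_size]
        chunks.append({col: [row[col] for row in block] for col in columns})
    return chunks


def column_values(chunks, col):
    """Yield one column's values in order, reading only that column's lists."""
    for chunk in chunks:
        yield from chunk[col]


# Hypothetical sample data: ten rows with two columns.
rows = [{"ts": i, "reading": i * 0.5} for i in range(10)]
chunks = pack_rows(rows, ["ts", "reading"], chunk_size=4)

if __name__ == "__main__":
    print(len(chunks))
    print(mean(column_values(chunks, "reading")))
```

In PostgreSQL terms the analogous layout would be one row per chunk with an array-typed column per original column, built with something like array_agg() and expanded again with unnest(), so the per-tuple header is paid once per chunk instead of once per original row.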
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)


