A Silly Idea for Vertically-Oriented Databases

Поиск
Список
Период
Сортировка
От Avery Payne
Тема A Silly Idea for Vertically-Oriented Databases
Дата
Msg-id 46E1B9F4.30707@pcfruit.com
обсуждение исходный текст
Ответы Re: A Silly Idea for Vertically-Oriented Databases  (Josh Berkus <josh@agliodbs.com>)
Re: A Silly Idea for Vertically-Oriented Databases  (Simon Riggs <simon@2ndquadrant.com>)
Список pgsql-hackers
Be forewarned - this is probably a very long post, and I'm just a mere 
mortal (ie. admin) who doesn't write copious amounts of C code.  Take 
the following posts and suggestions with a grain of salt.

So I've been seeing/hearing all of the hoopla over vertical databases 
(column stores), and how they'll not only slice bread but also make 
toast, etc.  I've done some quick searches for past articles on 
"C-Store", "Vertica", "Column Store", and "Vertical Database", and have 
seen little discussion on this.  And then a funny thought occurs to me - 
when I look at the directory structure and file layout of a PostgreSQL 
database, I see that each OID corresponds to a table, which corresponds 
to (generally) a single file. Then I have a second funny thought - what 
if there was a low-friction, low-cost-of-implementation way to bring 
similar advantages to PostgreSQL without major alterations, recoding, 
etc?  Finally it occurs to me that PostgreSQL already does something 
similar but it could do it so much better, with only one language change 
and minor changes to the storage layout.  So here's my plum-crazy 
proposal (and I've made some before - see 
http://archives.postgresql.org/pgsql-odbc/2006-10/msg00040.php - and 
they not only made it into production, but they are in active use by me 
on a weekly basis - Thanks Hiroshi!!!), bear with me...

Make one small, very tiny syntactic change to "CREATE TABLE" that 
includes a new keyword, "COLUMN-STORE" or something similar.  I don't 
care where it appears as long as it's after the "CREATE TABLE".  You 
would not have to break any existing SQL conventions, PostgreSQL would 
continue to be SQL compliant, and given the odd wording, I highly doubt 
that the folks who work on SQL keywords will end up using it at any 
point in time.  If adding COLUMN-STORE is objectionable because it will 
"cloud the compliance of the language" then simply move the 
functionality into the table space functionality.  In hindsight, it 
might even belong there instead.  So, instead of specifying it by 
table,  create a table space that has an attribute "Column Storage" set 
as active.  When inactive, it uses the traditional "one-file-per-table" 
layout.

Make each table column capable of receiving an OID.  This will be 
critical for the following steps...

If a table is created with "COLUMN-STORE" as an option, then it will 
continue to behave in the same way it always has, but the storage will 
be different.  Each column in the table will be represented by a single 
file, with the file name being (naturally) the OID.  
INSERT/UPDATE/DELETE would function as it always has, etc. Nothing would 
change.  Except how the data is stored.  The existing TOAST mechanisms 
continue to work - because the engine would treat each file as a 
single-column table! 

One additional "column" would be added to the store, an invisible one 
that not only tracks the OID for the "rows" in this type of setup, but 
also the state of the row.  Let's call this the "Control Column".  Given 
that the metadata size for the row would be fixed/constant, we won't 
have to worry about what is in the other "columns" and "rows", they can 
be any size.  BTW, the "Control Column" would be just another column 
from the storage engine's point of view.  It just happens to be one that 
no-one can see, other than the database (and maybe the administrator).

When you go to VACUUM a table, you would treat each column as a 
single-row table, so if a row is a candidate for a VACUUM reclamation, 
then it will adjust each "column" an equal amount.  Under no 
circumstances would you have columns "out of sync", so if a record goes, 
it means each adjacent column goes with it.  This sounds disk-intensive 
at first, until you realize that the admin will have made a contentious 
decision to use this format, and understands the advantages/drawbacks to 
this method.  So it takes a little longer to VACUUM, I don't care, 
because as an admin I will have specified this layout for a reason - to 
do OLAP, not OLTP.  Which means, I rarely VACUUM it.  Add to this the 
high efficiency you would gain by packing more records into buffers per 
read, and most of the losses you take in re-reading data would really 
not amount to as big a loss as you might think.

DELETE would simply mark a row off as deleted in the "Control Column".  
If the storage engine needed to reclaim a row, it would not have to look 
any further than the "control column" to find an empty spot where it 
could overwrite data.

INSERT/UPDATE continue to work as they always have.  The storage engine 
would perceive each "column" as a single-column table, meaning that the 
existing TOAST mechanisms continue to work!   Nothing needs to change 
there.  The real change would be that the table's columns would be 
"split up" into individual updates, and the "Control Column" would be 
used to keep all of the records in sync.

Why bother with this?  Because, when you are said and done, you will 
find yourself with a rough equivelent of a column-store database, with 
all of the OLAP goodness that people are looking for.  You have little 
if any impact on the admin/users perception, other than a flag was 
checked somewhere and forgotten about in the database.  From the storage 
engine's perspective, you have many many many small 1-column tables to 
take care of, and they all update at the same "place" at the same "time" 
to keep the records in sync when you recompose a row.  TOAST and large 
object storage works the same as before, nothing changes, and that's as 
it should be.

All with what would be (hopefully) a minor change to the storage 
backend.  We're not talking about brain surgery on existing, tested 
code, but rather, a new feature that uses existing features in-place.  
As TOAST improves so does this feature.  As caching improves, so does 
the feature again.  And so on.

I've been in a bit of a hurry to blurt all of this out, and I'm sure 
that I've forgotten something along the way, so if you find something 
missing, please be patient- I had to write all of this in about 20 
minutes or less and I didn't have alot of time.


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Decibel!
Дата:
Сообщение: Re: Oddity with psql \d and pg_table_is_visible
Следующее
От: Josh Berkus
Дата:
Сообщение: Re: A Silly Idea for Vertically-Oriented Databases