Re: alternative back-end block formats

From: Christian Convey
Subject: Re: alternative back-end block formats
Date:
Msg-id CAPfS4ZzwxnQuYjEBnmd0eiYW3t85o4YOvGXfqK=AcNOgKc77rQ@mail.gmail.com
In response to: Re: alternative back-end block formats  (Craig Ringer <craig@2ndquadrant.com>)
Responses: Re: alternative back-end block formats  (Cédric Villemain <cedric@2ndquadrant.com>)
List: pgsql-hackers
Hi Craig,

On Sun, Jan 26, 2014 at 5:47 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
On 01/21/2014 07:43 PM, Christian Convey wrote:
> Hi all,
>
> I'm playing around with Postgres, and I thought it might be fun to
> experiment with alternative formats for relation blocks, to see if I can
> get smaller files and/or faster server performance.

It's not clear how you'd do this without massively rewriting the guts of Pg.

Per the docs on internal structure, Pg has a block header, then tuples
within the blocks, each with a tuple header and list of Datum values for
the tuple. Each Datum has a generic Datum header (handling varlena vs
fixed length values etc) then a type-specific on-disk representation
controlled by the type output function for that type.

I'm still getting familiar with the pg backend code, so I don't have a concrete plan yet. However, I'm working on the assumption that some set of macros and functions encapsulates the page layout.
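
To make that assumption concrete, here is a toy sketch of a slotted page hidden behind a few accessor functions. It is not PostgreSQL's actual layout or API: every name in it (ToyPage, ToyPageHeader, ToyItemId, toy_page_add_item, and so on) is hypothetical, and the real page header and line pointers carry more fields than this. The only point is that callers touch the page through the accessors, so the byte layout could change underneath them.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_SIZE 8192

/* Hypothetical, heavily simplified page header (assumption, not pg's). */
typedef struct ToyPageHeader {
    uint16_t lower;   /* offset just past the last item pointer */
    uint16_t upper;   /* offset of the first byte of item data  */
} ToyPageHeader;

/* Hypothetical item pointer: where an item lives and how long it is. */
typedef struct ToyItemId {
    uint16_t off;
    uint16_t len;
} ToyItemId;

typedef struct ToyPage {
    uint8_t bytes[TOY_PAGE_SIZE];
} ToyPage;

/* The header is copied in and out with memcpy to avoid aliasing issues. */
static ToyPageHeader toy_get_header(const ToyPage *p) {
    ToyPageHeader h;
    memcpy(&h, p->bytes, sizeof h);
    return h;
}

static void toy_set_header(ToyPage *p, ToyPageHeader h) {
    memcpy(p->bytes, &h, sizeof h);
}

static void toy_page_init(ToyPage *p) {
    ToyPageHeader h = { (uint16_t) sizeof(ToyPageHeader), TOY_PAGE_SIZE };
    memset(p->bytes, 0, sizeof p->bytes);
    toy_set_header(p, h);
}

static int toy_item_count(const ToyPage *p) {
    ToyPageHeader h = toy_get_header(p);
    return (int) ((h.lower - sizeof(ToyPageHeader)) / sizeof(ToyItemId));
}

/* Item pointers grow up from the header, item data grows down from the
 * end of the page. Returns the new slot number, or -1 if it won't fit. */
static int toy_page_add_item(ToyPage *p, const void *data, uint16_t len) {
    ToyPageHeader h = toy_get_header(p);
    size_t free_space = (size_t) (h.upper - h.lower);
    ToyItemId id;
    int slot;

    if (free_space < (size_t) len + sizeof(ToyItemId))
        return -1;

    slot = toy_item_count(p);
    h.upper = (uint16_t) (h.upper - len);
    memcpy(p->bytes + h.upper, data, len);

    id.off = h.upper;
    id.len = len;
    memcpy(p->bytes + h.lower, &id, sizeof id);
    h.lower = (uint16_t) (h.lower + sizeof id);

    toy_set_header(p, h);
    return slot;
}

/* Returns a pointer to the item's bytes and stores its length in *len. */
static const uint8_t *toy_page_get_item(const ToyPage *p, int slot,
                                        uint16_t *len) {
    ToyItemId id;

    assert(slot >= 0 && slot < toy_item_count(p));
    memcpy(&id, p->bytes + sizeof(ToyPageHeader) + (size_t) slot * sizeof id,
           sizeof id);
    *len = id.len;
    return p->bytes + id.off;
}
```

If the backend's page access really is funnelled through a boundary like this, an alternative format would mostly mean alternative implementations of these few functions. Whether that holds in practice is exactly what I still need to check in the code.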

If/when I tackle this, I expect to add a layer of indirection somewhere around that boundary, so that some non-catalog tables, whose schemas meet certain simplifying assumptions, are read and modified using specialized code.
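
As a rough illustration of the kind of indirection I mean, the sketch below picks a row format per table through a small table of function pointers. Everything here is made up for the example: ToyRow, ToyFormatOps, ToyTableInfo, and the rule in toy_choose_format are assumptions of mine, not anything that exists in the pg source, and the two "formats" are deliberately trivial.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical row with two fixed-width columns. */
typedef struct ToyRow {
    int32_t a;
    int32_t b;
} ToyRow;

/* The layer of indirection: one set of operations per on-page format. */
typedef struct ToyFormatOps {
    const char *name;
    size_t (*encode)(const ToyRow *row, uint8_t *out); /* bytes written */
    size_t (*decode)(const uint8_t *in, ToyRow *row);  /* bytes read    */
} ToyFormatOps;

/* "Generic" format: a 2-byte length prefix, then the row bytes. */
static size_t generic_encode(const ToyRow *row, uint8_t *out) {
    uint16_t len = (uint16_t) sizeof *row;
    memcpy(out, &len, sizeof len);
    memcpy(out + sizeof len, row, sizeof *row);
    return sizeof len + sizeof *row;
}

static size_t generic_decode(const uint8_t *in, ToyRow *row) {
    memcpy(row, in + sizeof(uint16_t), sizeof *row);
    return sizeof(uint16_t) + sizeof *row;
}

/* "Packed" format: no per-row prefix, usable only when width is fixed. */
static size_t packed_encode(const ToyRow *row, uint8_t *out) {
    memcpy(out, row, sizeof *row);
    return sizeof *row;
}

static size_t packed_decode(const uint8_t *in, ToyRow *row) {
    memcpy(row, in, sizeof *row);
    return sizeof *row;
}

static const ToyFormatOps toy_generic_format = {
    "generic", generic_encode, generic_decode
};
static const ToyFormatOps toy_packed_format = {
    "packed", packed_encode, packed_decode
};

/* Hypothetical per-table facts that drive the choice of format. */
typedef struct ToyTableInfo {
    int is_catalog;
    int all_fixed_width;
} ToyTableInfo;

/* Assumed policy: only non-catalog tables whose schema meets the
 * simplifying assumption (all columns fixed-width) get the special path. */
static const ToyFormatOps *toy_choose_format(const ToyTableInfo *t) {
    assert(t != NULL);
    if (!t->is_catalog && t->all_fixed_width)
        return &toy_packed_format;
    return &toy_generic_format;
}
```

Callers would ask toy_choose_format once per table and then go through the returned pointers, so catalog tables and anything that doesn't fit the assumptions keep using the generic path unchanged.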
 
I'd rather not get into the specific optimizations I'd like to try just yet: I haven't fully studied the code, and I don't want to put my foot in my mouth.

What concrete problem do you mean to tackle? What idea do you want to
explore or implement?

My real motivation is that I'd like to get more familiar with the pg backend codebase, and tilting at this windmill seemed like an interesting way to accomplish that.

If I were focused on solving a real-world problem, I'd say that this lays the groundwork for table-schema-specific storage optimizations and optimized record-filtering code. But I'd only make that argument if I planned to (a) perform a careful study with statistically significant benchmarks, and/or (b) produce a merge-worthy patch. At this point I have no intention of doing either. My main goal really is just to have fun with the code.


> Does anyone know if this has been done before with Postgres?  I would
> have assumed yes, but I'm not finding anything in Google about people
> having done this.

AFAIK (and I don't know much in this area) the storage manager isn't
very pluggable compared to the rest of Pg.

Thanks for the warning.  Duly noted.

Kind regards,
Christian
