Re: prefix btree implementation

Поиск
Список
Период
Сортировка
От Bruce Momjian
Тема Re: prefix btree implementation
Дата
Msg-id 200510071404.j97E4Fu20249@candle.pha.pa.us
обсуждение исходный текст
Ответ на Re: prefix btree implementation  (Simon Riggs <simon@2ndquadrant.com>)
Список pgsql-hackers
OK, TODO updated:

* Consider compressing indexes by storing key values duplicated in several rows as a single index entry
 This is difficult because it requires datatype-specific knowledge.


---------------------------------------------------------------------------

Simon Riggs wrote:
> On Thu, 2005-10-06 at 22:43 -0400, Bruce Momjian wrote:
> > Jim C. Nasby wrote:
> > > On Wed, Oct 05, 2005 at 03:40:43PM -0700, Qingqing Zhou wrote:
> > > > We do the prefix sharing when we build up index only, never on the fly.
> > > 
> > > So are you saying that inserts of new data wouldn't make any use of
> > > this? ISTM that greatly reduces the usefulness, though I'm not objecting
> > > because compression during build is probably better than none at all. Is
> > > there a technical reason compression can't be used during normal
> > > operations?
> > 
> > Added to TODO:
> > 
> > * Consider compressing indexes by storing key prefix values shared by
> >   several rows as a single index entry
> 
> Just to re-iterate Tom's point. There isn't any easy way of doing this
> in an object-relational database where we can make almost no assumptions
> about particular datatypes. There is no definition of datatype prefix...
> The best you could do would be to introduce a new API that allows a
> datatype to provide a shorter prefix value when given a starting data
> value, but then you'd need to write a whole set of prefixing functions.
> But that is almost identical to the idea of functional indexes anyhow,
> so I see no value in providing a second way of doing this when the
> existing way can be more easily modified to do this. 
> 
> I do not think key prefixing should be on the TODO for PostgreSQL,
> unless we also agree that datatype independence should not be maintained
> in all cases and can be relaxed for built-in datatypes.
> 
> I suggest we reword that to "Investigate techniques for compressing
> indexes that will work successfully with datatype independence".
> 
> What we might consider is having the index store a chain of pointers as
> an array on the index row, rather than having each row map to one index
> row. That technique would considerably reduce index volume for non-
> unique indexes by removing lots of index tuple overhead. But you would
> need to keep at most N pointers on a row. This technique is used by
> Teradata's Non Unique Secondary Index design.
> 
> Best Regards, Simon Riggs
> 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


В списке pgsql-hackers по дате отправления:

Предыдущее
От: mark@mark.mielke.cc
Дата:
Сообщение: Re: Vote needed: revert beta2 changes or not?
Следующее
От: Emil Briggs
Дата:
Сообщение: Re: Some spinlock patch tests