Re: prefix btree implementation
От | Bruce Momjian |
---|---|
Тема | Re: prefix btree implementation |
Дата | |
Msg-id | 200510071404.j97E4Fu20249@candle.pha.pa.us обсуждение исходный текст |
Ответ на | Re: prefix btree implementation (Simon Riggs <simon@2ndquadrant.com>) |
Список | pgsql-hackers |
OK, TODO updated: * Consider compressing indexes by storing key values duplicated in several rows as a single index entry This is difficult because it requires datatype-specific knowledge. --------------------------------------------------------------------------- Simon Riggs wrote: > On Thu, 2005-10-06 at 22:43 -0400, Bruce Momjian wrote: > > Jim C. Nasby wrote: > > > On Wed, Oct 05, 2005 at 03:40:43PM -0700, Qingqing Zhou wrote: > > > > We do the prefix sharing when we build up index only, never on the fly. > > > > > > So are you saying that inserts of new data wouldn't make any use of > > > this? ISTM that greatly reduces the usefulness, though I'm not objecting > > > because compression during build is probably better than none at all. Is > > > there a technical reason compression can't be used during normal > > > operations? > > > > Added to TODO: > > > > * Consider compressing indexes by storing key prefix values shared by > > several rows as a single index entry > > Just to re-iterate Tom's point. There isn't any easy way of doing this > in an object-relational database where we can make almost no assumptions > about particular datatypes. There is no definition of datatype prefix... > The best you could do would be to introduce a new API that allows a > datatype to provide a shorter prefix value when given a starting data > value, but then you'd need to write a whole set of prefixing functions. > But that is almost identical to the idea of functional indexes anyhow, > so I see no value in providing a second way of doing this when the > existing way can be more easily modified to do this. > > I do not think key prefixing should be on the TODO for PostgreSQL, > unless we also agree that datatype independence should not be maintained > in all cases and can be relaxed for built-in datatypes. > > I suggest we reword that to "Investigate techniques for compressing > indexes that will work successfully with datatype independence". > > What we might consider is having the index store a chain of pointers as > an array on the index row, rather than having each row map to one index > row. That technique would considerably reduce index volume for non- > unique indexes by removing lots of index tuple overhead. But you would > need to keep at most N pointers on a row. This technique is used by > Teradata's Non Unique Secondary Index design. > > Best Regards, Simon Riggs > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
В списке pgsql-hackers по дате отправления: