Re: Questions about indexes?

Поиск
Список
Период
Сортировка
От Curt Sampson
Тема Re: Questions about indexes?
Дата
Msg-id Pine.NEB.4.51.0302180021120.997@angelic-vtfw.cvpn.cynic.net
обсуждение исходный текст
Ответ на Questions about indexes?  (Ryan Bradetich <rbradetich@uswest.net>)
Ответы Re: Questions about indexes?  (Kevin Brown <kevin@sysexperts.com>)
Список pgsql-hackers
On Mon, 16 Feb 2003, Ryan Bradetich wrote:

> I am not sure why all the data is duplicated in the index ...

Well, you have to have the full key in the index, or how would you know,
when you look at a particular index item, if it actually matches what
you're searching for?

MS SQL server does have an interesting option that would help you a lot
in this case: clustered indexes. A table may have a single clustered
index, and each leaf node of the index stores not just the key but
actually the entire row. Thus, in a case like yours, you'd store the row
only once, not twice.

Without thinking too hard about it (my usual mode of operation on this
list :-)) this could probably be implemented in postgresql. But I don't
think it would be entirely trivial, and your case is unusual enough
that I very much doubt whether it would be worth implementing to fix
that alone. It would also offer the advantage that any lookup using the
clustered index would save fetching the heap page after that as well,
but it's hard to say if the savings would be worth the work.

> Since my only requirement is that the rows be unique, I have developed a
> custom MD5 function in C, and created an index on the MD5 hash of the
> concatanation of all the fields.

Well, that won't guarantee uniqueness, since it's perfectly possible
to have two different rows hash to the same value. (If that weren't
possible, your hash would have to contain as much information as the row
itself, and your space savings wouldn't be nearly so dramatic.)

cjs
-- 
Curt Sampson  <cjs@cynic.net>   +81 90 7737 2974   http://www.netbsd.org   Don't you know, in this new Dark Age, we're
alllight.  --XTC
 


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Christoph Haller
Дата:
Сообщение: Re: IpcSemaphoreKill: ...) failed: Invalid argument
Следующее
От: Tom Lane
Дата:
Сообщение: Re: client_encoding directive is ignored in