Re: B-Tree support function number 3 (strxfrm() optimization)

Поиск
Список
Период
Сортировка
От Claudio Freire
Тема Re: B-Tree support function number 3 (strxfrm() optimization)
Дата
Msg-id CAGTBQpY4Qbunj+kYc5hRin3jWP4uovmTbgcKW-VY0LKxH9Ggxg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: B-Tree support function number 3 (strxfrm() optimization)  (Peter Geoghegan <pg@heroku.com>)
Ответы Re: B-Tree support function number 3 (strxfrm() optimization)  (Peter Geoghegan <pg@heroku.com>)
Список pgsql-hackers
On Mon, Jul 14, 2014 at 2:53 PM, Peter Geoghegan <pg@heroku.com> wrote:
> My concern is that it won't be worth it to do the extra work,
> particularly given that I already have 8 bytes to work with. Supposing
> I only had 4 bytes to work with (as researchers writing [2] may have
> only had in 1994), that would leave me with a relatively small number
> of distinct normalized keys in many representative cases. For example,
> I'd have a mere 40,665 distinct normalized keys in the case of my
> "cities" database, rather than 243,782 (out of a set of 317,102 rows)
> for 8 bytes of storage. But if I double that to 16 bytes (which might
> be taken as a proxy for what a good compression scheme could get me),
> I only get a modest improvement - 273,795 distinct keys. To be fair,
> that's in no small part because there are only 275,330 distinct city
> names overall (and so most dups get away with a cheap memcmp() on
> their tie-breaker), but this is a reasonably organic, representative
> dataset.


Are those numbers measured on MAC's strxfrm?

That was the one with suboptimal entropy on the first 8 bytes.



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Stefan Kaltenbrunner
Дата:
Сообщение: Re: SSL information view
Следующее
От: Robert Haas
Дата:
Сообщение: Re: pg_shmem_allocations view