Re: Does max size of varchar influence index size

Список
Период
Сортировка
От Mark Roberts
Тема Re: Does max size of varchar influence index size
Дата
Msg-id 1214857494.6049.76.camel@localhost
обсуждение исходный текст
Ответ на Does max size of varchar influence index size  (Franck Routier)
Ответы Re: Does max size of varchar influence index size  (Franck Routier)
Список pgsql-performance
Дерево обсуждения
Does max size of varchar influence index size  (Franck Routier, )
 Re: Does max size of varchar influence index size  (Mark Roberts, )
  Re: Does max size of varchar influence index size  (Franck Routier, )
   Re: Does max size of varchar influence index size  (Richard Huxton, )
On Mon, 2008-06-30 at 18:57 +0200, Franck Routier wrote:
> Hi,
>
> I have problems with my database becoming huge in size (around 150 GB
> right now, and 2/3 for only three tables, each having around 30 millions
> tuples. Space is spent mainly on indices.).
>
> I have a lot of multi-column varchar primary keys (natural keys), and
> lot of foreign keys on these tables and thus a lot of indices.
>
> When using VARCHAR, we defaulted to VARCHAR(32) (because on _some_ of
> the identifiers, we have to apply md5).
>
> We assumed that using VARCHAR(32) but having values at most 4 characters
> long (for example) wouldn't influence indices size, ie it would be the
> same as using VARCHAR(4) to keep the example.
>
> Now I really doubt if we were right :)
>
> So, what should we expect ? And are there other factors influencing
> indices size ?
>
> Thanks,
> Franck

Is there any particular reason that you're not using a surrogate key?  I
found that switching from natural to surrogate keys in a similar
situation made the indexes not only smaller, but faster.

It really only became an issue after our individual tables got larger
than 20-25G, but I think we got lucky and headed the issue off at the
pass.

I think it should be fairly trivial* to set up a test case using
pg_total_relation_size() to determine whether your suspicions are
correct.

-Mark

* It may not be as trivial as I say, or I'd have done it in the 5
minutes it took to write this email.


В списке pgsql-performance по дате отправления:

Предыдущее
От: John Beaver
Дата:
Сообщение: Re: sequence scan problem
Следующее
От: "Emiliano Leporati"
Дата:
Сообщение: un-understood index performance behaviour