Re: 10+hrs vs 15min because of just one index

Поиск
Список
Период
Сортировка
От Aaron Turner
Тема Re: 10+hrs vs 15min because of just one index
Дата
Msg-id 1ca1c1410602121133y279ea488nc38f049ac6ebc7f4@mail.gmail.com
обсуждение исходный текст
Ответ на Re: 10+hrs vs 15min because of just one index  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: 10+hrs vs 15min because of just one index  (PFC <lists@peufeu.com>)
Список pgsql-performance
On 2/12/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Aaron Turner <synfinatic@gmail.com> writes:
> > Well before I go about re-architecting things, it would be good to
> > have a strong understanding of just what is going on.  Obviously, the
> > unique index on the char(48) is the killer.  What I don't know is:
>
> You have another unique index on the integer primary key, so it's not
> the mere fact of a unique index that's hurting you.

Understood.  I just wasn't sure if in general unique indexes are some
how more expensive then non-unique indexes.

> > 1) Is this because the column is so long?
>
> Possibly.  Allowing for 12 bytes index-entry overhead, the char keys
> would be 60 bytes vs 16 for the integer column, so this index is
> physically almost 4x larger than the other.  You might say "but that
> should only cause 4x more I/O" but it's not necessarily so.  What's
> hard to tell is whether you are running out of RAM disk cache space,
> resulting in re-reads of pages that could have stayed in memory when
> dealing with one-fifth as much index data.  You did not show us the
> iostat numbers for the two cases, but it'd be interesting to look at
> the proportion of writes to reads on the data drive in both cases.

Sounds a lot like what Marc mentioned.

> > 2) Is this because PG is not optimized for char(48) (maybe it wants
> > powers of 2? or doesn't like even numbers... I don't know, just
> > throwing it out there)
>
> Are the key values really all 48 chars long?  If not, you made a
> bad datatype choice: varchar(n) (or even text) would be a lot
> smarter.  char(n) wastes space on blank-padding.

Yep, everything exactly 48.   Looks like I'll be storing it as a bytea
in the near future though.

> The only one of these effects that looks to me like it could result in
> worse-than-linear degradation of I/O demand is maxing out the available
> RAM for disk cache.  So while improving the datatype choice would
> probably be worth your while, you should first see if fooling with
> shared_buffers helps, and if not it's time to buy RAM not disk.

Yeah, that's what it's beginning to sound like.  Thanks Tom.

--
Aaron Turner
http://synfin.net/

В списке pgsql-performance по дате отправления:

Предыдущее
От: Aaron Turner
Дата:
Сообщение: Re: 10+hrs vs 15min because of just one index
Следующее
От: "Adnan DURSUN"
Дата:
Сообщение: SQL Function Performance