Re: hash index improving v3

Поиск
Список
Период
Сортировка
От Alex Hunsaker
Тема Re: hash index improving v3
Дата
Msg-id 34d269d40809102049y2599a381kf409c0b5f18b6170@mail.gmail.com
обсуждение исходный текст
Ответ на Re: hash index improving v3  (Kenneth Marshall <ktm@rice.edu>)
Ответы Re: hash index improving v3  ("Alex Hunsaker" <badalex@gmail.com>)
Список pgsql-patches
On Wed, Sep 10, 2008 at 7:04 AM, Kenneth Marshall <ktm@rice.edu> wrote:
> On Tue, Sep 09, 2008 at 07:23:03PM -0600, Alex Hunsaker wrote:
>> On Tue, Sep 9, 2008 at 7:48 AM, Kenneth Marshall <ktm@rice.edu> wrote:
>> > I think that the glacial speed for generating a big hash index is
>> > the same problem that the original code faced.
>>
>> Yeah sorry, I was not saying it was a new problem with the patch.  Err
>> at least not trying to :) *Both* of them had been running at 18+ (I
>> finally killed them sometime Sunday or around +32 hours...)
>>
>> > It would be useful to have an equivalent test for the hash-only
>> > index without the modified int8 hash function, since that would
>> > be more representative of its performance. The collision rates
>> > that I was observing in my tests of the old and new mix() functions
>> > was about 2 * (1/10000) of what you test generated. You could just
>> > test against the integers between 1 and 2000000.
>>
>> Sure but then its pretty much just a general test of patch vs no
>> patch.  i.e. How do we measure how much longer collisions take when
>> the new patch makes things faster?  That's what I was trying to
>> measure... Though I apologize I don't think that was clearly stated
>> anywhere...
>
> Right, I agree that we need to benchmark the collision processing
> time difference. I am not certain that two data points is useful
> information. There are 469 collisions with our current hash function
> on the integers from 1 to 2000000. What about testing the performance
> at power-of-2 multiples of 500, i.e. 500, 1000, 2000, 4000, 8000,...
> Unless you adjust the fill calculation for the CREATE INDEX, I would
> stop once the time to create the index spikes. It might also be useful
> to see if a CLUSTER affects the performance as well. What do you think
> of that strategy?

Not sure it will be a good benchmark of collision processing.  Then
again you seem to have studied the hash algo closer than me.  Ill go
see about doing this.  Stay tuned.

В списке pgsql-patches по дате отправления:

Предыдущее
От: "Alex Hunsaker"
Дата:
Сообщение: Re: hash index improving v3
Следующее
От: "Alex Hunsaker"
Дата:
Сообщение: Re: hash index improving v3