Re: A better way than tweaking NTUP_PER_BUCKET

From	Atri Sharma
Subject	Re: A better way than tweaking NTUP_PER_BUCKET
Date	June 26, 2013, 19:02:12
Msg-id	CAOeZVidn3YE+stkYeYbS-uZdj=N-Sdf2zHJ1gb_=DVD7d9o1Dg@mail.gmail.com
In reply to	Re: A better way than tweaking NTUP_PER_BUCKET (Stephen Frost <sfrost@snowman.net>)
Responses	Re: A better way than tweaking NTUP_PER_BUCKET (Stephen Frost <sfrost@snowman.net>)
List	pgsql-hackers

On Wed, Jun 26, 2013 at 9:20 PM, Stephen Frost <sfrost@snowman.net> wrote:
> * Atri Sharma (atri.jiit@gmail.com) wrote:
>> My point is that I would like to help in the implementation, if possible. :)
>
> Feel free to go ahead and implement it..  I'm not sure when I'll have a
> chance to (probably not in the next week or two anyway).  Unfortunately,
> the bigger issue here is really about testing the results and
> determining if it's actually faster/better with various data sets
> (including ones which have duplicates).  I've got one test data set
> which has some interesting characteristics (for one thing, hashing the
> "large" side and then seq-scanning the "small" side is actually faster
> than going the other way, which is quite 'odd' imv for a hashing
> system): http://snowman.net/~sfrost/test_case2.sql
>
> You might also look at the other emails that I sent regarding this
> subject and NTUP_PER_BUCKET.  Having someone confirm what I saw wrt
> changing that parameter would be nice and it would be a good comparison
> point against any kind of pre-filtering that we're doing.
>
> One thing that re-reading the bloom filter description reminded me of is
> that it's at least conceivable that we could take the existing hash
> functions for each data type and do double-hashing or perhaps seed the
> value to be hashed with additional data to produce an "independent" hash
> result to use.  Again, a lot of things that need to be tested and
> measured to see if they improve overall performance.

Right, let me look. I am pretty busy atm with ordered set functions,
though, so I will maybe get it done in the last week of this month.

Another thing I believe is that we should have multiple hash functions
for the bloom filter, each producing a different value for the same
key, so that the coverage of the filter's bits is good.
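
To make the idea concrete, here is a minimal sketch of how several probe
positions could be derived from one seeded hash via double hashing, as
suggested above. Everything in it is an assumption for illustration: the
names (BloomFilter, seeded_hash, bloom_add, bloom_maybe_contains), the
filter size, the number of probes and the seeds are hypothetical, and the
seeded FNV-1a function is only a stand-in for the existing per-datatype
hash functions. It is not taken from the PostgreSQL source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOOM_NBITS   1024		/* hypothetical filter size, power of two */
#define BLOOM_NHASHES 4			/* hypothetical number of probes per key */

typedef struct BloomFilter
{
	uint8_t		bits[BLOOM_NBITS / 8];
} BloomFilter;

/* Stand-in hash: 32-bit FNV-1a with the seed mixed into the start value. */
static uint32_t
seeded_hash(const void *key, size_t len, uint32_t seed)
{
	const uint8_t *p = (const uint8_t *) key;
	uint32_t	h = 2166136261u ^ seed;

	for (size_t i = 0; i < len; i++)
	{
		h ^= p[i];
		h *= 16777619u;
	}
	return h;
}

/* Double hashing: the i-th probe position is (h1 + i * h2) mod nbits. */
static uint32_t
bloom_probe(uint32_t h1, uint32_t h2, uint32_t i)
{
	return (h1 + i * h2) % BLOOM_NBITS;
}

static void
bloom_init(BloomFilter *f)
{
	assert(f != NULL);
	memset(f->bits, 0, sizeof(f->bits));
}

static void
bloom_add(BloomFilter *f, const void *key, size_t len)
{
	uint32_t	h1 = seeded_hash(key, len, 0);
	/* Force h2 odd so the stride is coprime with the power-of-two size. */
	uint32_t	h2 = seeded_hash(key, len, 0x9e3779b9u) | 1u;

	for (uint32_t i = 0; i < BLOOM_NHASHES; i++)
	{
		uint32_t	pos = bloom_probe(h1, h2, i);

		f->bits[pos / 8] |= (uint8_t) (1u << (pos % 8));
	}
}

/* false: key was definitely never added; true: key may have been added. */
static bool
bloom_maybe_contains(const BloomFilter *f, const void *key, size_t len)
{
	uint32_t	h1 = seeded_hash(key, len, 0);
	uint32_t	h2 = seeded_hash(key, len, 0x9e3779b9u) | 1u;

	for (uint32_t i = 0; i < BLOOM_NHASHES; i++)
	{
		uint32_t	pos = bloom_probe(h1, h2, i);

		if ((f->bits[pos / 8] & (1u << (pos % 8))) == 0)
			return false;
	}
	return true;
}
```

In a hash join, a filter like this would presumably be filled while
building the hash table and checked for each outer tuple before probing
the buckets. Whether that is actually a win over just tuning
NTUP_PER_BUCKET is exactly what would need to be measured.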

Regards,

Atri

--
Regards,

Atri
l'apprenant
