Re: [POC] A better way to expand hash indexes.

From Amit Kapila
Subject Re: [POC] A better way to expand hash indexes.
Date
Msg-id CAA4eK1L=gE+YW1OcZiUbmnboapVvZJu0jJp7Su7oqZE6pjVKvA@mail.gmail.com
In reply to [HACKERS] [POC] A better way to expand hash indexes.  (Mithun Cy <mithun.cy@enterprisedb.com>)
Responses Re: [POC] A better way to expand hash indexes.  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Sun, Mar 26, 2017 at 11:26 AM, Mithun Cy <mithun.cy@enterprisedb.com> wrote:
> Thanks, Amit for the review.
> On Sat, Mar 25, 2017 at 7:03 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> I think one-dimensional patch has fewer places to touch, so that looks
>> better to me.  However, I think there is still hard coding and
>> assumptions in code which we should try to improve.
>
> Great!, I will continue with spares 1-dimensional improvement.
>

@@ -563,18 +563,20 @@ _hash_init_metabuffer(Buffer buf, double num_tuples, RegProcedure procid,
{
.. else
- num_buckets = ((uint32) 1) << _hash_log2((uint32) dnumbuckets);
+ num_buckets = _hash_get_totalbuckets(_hash_spareindex(dnumbuckets));
..
..
- metap->hashm_maxbucket = metap->hashm_lowmask = num_buckets - 1;
- metap->hashm_highmask = (num_buckets << 1) - 1;
+ metap->hashm_maxbucket = num_buckets - 1;
+
+ /* set hishmask, which should be sufficient to cover num_buckets. */
+ metap->hashm_highmask = (1 << (_hash_log2(num_buckets))) - 1;
+ metap->hashm_lowmask = (metap->hashm_highmask >> 1);
}

I think we can't change the number of buckets to be created, or the
lowmask and highmask calculation, here without also modifying
_h_spoolinit(), because it sorts the data to be inserted based on the
hashkey, which in turn depends on the number of buckets that we are
going to create during the create index operation.  We either need to
let create index continue to always create buckets in power-of-two
fashion, or we need to update _h_spoolinit() according to the new
computation.  One minor drawback of using the power-of-two scheme for
creating buckets during create index is that it can waste space and
will be inconsistent with what the patch does during a split operation.
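
To illustrate the concern, here is a minimal standalone sketch, not
code from the patch.  key_to_bucket() is modeled on the logic of
_hash_hashkey2bucket(); single_mask_sort_key() and count_mismatches()
are hypothetical helpers, and the assumption that the build-time sort
orders tuples by (hashkey & mask) with one mask is mine.  It shows that
once maxbucket is no longer of the form 2^n - 1, a single-mask sort key
stops matching the bucket a tuple actually lands in.

```c
#include <assert.h>
#include <stdint.h>

/* Final bucket for a hash key; modeled on _hash_hashkey2bucket(). */
static uint32_t
key_to_bucket(uint32_t hashkey, uint32_t maxbucket,
              uint32_t highmask, uint32_t lowmask)
{
    uint32_t bucket = hashkey & highmask;

    /* Buckets beyond maxbucket do not exist yet; fall back to lowmask. */
    if (bucket > maxbucket)
        bucket &= lowmask;
    return bucket;
}

/* Hypothetical sort key: what a sort that only knows one mask would use. */
static uint32_t
single_mask_sort_key(uint32_t hashkey, uint32_t mask)
{
    return hashkey & mask;
}

/*
 * Count hash keys in [0, nkeys) whose single-mask sort key differs from
 * the bucket they really belong to.  A nonzero result means data sorted
 * that way would not arrive in bucket order.
 */
static uint32_t
count_mismatches(uint32_t nkeys, uint32_t maxbucket,
                 uint32_t highmask, uint32_t lowmask)
{
    uint32_t mismatches = 0;

    /* Same relationship the quoted diff sets up between the two masks. */
    assert(lowmask == (highmask >> 1));

    for (uint32_t key = 0; key < nkeys; key++)
    {
        if (single_mask_sort_key(key, highmask) !=
            key_to_bucket(key, maxbucket, highmask, lowmask))
            mismatches++;
    }
    return mismatches;
}
```

For example, with 6 buckets (maxbucket = 5, highmask = 7, lowmask = 3),
hashkey 6 sorts as 6 under the single mask but belongs to bucket 2, so
count_mismatches(8, 5, 7, 3) is nonzero, while with 8 buckets
(maxbucket = 7) the two always agree.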


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


