Re: [HACKERS] GSoC 2017: weekly progress reports (week 4) and patch for hash index

From: Thomas Munro
Subject: Re: [HACKERS] GSoC 2017: weekly progress reports (week 4) and patch for hash index
Date:
Msg-id: CAEepm=2h8JDu0Ym_XYJ57S+4MOL+c1qredP9-ByUHwV9M_4b3w@mail.gmail.com
In reply to: Re: [HACKERS] GSoC 2017: weekly progress reports (week 4) and patch for hash index  (Amit Kapila <amit.kapila16@gmail.com>)
Responses: Re: [HACKERS] GSoC 2017: weekly progress reports (week 4) and patch for hash index
List: pgsql-hackers
On Tue, Feb 13, 2018 at 4:28 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Sun, Jan 28, 2018 at 7:28 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> On Thu, Jan 25, 2018 at 7:29 PM, Alexander Korotkov
>> <a.korotkov@postgrespro.ru> wrote:
>>>> As Shubham seems to be running out of time, I thought of helping him
>>>> by looking into the above-suggested idea.  I think one way to lock a
>>>> particular hash value is we can use TID of heap tuple associated with
>>>> each index entry (index entry for the hash index will be hash value).
>>>
>>>
>>> Sorry, I didn't get what you mean exactly.  If we lock either the TID of
>>> the associated heap tuple or the TID of the hash index tuple, then what
>>> will we lock in the case when nothing is found?  Even if we find nothing,
>>> we have to place some lock according to the search key in order to detect
>>> cases when somebody inserts a row that we might see according to that
>>> search key.
>>>
>>
>> Okay, but if you use the hash value as the lock tag (which is possible),
>> how will we deal with things like page splits?  I think even if we use
>> the block number/page/bucket number corresponding to the hash value along
>> with the hash value in the lock tag, it still doesn't appear to work.  I
>> think using page-level locks for the index makes sense, especially because
>> it will be convenient to deal with page splits.
>>
>
> What I intend to say here is that we already have a mechanism like
> PredicateLockPageSplit() which can deal with predicate locks during a
> page split if we operate at the page level.  However, if we want to go
> for the hash value locking technique, it can be quite complex and
> expensive to make it work, as we have to search all the locks that
> belong to the bucket being split and then move them to the new bucket.

True.

One way to avoid all that might be to use pseudo page numbers that
don't suffer from splits.  I don't know how you'd choose the constant,
but it could be something like pseudo page number = hash value % 1024.
In other words, you'd use the full hash value for the 'tuple' part of
the predicate lock tag, and a shorter hash value for the 'page' part of
the predicate lock tag, so you'd never have to handle splits, and
promotion from fine-grained 'tuple' (= hash value) locks to
coarse-grained 'page' (= short hash value) locks would still work
automatically when space runs out.

> Alexander/Thomas, do you have better ideas to make it work?  Otherwise,
> I think we can proceed to review Shubham's current approach/patch.

I think we should proceed to review the current patch.  As far as I
can see, adding more fine-grained locking would remain a possibility
for future improvement.  With this patch we get page-level hash index
SIREAD locks, and perhaps in a future patch we could add fine-grained
hash value SIREAD locks (maybe as described above, if that works...).

-- 
Thomas Munro
http://www.enterprisedb.com

