Re: BRIN indexes vs. SK_SEARCHARRAY (and preprocessing scan keys)

Поиск
Список
Период
Сортировка
От Tomas Vondra
Тема Re: BRIN indexes vs. SK_SEARCHARRAY (and preprocessing scan keys)
Дата
Msg-id 4c512f2b-1f08-aa43-4087-78434f94ffeb@enterprisedb.com
обсуждение исходный текст
Ответ на Re: BRIN indexes vs. SK_SEARCHARRAY (and preprocessing scan keys)  (Heikki Linnakangas <hlinnaka@iki.fi>)
Ответы Re: BRIN indexes vs. SK_SEARCHARRAY (and preprocessing scan keys)  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Список pgsql-hackers
On 7/9/23 20:05, Heikki Linnakangas wrote:
> On 09/07/2023 19:16, Tomas Vondra wrote:
>> On 7/8/23 23:57, Heikki Linnakangas wrote:
>>> The new preprocess support function feels a bit too inflexible to me.
>>> True, you can store whatever you want in the ScanKey that it returns,
>>> but since that's the case, why not just make it void * ? It seems that
>>> the constraint here was that you need to pass a ScanKey to the
>>> consistent function, because the consistent function's signature is what
>>> it is. But we can change the signature, if we introduce a new support
>>> amproc number for it.
>>
>> Now sure I follow - what should be made (void *)? Oh, you mean not
>> passing the preprocessed array as a scan key at all, and instead passing
>> it as a new (void*) parameter to the (new) consistent function?
> 
> Right.
> 
>>> It would be unpleasant to force all BRIN opclasses to immediately
>>> implement the searcharray-logic. If we don't want to do that, we need to
>>> implement generic SK_SEARCHARRAY handling in BRIN AM itself. That would
>>> be pretty easy, right? Just call the regular consistent function for
>>> every element in the array.
>>
>> True, although the question is how many out-of-core opclasses are there.
>> My impression is the number is pretty close to 0, in which case we're
>> making ourselves to jump through all kinds of hoops, making the code
>> more complex, with almost no benefit in the end.
> 
> Perhaps. How many of the opclasses can do something smart with
> SEARCHARRAY? If the answer is "all" or "almost all", then it seems
> reasonable to just require them all to handle it. If the answer is
> "some", then it would still be nice to provide a naive default
> implementation in the AM itself. Otherwise all the opclasses need to
> include a bunch of boilerplate to support SEARCHARRAY.
> 

For the built-in, I think all can do something smart.

For external, hard to say - my guess is they could do something
interesting too.


>>> If an opclass wants to provide a faster/better implementation, it can
>>> provide a new consistent support procedure that supports that. Let's
>>> assign a new amproc number for new-style consistent function, which must
>>> support SK_SEARCHARRAY, and pass it some scratch space where it can
>>> cache whatever per-scankey data. Because it gets a new amproc number, we
>>> can define the arguments as we wish. We can pass a pointer to the
>>> per-scankey scratch space as a new argument, for example.
>>>
>>> We did this backwards-compatibility dance with the 3/4-argument variants
>>> of the current consistent functions. But I think assigning a whole new
>>> procedure number is better than looking at the number of arguments.
>>
>> I actually somewhat hate the 3/4-argument dance, and I'm opposed to
>> doing that sort of thing again. First, I'm not quite convinced it's
>> worth the effort to jump through this hoop (I recall this being one of
>> the headaches when fixing the BRIN NULL handling). Second, it can only
>> be done once - imagine we now need to add a new optional parameter.
>> Presumably, we'd need to keep the existing 3/4 variants, and add new 4/5
>> variants. At which point 4 is ambiguous.
> 
> My point is that we should assign a new amproc number to distinguish the
> new variant, instead of looking at the number of arguments. That way
> it's not ambiguous, and you can define whatever arguments you want for
> the new variant.
> 

Right, I agree.

> Yet another idea is to introduce a new amproc for a consistent function
> that *only* handles the SEARCHARRAY case, and keep the old consistent
> function as it is for the scalars. So every opclass would need to
> implement the current consistent function, just like today. But if an
> opclass wants to support SEARCHARRAY, it could optionally also provide
> an "consistent_array" function.
> 

Hmm, we probably need to do something like this anyway. Because even
with the scratch space in BrinDesc, we still need to track whether the
opclass can process arrays or not. And if it can't we need to emulate it
in brin.c.


>> Yes, my previous message was mostly about backwards compatibility, and
>> this may seem a bit like an argument against it. But that message was
>> more a question "If we do this, is it actually backwards compatible the
>> way we want/need?")
>>
>> Anyway, I think the BrinDesc scratch space is a neat idea, I'll try
>> doing it that way and report back in a couple days.
> 
> Cool. In 0005-Support-SK_SEARCHARRAY-in-BRIN-bloom-20230702.patch, you
> used the preprocess function to pre-calculate the scankey's hash, even
> for scalars. You could use the scratch space in BrinDesc for that,
> before doing anything with SEARCHARRAYs.
> 

Yeah, that's a good idea.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Zheng Li
Дата:
Сообщение: Re: Support logical replication of DDLs
Следующее
От: Thomas Munro
Дата:
Сообщение: Re: Cleaning up threading code