Re: BRIN indexes vs. SK_SEARCHARRAY (and preprocessing scan keys)
От | Heikki Linnakangas |
---|---|
Тема | Re: BRIN indexes vs. SK_SEARCHARRAY (and preprocessing scan keys) |
Дата | |
Msg-id | 2484032b-3e6d-ae6a-61e5-732e2136ad18@iki.fi обсуждение исходный текст |
Ответ на | Re: BRIN indexes vs. SK_SEARCHARRAY (and preprocessing scan keys) (Tomas Vondra <tomas.vondra@enterprisedb.com>) |
Ответы |
Re: BRIN indexes vs. SK_SEARCHARRAY (and preprocessing scan keys)
(Tomas Vondra <tomas.vondra@enterprisedb.com>)
|
Список | pgsql-hackers |
On 02/07/2023 19:09, Tomas Vondra wrote: > Here's an updated version of the patch series. > > I've polished and pushed the first three patches with cleanup, tests to > improve test coverage and so on. I chose not to backpatch those - I > planned to do that to make future backpatches simpler, but the changes > ended up less disruptive than expected. > > The remaining patches are just about adding SK_SEARCHARRAY to BRIN. > > 0001 - adds the optional preprocess procedure, calls it from brinrescan > > 0002 to 0005 - adds the support to the existing BRIN opclasses Could you implement this completely in the consistent-function, by caching the sorted array in fn_extra, without adding the new preprocess procedure? On first call, when fn_extra == NULL, sort the array and stash it in fn_extra. I don't think that works, because fn_extra isn't reset when the scan keys change on rescan. We could reset it, and document that you can use fn_extra for per-scankey caching. There's some precedence for not resetting it though, see commit d22a09dc70f. But we could provide an opaque per-scankey scratch space like that somewhere else. In BrinDesc, perhaps. The new preprocess support function feels a bit too inflexible to me. True, you can store whatever you want in the ScanKey that it returns, but since that's the case, why not just make it void * ?It seems that the constraint here was that you need to pass a ScanKey to the consistent function, because the consistent function's signature is what it is. But we can change the signature, if we introduce a new support amproc number for it. > The main open question I have is what exactly does it mean that the > procedure is optional. In particular, should it be supported to have a > BRIN opclass without the "preprocess" procedure but using the other > built-in support procedures? > > For example, imagine you have a custom BRIN opclass in an extension (for > a custom data type or something). This does not need to implement any > procedures, it can just call the existing built-in ones. Of course, this > won't get the "preprocess" procedure automatically. > > Should we support such opclasses or should we force the extension to be > updated by adding a preprocess procedure? I'd say "optional" means we > should support (otherwise it'd not really optional). > > The reason why this matters is that "amsearcharray" is AM-level flag, > but the support procedure is defined by the opclass. So the consistent > function needs to handle SK_SEARCHARRAY keys both with and without > preprocessing. > > That's mostly what I did for all existing BRIN opclasses (it's a bit > confusing that opclass may refer to both the "generic" minmax or the > opclass defined for a particular data type). All the opclasses now > handle three cases: > > 1) scalar keys (just like before, with amsearcharray=fase) > > 2) array keys with preprocessing (sorted array, array of hashes, ...) > > 3) array keys without preprocessing (for compatibility with old > opclasses missing the optional preprocess procedure) > > The current code is a bit ugly, because it duplicates a bunch of code, > because the option (3) mostly does (1) in a loop. I'm confident this can > be reduced by refactoring and reusing some of the "shared" code. > > The question is if my interpretation of what "optional" procedure means > is reasonable. Thoughts? > > The other thing is how to test this "compatibility" code. I assume we > want to have the procedure for all built-in opclasses, so that won't > exercise it. I did test it by temporarily removing the procedure from a > couple pg_amproc.dat entries. I guess creating a custom opclass in the > regression test is the right solution. It would be unpleasant to force all BRIN opclasses to immediately implement the searcharray-logic. If we don't want to do that, we need to implement generic SK_SEARCHARRAY handling in BRIN AM itself. That would be pretty easy, right? Just call the regular consistent function for every element in the array. If an opclass wants to provide a faster/better implementation, it can provide a new consistent support procedure that supports that. Let's assign a new amproc number for new-style consistent function, which must support SK_SEARCHARRAY, and pass it some scratch space where it can cache whatever per-scankey data. Because it gets a new amproc number, we can define the arguments as we wish. We can pass a pointer to the per-scankey scratch space as a new argument, for example. We did this backwards-compatibility dance with the 3/4-argument variants of the current consistent functions. But I think assigning a whole new procedure number is better than looking at the number of arguments. -- Heikki Linnakangas Neon (https://neon.tech)
В списке pgsql-hackers по дате отправления:
Предыдущее
От: Nathan BossartДата:
Сообщение: Re: Reducing connection overhead in pg_upgrade compat check phase
Следующее
От: Nathan BossartДата:
Сообщение: Re: Preventing non-superusers from altering session authorization