Re: Parallel CREATE INDEX for GIN indexes
| От | Tomas Vondra |
|---|---|
| Тема | Re: Parallel CREATE INDEX for GIN indexes |
| Дата | |
| Msg-id | 0dc810dd-6cf1-4789-abd1-46f6332fe778@vondra.me обсуждение исходный текст |
| Ответ на | Re: Parallel CREATE INDEX for GIN indexes (Kirill Reshke <reshkekirill@gmail.com>) |
| Ответы |
Re: Parallel CREATE INDEX for GIN indexes
|
| Список | pgsql-hackers |
On 1/16/26 19:43, Kirill Reshke wrote: > On Fri, 16 Jan 2026 at 23:03, Tomas Vondra <tomas@vondra.me> wrote: >> >> >> On 1/15/26 06:08, Kirill Reshke wrote: >>> ... >>>>> >>>>> I think tuplesort_begin_index_gin() has the same issue. It does this to >>>>> look up the comparison function: >>>>> >>>>> /* >>>>> * Look for an ordering for the index key data type, and then the sort >>>>> * support function. >>>>> */ >>>>> typentry = lookup_type_cache(att->atttypid, TYPECACHE_LT_OPR); >>>>> PrepareSortSupportFromOrderingOp(typentry->lt_opr, sortKey); >>>>> >>>>> That also looks up the key type's b-tree operator rather than the GIN >>>>> opclass's compare function. >>>>> >>>> >>>> Thanks for noticing this, I'll get this fixed next week. >>>> >>>> Funny, you noticed that almost exactly one year after I fixed the other >>>> incorrect place in the patch ;-) >>>> >> >> The attached 0001 should fix this. I'm wondering what kinds of issues it >> might cause, and if we need to mention that in release notes. AFAICS it >> would cause trouble if (a) there's no b-tree opclass, in which case the >> tuplesort_begin_index_gin() errors out, or (b) the btree/gin opclasses >> disagree on ordering (or rather equality). >> >> AFAIK we haven't heard anything about index builds failing on 18, and >> with both btree/gin opclasses it seems unlikely they'd define equality >> differently. Maybe I'm missing something. >> >>> >>> >>> I was looking at code coverage for GIN indexes [1] and noticed that >>> Parallel GIN build is not covered in the regression test. Btree >>> parallel build (_bt_begin_parallel function for example at[0]) is >>> covered. I did not find any btree-parallel-build dedicated test, looks >>> like this is covered by accident, from write_paralle, partition_prune >>> and other regression tests. >>> >>> So, maybe add some tests here? Also, I wonder what regression sql file >>> to use, or maybe create a new one. >>> >> >> Fair point. Someone pinged me about this coverage issue a while back, >> but it completely slipped my mind. 0002 tweaks two places in regression >> tests to build the GIN index in parallel. Which for me seems to improve >> the coverage quite a bit. It's not perfect, because it's still not a lot >> of data, which means some code is impossible to hit (e.g. it'll not hit >> trimming). >> >> regards >> >> -- >> Tomas Vondra > > > Hi! Thank you for posting this. > > I have looked at 0001. Added code seems to be 1-1 matching for what > GinBufferInit is doing, but the commit message refers to initGinState. > Maybe I did not get it right though... Well, GinBufferInit got it from initGinState too. I haven't checked which of those places is "closer", I simply used the original one. > > Also, the fact that tuplesort_begin_index_gin is placed not inside > src/backend/access/gin ... Is it a little awkward? I am not saying > this is anything worth fixing, but functions like writetup_index_gin > etc, not being inside `access/gin`, is it a layering violation? > There's also tuplesort_begin_index_btree, tuplesort_begin_index_hash, tuplesort_begin_index_gist and tuplesort_begin_index_brin. I don't think this is a layering violation, it's simply "create a tuplesort tailored for a specific index AM". Or from the other direction - we could move this to individual index AMs, but then the index AMs would need to know about tuplesort. Which seems strange. regards -- Tomas Vondra
В списке pgsql-hackers по дате отправления: