Re: Parallel CREATE INDEX for GIN indexes
From | Tomas Vondra |
---|---|
Subject | Re: Parallel CREATE INDEX for GIN indexes |
Date | |
Msg-id | 203656b9-fbb1-4643-821a-9cce1b7edd81@vondra.me |
In response to | Re: Parallel CREATE INDEX for GIN indexes (Vinod Sridharan <vsridh90@gmail.com>) |
Responses | Re: Parallel CREATE INDEX for GIN indexes |
List | pgsql-hackers |
On 4/18/25 03:03, Vinod Sridharan wrote:
> Hello,
> As part of testing this change I believe I found a scenario where the
> parallel build seems to trigger OOMs for larger indexes. Specifically,
> the calls for ginEntryInsert seem to leak memory into
> TopTransactionContext and OOM/crash the outer process.
> For serial build, the calls for ginEntryInsert tend to happen in a
> temporary memory context that gets reset at the end of the
> ginBuildCallback.
> For inserts, the call has a custom memory context and gets reset at
> the end of the insert.
> For parallel build, during the merge phase, the MemoryContext isn't
> swapped - and so this happens on the TopTransactionContext, and ends
> up growing (especially for larger indexes).
>

Yes, that's true. The ginBuildCallbackParallel() already releases memory
after flushing the in-memory state, but I missed _gin_parallel_merge()
needs to be careful about memory usage too.

I haven't been able to trigger OOM (or even particularly bad) memory
usage, but I suppose it might be an issue with custom GIN opclasses with
much wider keys.

> I believe at the very least these should happen inside the tmpCtx
> found in the GinBuildState and reset periodically.
>
> In the attached patch, I've tried to do this, and I'm able to build
> the index without OOMing, and only consuming maintenance_work_mem
> through the merge process.
>
> Would appreciate your thoughts on this (and whether there's other approaches to
> resolve this too).
>

The patch seems fine to me - I repeated the tests with mailing list
archives, with MemoryContextStats() in _gin_parallel_merge, and it
reliably minimizes the memory usage. So that's fine.

I was also worried if this might have performance impact, but it
actually seems to make it a little bit faster.

I'll get this pushed.

thanks

--
Tomas Vondra
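
For readers unfamiliar with the pattern discussed above (doing per-entry work in a temporary context and resetting it periodically), here is a minimal standalone sketch in C. It is an analogy only, not the patch and not PostgreSQL code: `Arena`, `arena_alloc`, `arena_reset`, and `merge_peak` are hypothetical names made up for illustration, and the toy arena merely stands in for a memory context such as the `tmpCtx` mentioned in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for a memory context (hypothetical, not PostgreSQL code):
 * every allocation is linked into a list so that a single reset call
 * releases all of them at once. */
typedef struct Chunk {
    struct Chunk *next;
} Chunk;

typedef struct Arena {
    Chunk  *chunks;  /* allocations made since the last reset */
    size_t  used;    /* payload bytes currently held */
    size_t  peak;    /* high-water mark of `used` */
} Arena;

static void *arena_alloc(Arena *a, size_t size)
{
    Chunk *c = malloc(sizeof(Chunk) + size);

    assert(c != NULL);
    c->next = a->chunks;
    a->chunks = c;
    a->used += size;
    if (a->used > a->peak)
        a->peak = a->used;
    return (void *) (c + 1);    /* payload follows the list header */
}

static void arena_reset(Arena *a)
{
    while (a->chunks != NULL) {
        Chunk *next = a->chunks->next;

        free(a->chunks);
        a->chunks = next;
    }
    a->used = 0;
}

/* Hypothetical merge loop: each entry needs `scratch` bytes of temporary
 * memory. With reset_every == 0 the arena is never reset inside the loop,
 * which mimics the growth described in the thread; otherwise it is reset
 * after every `reset_every` entries. Returns the peak bytes held. */
static size_t merge_peak(size_t nentries, size_t scratch, size_t reset_every)
{
    Arena tmp = {NULL, 0, 0};

    for (size_t i = 0; i < nentries; i++) {
        void *buf = arena_alloc(&tmp, scratch);

        (void) buf;             /* the per-entry work would happen here */
        if (reset_every != 0 && (i + 1) % reset_every == 0)
            arena_reset(&tmp);
    }
    arena_reset(&tmp);          /* final cleanup; peak is preserved */
    return tmp.peak;
}
```

In this sketch, peak usage without resets grows with the number of entries, while with periodic resets it is bounded by the scratch memory of one batch. The reset interval trades a small amount of bookkeeping for a lower memory ceiling.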