Re: AIO v2.5
| От | Andres Freund | 
|---|---|
| Тема | Re: AIO v2.5 | 
| Дата | |
| Msg-id | s5bbhmfez5pe46yfvqkxkg72zsc2r3knkvapqyz37algbfw2sf@q7jcviovnalu обсуждение исходный текст | 
| Ответ на | Re: AIO v2.5 (Noah Misch <noah@leadboat.com>) | 
| Ответы | Re: AIO v2.5 | 
| Список | pgsql-hackers | 
Hi, Sorry for the slow work on this. The cycle times are humonguous due to valgrind being so slow... On 2025-04-03 12:40:23 -0700, Noah Misch wrote: > On Thu, Apr 03, 2025 at 02:19:43PM -0400, Andres Freund wrote: > > The best fix for that one would, I think, be to have method_io_uring() iterate > > over the IOV and mark the relevant regions as defined? That does fix the > > issue at least and does seem to make sense? > > Makes sense. Valgrind knows that read() makes its target bytes "defined". It > probably doesn't have an io_uring equivalent for that. Correct - and I think it would be nontrivial to add, because there's not easy syscall to intercept... > I expect we only need this for local buffers, and it's unclear to me how the > fix for (4a) didn't fix this. At that time I didn't apply the fix in 4a) to local buffers, because local buffers, in HEAD, don't have the valgrind integration. Without that marking the buffer as NOACCESS would cause all sorts of issues, because it'd be considered inaccessible even after pinning. As you analyzed, that then ends up considered undefined due to the MemoryContextAlloc(). > In the general case, we could want client requests as follows: > > - If completor==definer and has not dropped pin: > - Make defined before verifying page. That's all. It might be cleaner to > do this when first retrieving a return value from io_uring, since this > just makes up for what Valgrind already does for readv(). Yea, I think it's better to do that in io_uring. It's what I have done in the attached. > - If completor!=definer or has dropped pin: > - Make NOACCESS in definer when definer cedes its own pin. That's the current behaviour for shared buffers, right? > - For io_method=worker, make UNDEFINED before starting readv(). It might be > cleanest to do this when the worker first acts as the owner of the AIO > subsystem pin, if that's a clear moment earlier than readv(). Hm, what do we need this for? > - Make DEFINED in completor before verifying page. It might be cleaner to > do this when the completor first retrieves a return value from io_uring, > since this just makes up for what Valgrind already does for readv(). I think we can't rely on the marking during retrieving it from io_uring, as that might have happened in a different backend for a temp buffer. That'd only happen if we got io_uring events for *another* IO that involved a shared rel, but it can happen. > > Not quite sure if we should mark > > the entire IOV is efined or just the portion that was actually read - the > > latter is additional fiddly code, and it's not clear it's likely to be helpful? > > Seems fine to do the simpler way if that saves fiddly code. Can't quite decide, it's just at the border of what I consider too fiddly... See the change to method_io_uring.c in the attached patch. > > Questions: > > > > 1) It'd be cleaner to implement valgrind support in localbuf.c, so we don't > > need to have special-case logic for that. But it also makes the change less > > localized and more "impactful", who knows what kind of skullduggery we have > > been getting away with unnoticed. > > > > I haven't written the code up yet, but I don't think it'd be all that much > > code to add valgrind support to localbuf. > > It would be the right thing long-term, and it's not a big deal if it causes > some false positives initially. So if you're leaning that way, that's good. It was easy enough. I saw one related failure, FlushRelationBuffers() didn't pin temporary buffers before flushing them. Pinning the buffers fixed that. I don't think it's a real problem to not pin the local buffer during FlushRelationBuffers(), at least not today. But it seems unnecessarily odd to not pin it. I wish valgrind had a way to mark the buffer as inaccessible and then accessible again, without loosing the defined-ness information... Greetings, Andres Freund
Вложения
В списке pgsql-hackers по дате отправления: