Re: Emit fewer vacuum records by reaping removable tuples during pruning
From: Robert Haas
Subject: Re: Emit fewer vacuum records by reaping removable tuples during pruning
Date:
Msg-id: CA+TgmoZSST7Pf=o_Y6o-7WdUBQdZLj4YbaQu9qs5j_AgLqmRaw@mail.gmail.com
In reply to: Re: Emit fewer vacuum records by reaping removable tuples during pruning (Peter Geoghegan <pg@bowt.ie>)
Responses: Re: Emit fewer vacuum records by reaping removable tuples during pruning (Robert Haas <robertmhaas@gmail.com>)
            Re: Emit fewer vacuum records by reaping removable tuples during pruning (Peter Geoghegan <pg@bowt.ie>)
List: pgsql-hackers
On Thu, Jan 18, 2024 at 10:09 AM Peter Geoghegan <pg@bowt.ie> wrote:
> The problem with your justification for moving things in that
> direction (if any) is that it is occasionally not quite true: there
> are at least some cases where line pointer truncation after making a
> page's LP_DEAD items -> LP_UNUSED will actually matter. Plus
> PageGetHeapFreeSpace() will return 0 if and when
> "PageGetMaxOffsetNumber(page) > MaxHeapTuplesPerPage &&
> !PageHasFreeLinePointers(page)". Of course, nothing stops you from
> compensating for this by anticipating what will happen later on, and
> assuming that the page already has that much free space.

I think we're agreeing but I want to be sure. If we only set LP_DEAD items to LP_UNUSED, that frees no space. But if doing so allows us to truncate the line pointer array, that frees a little bit of space. Right?

One problem with using this as a justification for the status quo is that truncating the line pointer array is a relatively recent behavior. It's certainly much newer than the choice to have VACUUM touch the FSM in the second pass rather than the first pass.

Another problem is that the amount of space that we're freeing up in the second pass is really quite minimal even when it's >0. Any tuple that actually contains any data at all is at least 32 bytes, and most of them are quite a bit larger. Line pointers are 4 bytes each. To save enough space to fit even one additional tuple, we'd have to free *at least* 8 line pointers. That's going to be really rare. And even if it happens, is it even useful to advertise that free space? Do we want to cram one more tuple into a page that has a history of extremely heavy updates? Could it be that it's smarter to just forget about that free space? You've written before about the stupidity of cramming tuples of different generations into the same page, and that concept seems to apply here.
When we heap_page_prune(), we don't know how much time has elapsed since the page was last modified - but if we're lucky, it might not be very much. Updating the FSM at that time gives us some shot of filling up the page with data created around the same time as the existing page contents. By the time we vacuum the indexes and come back, that temporal locality is definitely lost.

> You'd likely prefer a simpler argument for doing this -- an argument
> that doesn't require abandoning/discrediting the idea that a high
> degree of FSM_CATEGORIES-wise precision is a valuable thing. Not sure
> that that's possible -- the current design is at least correct on its
> own terms. And what you propose to do will probably be less correct on
> those same terms, silly though they are.

I've never really understood why you think that the number of FSM_CATEGORIES is the problem. I believe I recall you endorsing a system where pages are open or closed, to try to achieve temporal locality of data. I agree that such a system could work better than what we have now. I think there's a risk that such a system could create pathological cases where the behavior is much worse than what we have today, and I think we'd need to consider carefully what such cases might exist and what mitigation strategies might make sense. However, I don't see a reason why such a system should intrinsically want to reduce FSM_CATEGORIES. If we have two open pages and one of them has enough space for the tuple we're now trying to insert and the other doesn't, we'd still like to avoid having the FSM hand us the one that doesn't.

Now, that said, I suspect that we actually could reduce FSM_CATEGORIES somewhat without causing any real problems, because many tables are going to have tuples that are all about the same size, and even in a table where the sizes vary more than is typical, a single tuple can't consume more than a quarter of the page, so granularity above that point seems completely useless.
So if we needed some bitspace to track the open/closed status of pages or similar, I suspect we could find that in the existing FSM byte per page without losing anything. But all of that is just an argument that reducing the number of FSM_CATEGORIES is *acceptable*; it doesn't amount to an argument that it's better. My current belief is that it isn't better, just a vehicle to do something else that maybe is better, like squeezing open/closed tracking or similar into the existing bit space. My understanding is that you think it would be better on its own terms, but I have not yet been able to grasp why that would be so.

--
Robert Haas
EDB: http://www.enterprisedb.com