Re: Single pass vacuum - take 1
From | Robert Haas |
---|---|
Subject | Re: Single pass vacuum - take 1 |
Date | |
Msg-id | CA+TgmobCvz0XxmM-g_Wg=5VrkKqEB=mZ2G8hA7qvCjRQxFCsNQ@mail.gmail.com |
In reply to | Re: Single pass vacuum - take 1 (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>) |
Responses | Re: Single pass vacuum - take 1 |
List | pgsql-hackers |
On Thu, Jul 14, 2011 at 12:43 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> How does this interact with the visibility map? If you set the visibility
> map bit after vacuuming indexes, a subsequent vacuum will not visit the
> page. The second vacuum will update relindxvacxlogid/off, but it will not
> clean up the dead line pointers left behind by the first vacuum. Now the LSN
> on the page differs from the one stored in pg_class, so subsequent pruning
> will not remove the dead line pointers either.

Currently, I think we would only set the visibility map bit after
vacuuming the page for the second time. The patch as submitted doesn't
appear to go back and set visibility map bits after finishing the index
vacuum. Now, that might be nice to do, because then a hypothetical
index-only scan could start taking advantage of vacuum having been done
sooner. If we wanted to do that, we could restructure the visibility
map to store two bits per page: one to indicate whether there is any
potential work for VACUUM to do (modulo freezing) and the other to
indicate whether an index pointer could possibly be aimed at a dead
line pointer. (In fact, maybe we'd even want to have a third bit to
indicate "all tuples frozen", which would be useful for optimizing
anti-wraparound vacuum.)

> I think you can sidestep that
> if you check that the page's vacuum LSN <= vacuum LSN in pg_class, instead
> of equality.

I don't think that works, because the point of storing the LSN in
pg_class is to verify that the vacuum completed the index cleanup
without error. The fact that a newer vacuum accomplished that goal
does not mean that all older ones did.

> Ignoring the issue stated in previous paragraph, I think you wouldn't
> actually need an 64-bit LSN. A smaller counter is enough, as wrap-around
> doesn't matter. In fact, a single bit would be enough.
> After a successful vacuum, the counter on each heap page (with dead line
> pointers) is N, and the value in pg_class is N. There are no other values
> on the heap, because vacuum will have cleaned them up. When you begin the
> next vacuum, it will stamp pages with N+1. So at any stage, there is only
> one of two values on any page, so a single bit is enough. (But as I said,
> that doesn't hold if vacuum skips some pages thanks to the visibility map)

If this can be made to work, it's a very appealing idea. The patch as
submitted uses lp_off to store a single bit, to distinguish between
dead and dead-vacuumed, but we could actually have (for greater safety
and debuggability) a 15-bit counter that just wraps around from 32,767
to 1. (Maybe it would be wise to reserve a few counter values, or a
few bits, or both, for future projects.) That would eliminate the need
to touch PageRepairFragmentation() or use the special space, since all
the information would be in the line pointer itself. Not having to
rearrange the page to reclaim dead line pointers is appealing, too.

> Is there something in place to make sure that pruning uses an up-to-date
> relindxvacxlogid/off value? I guess it doesn't matter if it's out-of-date,
> you'll just miss the opportunity to remove some dead tuples.

This seems like a tricky problem, because it could cause us to
repeatedly fail to remove the same dead line pointers, which would be
poor. We could do something like this: after updating pg_class,
vacuum sends an interrupt to any backend which holds RowExclusiveLock
or higher on that relation. The interrupt handler just sets a flag.
If that backend does heap_page_prune() and sees the flag set, it knows
that it needs to recheck pg_class. This is a bit grotty and doesn't
completely close the race condition (the signal might not arrive in
time), but it ought to make it narrow enough not to matter in
practice.
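
To make the wrap-around counter idea above a little more concrete, here is a minimal sketch in C. It is only an illustration under stated assumptions, not code from the patch: the names VAC_GEN_BITS, VAC_GEN_MAX, VAC_GEN_NONE, vac_gen_next() and vac_gen_reclaimable() are all hypothetical, the value 0 is assumed to be the reserved "dead but not yet vacuumed" marker, and the visibility-map caveat raised in the quoted text is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* All names below are hypothetical; none of them exist in PostgreSQL. */
#define VAC_GEN_BITS 15                          /* width of lp_off */
#define VAC_GEN_MAX  ((1u << VAC_GEN_BITS) - 1)  /* 32,767 */
#define VAC_GEN_NONE 0u  /* assumed reserved value: dead, not yet vacuumed */

/*
 * Advance the vacuum generation counter. It wraps from 32,767 back to 1,
 * so the reserved value 0 is never handed out as a generation.
 */
static uint16_t
vac_gen_next(uint16_t gen)
{
    assert(gen <= VAC_GEN_MAX);
    return (gen == VAC_GEN_MAX) ? 1 : (uint16_t) (gen + 1);
}

/*
 * A dead line pointer stamped with 'stamp' may be reclaimed only if the
 * generation recorded after a successful index cleanup is exactly the same.
 * Equality is used on purpose: a newer vacuum finishing does not prove that
 * an older one did.
 */
static bool
vac_gen_reclaimable(uint16_t stamp, uint16_t completed_gen)
{
    return stamp != VAC_GEN_NONE && stamp == completed_gen;
}
```

In this sketch a vacuum would stamp dead line pointers with vac_gen_next() of the last generation and publish that value in pg_class only after index cleanup succeeds. Pruning would then call vac_gen_reclaimable() with the stamp it finds and the published value.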
-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company