Re: Design notes for BufMgrLock rewrite
От | Jim C. Nasby |
---|---|
Тема | Re: Design notes for BufMgrLock rewrite |
Дата | |
Msg-id | 20050217053925.GV52357@decibel.org обсуждение исходный текст |
Ответ на | Re: Design notes for BufMgrLock rewrite (Kenneth Marshall <ktm@it.is.rice.edu>) |
Список | pgsql-hackers |
On Wed, Feb 16, 2005 at 11:42:11AM -0600, Kenneth Marshall wrote: > I have seen this algorithm described as a more generalized clock type > algorithm. As the size of the counter increases, up to the number of > buffers, the clock algorithm becomes LRU. One bit is the lightest > weight approximation. Increasing the number of bits or a count makes > the clock algorithm more closely approximate LRU. You need to balance > how long it takes to find a free buffer. That time increases as the > count size increases. Yeah, the trick to this seems to be how you tweak the rate at which stuff 'falls out' of the active list. I can think of 3 ways to accomplish this: 1) Change the maximum value to count to (or the value at which a buffer is considered no longer in use). This has the disadvantage of changing how effective the count is. In an extreme situation, it would retuce back to a single bit. It also won't affect buffers that already have higher counts, meaning older data is more likely to stay in buffer than newer data. 2) Change the amount the counter is incremented by on each use (and/or the amount it's decremented by). An example of this might be having the clock decrement by 10. Under a light to medium load, the system might increment by 10 on each use, but if the system starts getting pressed for free buffers, that could be reduced. A downside of this would be that it potentially requires more space in each header to store a larger value. An advatage is that it allows more flexability than #1. For example, if the decrement value is increased in order to speed up reclaiming of buffers, it won't create a difference in how buffers are weighted based on when they were accessed like #1 will. 3) Change the speed of the clock. This is what BSD effectively does. The OS periodically checks to see how many pages are available on the free list, as well as how many were removed since the last check. This information is used to decide how many pages the clock algorithm should attempt to free in the next time period (which can be 0). If a two-hand algorithm is used, the distance between the two hands can also be varied. I think #3 probably means you'd need a seperate process to handle the clock and moving buffers to a free list. Or perhaps this gets tied in with the background writer. This might mean more overhead, but it could improve contention if it means only one process needs to aquire some of these locks. So much for a simple design discussion. :) Fortunately, #1 and #2 should be easy to test. #3 will certainly require more code, but it would probably be simpler to implement than having multiple backends running the clock algorithm (which I think is the current plan). Something else I thought of; by using a counter instead of a bit, you can also 'pre-seed' buffers based on why they were populated. For example, pages brought in from an index might start with a value of 4; heap pages 3, heap pages from a seqscan 2, and pages from vacuum, 1, or maybe even 0. -- Jim C. Nasby, Database Consultant decibel@decibel.org Give your computer some brain candy! www.distributed.net Team #1828 Windows: "Where do you want to go today?" Linux: "Where do you want to go tomorrow?" FreeBSD: "Are you guys coming, or what?"
В списке pgsql-hackers по дате отправления: