Process local hint bit cache
From | Merlin Moncure |
---|---|
Subject | Process local hint bit cache |
Date | |
Msg-id | AANLkTi=nJ_QyE7Ape5Ja+o3f=jNRXmNeOuWjAOFdWre2@mail.gmail.com |
Replies |
Re: Process local hint bit cache
Re: Process local hint bit cache |
List | pgsql-hackers |
In a previous thread (http://postgresql.1045698.n5.nabble.com/Set-hint-bits-upon-eviction-from-BufMgr-td4264323.html) I was playing with the idea of granting the bgwriter the ability to do last-chance hint bit setting before evicting a page. I still think that might be a good idea, especially in scenarios where you get to set the PD_ALL_VISIBLE bit when writing out the page, which is a huge win in some cases. That said, it bugged me that it didn't really fix the general case of large data insertions and the subsequent i/o problems from setting the bits, which is my primary objective in the short term. So I went back to the drawing board, reviewed the archives, and came up with a new proposal.

I'd like to see a process local clog page cache of around 1-4 pages (8-32kb typically) that would replace the current TransactionLogFetch last xid cache. It should be small, because I doubt more would really help much (see below) and we want to keep this data in the tight cpu caches since it's going to be constantly scanned.

The cache itself will be the clog pages and a small header per page which will contain the minimum information necessary to match an xid against a page to determine a hit, and a count of hits. Additionally we keep a separate small array (say 100) of type xid (int) that we write into on a cache miss.

So, the cache lookup algorithm basically is:

    scan clog cache headers and check for hit
    if found (xid in range covered by clog page),
        header.hit_count++;
    else
        miss_array[miss_count++] = xid;

A cache hit is defined as getting useful information from the page, that is, a transaction being committed or invalid. When the miss array fills, we sort it and determine the most commonly referenced clog page that is not in the cache and use that information to replace pages in the cache if necessary, then reset the counts.
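The lookup and replacement steps above can be sketched in C roughly as follows. This is only an illustration under stated assumptions, not code from PostgreSQL or from any patch: the names `ClogCache`, `cache_lookup`, `cache_rebalance` and `MIN_MISSES` are hypothetical, the page contents are left out so that "xid falls on a cached page" stands in for a hit, and the victim choice (an empty slot, else the least-hit page, replaced only if the candidate was missed more often than the victim was hit) is one possible reading of "replace pages in the cache if necessary".

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define XACTS_PER_PAGE  32768u /* 8 kB page at 2 status bits per xid */
#define CACHE_PAGES     4      /* proposal: 1-4 pages */
#define MISS_ARRAY_SIZE 100    /* proposal: "say 100" */
#define MIN_MISSES      5      /* hypothetical threshold, 5% of the miss array */

typedef uint32_t Xid;

typedef struct {
    int      valid;     /* slot holds a page */
    uint32_t pageno;    /* which clog page the slot covers */
    uint32_t hit_count; /* hits since the last rebalance */
} CachePageHeader;

typedef struct {
    CachePageHeader hdr[CACHE_PAGES]; /* page data omitted in this sketch */
    Xid             miss_array[MISS_ARRAY_SIZE];
    int             miss_count;
} ClogCache;

static void cache_init(ClogCache *c) { memset(c, 0, sizeof *c); }

static int cmp_u32(const void *a, const void *b)
{
    uint32_t x = *(const uint32_t *) a, y = *(const uint32_t *) b;
    return (x > y) - (x < y);
}

/* Sort the missed page numbers, find the most frequent one, and load it
 * over an empty or least-hit slot if it qualifies; then reset all counts. */
static void cache_rebalance(ClogCache *c)
{
    uint32_t pages[MISS_ARRAY_SIZE];
    uint32_t best_page = 0, best_run = 0;
    int      n = c->miss_count, victim = 0;

    for (int i = 0; i < n; i++)
        pages[i] = c->miss_array[i] / XACTS_PER_PAGE;
    qsort(pages, (size_t) n, sizeof pages[0], cmp_u32);

    for (int i = 0; i < n;) {          /* longest run of equal page numbers */
        int j = i;
        while (j < n && pages[j] == pages[i])
            j++;
        if ((uint32_t) (j - i) > best_run) {
            best_run = (uint32_t) (j - i);
            best_page = pages[i];
        }
        i = j;
    }

    for (int k = 0; k < CACHE_PAGES; k++) {
        if (!c->hdr[k].valid) { victim = k; break; }
        if (c->hdr[k].hit_count < c->hdr[victim].hit_count)
            victim = k;
    }

    if (best_run >= MIN_MISSES && best_run > c->hdr[victim].hit_count) {
        c->hdr[victim].valid = 1;
        c->hdr[victim].pageno = best_page;
    }
    for (int k = 0; k < CACHE_PAGES; k++)
        c->hdr[k].hit_count = 0;
    c->miss_count = 0;
}

/* Returns 1 on a cache hit, 0 on a miss (caller then does a normal clog lookup). */
static int cache_lookup(ClogCache *c, Xid xid)
{
    uint32_t pageno = xid / XACTS_PER_PAGE;

    for (int i = 0; i < CACHE_PAGES; i++) {
        if (c->hdr[i].valid && c->hdr[i].pageno == pageno) {
            c->hdr[i].hit_count++;
            return 1;
        }
    }
    assert(c->miss_count < MISS_ARRAY_SIZE);
    c->miss_array[c->miss_count++] = xid;
    if (c->miss_count == MISS_ARRAY_SIZE)
        cache_rebalance(c);
    return 0;
}
```

A backend would call `cache_init` once and `cache_lookup` for every xid whose hint bits are unset. With these constants, 100 consecutive misses on xids from the same clog page cause that page to be installed, and later xids on that page are reported as hits.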
Maybe we can add a minimum threshold of hits, say 5-10% of the miss_array size, for a page to be deemed interesting enough to be loaded into the cache.

Interaction w/set hint bits:

*) If a clog lookup faults through the cache, we basically keep the current behavior. That is, the hint bits are set, the page is marked BM_DIRTY, and the hint bits get written back.

*) If we get a clog cache hit, that is, the hint bits are not set but we pulled the transaction status from the cache, the hint bits are recorded on the page *but the page is not written back*, at least on a hint bit basis alone. This behavior branch is more or less the BM_UNTIDY as suggested by Haas (see archives), except it's only seen in 'cache hit' scenarios. We are not writing pages back because the cache is suggesting there is little/no benefit to writing them back.

Thus, if a single backend is scanning a lot of pages with transactions touching a very small number of clog pages, hint bits are generally not written back because they are not needed and in fact not helping. However, if the xids are spread around a large number of clog pages, we get the current behavior more or less (plus the overhead of cache maintenance).

With the current code base, hint bits are very beneficial when the xid entropy is high and the number of repeated scans is high, and not so good when the xid entropy is low and the number of repeated scans is low. The process local cache attempts to redress this without disadvantaging the already good cases. Furthermore, if it can be proven that the cache overhead is epsilon, it's pretty unlikely to negatively impact anyone; at least, that's my hope. Traffic to clog will reduce (although not much, since i'd wager the current 'last xid' cache works pretty well), but i/o should be reduced, in some cases quite significantly, for a tiny cpu cost (although that remains to be proven).

merlin