Re: patch: improve SLRU replacement algorithm

From     | Robert Haas
Subject  | Re: patch: improve SLRU replacement algorithm
Date     |
Msg-id   | CA+Tgmob_PRNjERHExhaWDJaF9HSP5cq+WKC=0_SS9DBOBU_XRg@mail.gmail.com
Reply to | Re: patch: improve SLRU replacement algorithm (Greg Stark <stark@mit.edu>)
Replies  | Re: patch: improve SLRU replacement algorithm (Jeff Janes <jeff.janes@gmail.com>)
         | Re: patch: improve SLRU replacement algorithm (Greg Stark <stark@mit.edu>)
List     | pgsql-hackers
On Thu, Apr 5, 2012 at 9:29 AM, Greg Stark <stark@mit.edu> wrote:
> On Thu, Apr 5, 2012 at 2:24 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> Sorry, I don't understand specifically what you're looking for.  I
>> provided latency percentiles in the last email; what else do you want?
>
> I think he wants how many waits there were between 0 and 1s, how many
> between 1s and 2s, etc.  Mathematically it's equivalent, but I also
> have trouble visualizing just how much improvement is represented by
> the 90th percentile dropping from 1688 to 1620 (ms?)

Yes, milliseconds.  Sorry for leaving out that detail.  I've run these
scripts so many times that my eyes are crossing.  Here are the
latencies, bucketized by seconds, first for master and then for the
patch, on the same test run as before:

master:
 0  26179411
 1      3642
 2       660
 3       374
 4       166
 5       356
 6        41
 7         8
 8        56
 9         0
10         0
11        21
12        11

patched:
 0  26199130
 1      4840
 2       267
 3       290
 4        40
 5        77
 6        36
 7         3
 8         2
 9        33
10        37
11         2
12         1
13         4
14         5
15         3
16         0
17         1
18         1
19         1

I'm not sure I find those numbers all that helpful, but there they are.
There are a couple of outliers beyond 12 s on the patched run, but I
wouldn't read anything into that; the absolute worst values bounce
around a lot from test to test.  However, note that every bucket between
2 s and 8 s improves, sometimes dramatically.

It's worth keeping in mind here that the system is under extreme I/O
strain on this test, and the kernel responds by forcing user processes
to sleep when they try to do I/O.  So the long stalls that this patch
eliminates are actually allowing the I/O queues to drain out a little,
and without that rest time, you're bound to see more I/O stalls
elsewhere.  It's also worth keeping in mind that this is an extremely
write-intensive benchmark, and that Linux 3.2 changed behavior in this
area quite a bit.
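As an aside, the per-second bucketization above can be reproduced with
something like the following sketch (the input format is an assumption
on my part, not taken from my actual scripts: a list of per-transaction
latencies in milliseconds, as you'd pull out of a pgbench latency log):

```python
from collections import Counter

def bucketize_ms(latencies_ms):
    """Count millisecond latencies into whole-second buckets.

    Bucket 0 holds everything under 1 s, bucket 1 holds [1 s, 2 s), etc.
    Empty buckets up to the maximum observed are reported as 0, so the
    output lines up with the tables above.
    """
    buckets = Counter(int(ms) // 1000 for ms in latencies_ms)
    top = max(buckets) if buckets else 0
    return [(sec, buckets.get(sec, 0)) for sec in range(top + 1)]

# Example: three sub-second transactions, one 1.6 s stall, one 3.2 s stall.
for sec, count in bucketize_ms([12.5, 40.0, 980.0, 1600.0, 3200.0]):
    print(sec, count)
```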
It's not impossible that on an older kernel, the type of I/O thrashing
that happens when too many backends are blocked by dirty_ratio or
dirty_bytes might actually make this change a regression compared to
letting those backends block on a semaphore, which may be a reason to
NOT back-patch this change.  I think that the patch is fixing a fairly
obvious defect in our algorithm, but that doesn't mean that there's no
set of circumstances under which it could backfire, especially when
deployed onto three- or four-year-old kernels.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company