Re: Eager page freeze criteria clarification
From: Melanie Plageman
Subject: Re: Eager page freeze criteria clarification
Date:
Msg-id: CAAKRu_YzowY80dsktvykUCEJBE0Mco7SuBnvGED2_XyuC_3P=g@mail.gmail.com
In response to: Re: Eager page freeze criteria clarification (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: Eager page freeze criteria clarification
           Re: Eager page freeze criteria clarification
List: pgsql-hackers
On Mon, Aug 28, 2023 at 12:26 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Mon, Aug 28, 2023 at 10:00 AM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
>
> Then there's the question of whether it's the right metric. My first
> reaction is to think that it sounds pretty good. One thing I really
> like about it is that if the table is being vacuumed frequently, then
> we freeze less aggressively, and if the table is being vacuumed
> infrequently, then we freeze more aggressively. That seems like a very
> desirable property. It also seems broadly good that this metric
> doesn't really care about reads. If there are a lot of reads on the
> system, or no reads at all, it doesn't really change the chances that
> a certain page is going to be written again soon, and since reads
> don't change the insert LSN, here again it seems to do the right
> thing. I'm a little less clear about whether it's good that it doesn't
> really depend on wall-clock time. Certainly, that's desirable from the
> point of view of not wanting to have to measure wall-clock time in
> places where we otherwise wouldn't have to, which tends to end up
> being expensive. However, if I were making all of my freezing
> decisions manually, I might be more freeze-positive on a low-velocity
> system where writes are more stretched out across time than on a
> high-velocity system where we're blasting through the LSN space at a
> higher rate. But maybe that's not a very important consideration, and
> I don't know what we'd do about it anyway.

By low-velocity, do you mean lower overall TPS? In that case, wouldn't
you be less likely to run into xid wraparound and thus need less
aggressive opportunistic freezing?
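For clarity, the criterion under discussion could be modeled roughly as
below. This is only an illustrative sketch of the idea (the names and the
exact accounting are my assumptions, not the patch's code): a page is
frozen when its last modification is old relative to the LSN span consumed
since the last vacuum.

```python
def should_freeze(page_lsn: int, current_insert_lsn: int,
                  lsn_at_last_vacuum: int, threshold: float = 0.33) -> bool:
    """Hypothetical sketch of an LSN-age freeze criterion, not the actual
    patch logic: freeze a page if it was last modified more than
    `threshold` of the way back through the LSN span generated since the
    last vacuum. Reads never advance the insert LSN, so they do not
    influence the decision."""
    span = current_insert_lsn - lsn_at_last_vacuum
    if span <= 0:
        return False  # no WAL generated since last vacuum; nothing to compare
    page_age = current_insert_lsn - page_lsn
    return page_age / span > threshold
```

Because both the page's age and the span are measured in LSN bytes, a
page's *relative* age is insensitive to overall write velocity, which is
exactly why a low-velocity system with writes stretched across wall-clock
time is treated the same as a high-velocity one.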
> > Page Freezes/Page Frozen (less is better)
> >
> > |   | Master | (1)     | (2)     | (3)     | (4)     | (5)     |
> > |---+--------+---------+---------+---------+---------+---------|
> > | A |  28.50 |    3.89 |    1.08 |    1.15 |    1.10 |    1.10 |
> > | B |   1.00 |    1.06 |    1.65 |    1.03 |    1.59 |    1.00 |
> > | C |    N/A |    1.00 |    1.00 |    1.00 |    1.00 |    1.00 |
> > | D |   2.00 | 5199.15 | 5276.85 | 4830.45 | 5234.55 | 2193.55 |
> > | E |   7.90 |    3.21 |    2.73 |    2.70 |    2.69 |    2.43 |
> > | F |    N/A |    1.00 |    1.00 |    1.00 |    1.00 |    1.00 |
> > | G |    N/A |    1.00 |    1.00 |    1.00 |    1.00 |    1.00 |
> > | H |    N/A |    1.00 |    1.00 |    1.00 |    1.00 |    1.00 |
> > | I |    N/A |   42.00 |   42.00 |     N/A |   41.00 |     N/A |
>
> Hmm. I would say that the interesting rows here are A, D, and I, with
> rows C and E deserving honorable mention. In row A, master is bad.

So, this is where the caveat about the absolute number of page freezes
matters. In workload A, master only did 57 page freezes (spread across
the various pgbench tables). At the end of the run, 2 pages were still
frozen.

> In row D, your algorithms are all bad, really bad. I don't quite
> understand how it can be that bad, actually.

So, I realize now that this test was poorly designed. I meant it to be a
worst-case scenario, but I think one critical part was wrong. In this
example, one client goes at full speed inserting a row and then updating
it, while another, rate-limited client deletes old data periodically to
keep the table at a constant size. I meant to bulk load the table with
enough data that the delete job would have data to delete from the
start. With the default autovacuum settings, over the course of 45
minutes, I usually saw around 40 autovacuums of the table. Due to the
rate limiting, the first autovacuum of the table ends up freezing many
pages that are deleted soon after, so the total number of page freezes
is very high. I will redo the benchmarking of workload D and start the
table with the number of rows which the DELETE job seeks to maintain.
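The caveat about small absolute freeze counts is easy to see from the
headline metric itself. A hypothetical helper (just to show how the ratio
behaves, not code from the benchmark harness):

```python
def freezes_per_frozen(page_freezes: int, pages_frozen_at_end: int):
    """'Page Freezes / Page Frozen' from the table above (lower is
    better). When the denominator is tiny, a handful of freezes swings
    the ratio wildly, so the metric is only meaningful alongside the
    absolute counts."""
    if pages_frozen_at_end == 0:
        return None  # shown as N/A in the table
    return page_freezes / pages_frozen_at_end

# Workload A on master: 57 total freezes, only 2 pages still frozen at
# the end of the run -> 57 / 2 = 28.5, the "bad"-looking 28.50 above.
```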
My back-of-the-envelope math says that this will mean ratios closer to a
dozen (and not 5000). Also, I had doubled checkpoint_timeout, which
likely led master to freeze so few pages (2 total freezes, neither of
which were still frozen at the end of the run). This is an example where
master's overall low number of page freezes makes it difficult to
compare it to the alternatives using a ratio. I didn't initially
question the numbers, because it seems like freezing data and then
deleting it right after would naturally be one of the worst cases for
opportunistic freezing -- but certainly not this bad.

> Row I looks bad for algorithms 1, 2, and 4: they freeze pages because
> it looks cheap, but the work doesn't really pay off.

Yes, the work queue example looks like it is hard to handle.

> > % Frozen at end of run
> >
> > |   | Master | (1) | (2) | (3) | (4) | (5) |
> > |---+--------+-----+-----+-----+-----+-----|
> > | A |      0 |   1 |  99 |   0 |  81 |   0 |
> > | B |     71 |  96 |  99 |   3 |  98 |   2 |
> > | C |      0 |   9 | 100 |   6 |  92 |   5 |
> > | D |      0 |   1 |   1 |   1 |   1 |   1 |
> > | E |      0 |  63 | 100 |  68 | 100 |  67 |
> > | F |      0 |   5 |  14 |   6 |  14 |   5 |
> > | G |      0 | 100 | 100 |  92 | 100 |  67 |
> > | H |      0 |  11 | 100 |   9 |  86 |   5 |
> > | I |      0 | 100 | 100 |   0 | 100 |   0 |
>
> So all of the algorithms here, but especially 1, 2, and 4, freeze a
> lot more often than master.
>
> If I understand correctly, we'd like to see small numbers for B, D,
> and I, and large numbers for the other workloads. None of the
> algorithms seem to achieve that. (3) and (5) seem like they always
> behave as well or better than master, but they produce small numbers
> for A, C, F, and H. (1), (2), and (4) regress B and I relative to
> master but do better than (3) and (5) on A, C, and the latter two also
> on E.
> B is such an important benchmarking workload that I'd be loath to
> regress it, so if I had to pick on the basis of this data, my vote
> would be (3) or (5), provided whatever is happening with (D) in the
> previous metric is not as bad as it looks. What's your reason for
> preferring (4) and (5) over (2) and (3)? I'm not clear that these
> numbers give us much of an idea whether 10% or 33% or something else
> is better in general.

(1) seems bad to me because it doesn't consider whether or not freezing
will be useful -- only whether it will be cheap. It froze very little of
the cold data in workloads where a small percentage of the data was
being modified (especially workloads A, C, and H), and it froze a lot of
data in workloads where the data was being uniformly modified (workload
B).

I suggested (4) and (5) because I think the "older than 33%" threshold
is better than the "older than 10%" threshold. I chose both because I am
still unclear on our values: are we willing to freeze more aggressively
at the expense of emitting more FPIs, as long as it doesn't affect
throughput? For pretty much all of these workloads, the algorithms which
froze based on page modification recency OR an FPI being required
emitted many more FPIs than those which froze based only on page
modification recency.

I've attached the WIP patch that I forgot in my previous email.

I'll rerun workload D in a more reasonable way and be back with results.

- Melanie
Attachments