Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation
From | Andres Freund |
---|---|
Subject | Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation |
Date | |
Msg-id | 20230119021053.xn7c5aczln5scen3@awork3.anarazel.de |
In response to | Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation (Peter Geoghegan <pg@bowt.ie>) |
Responses |
Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation
(Peter Geoghegan <pg@bowt.ie>)
|
List | pgsql-hackers |
Hi,

On 2023-01-18 17:00:48 -0800, Peter Geoghegan wrote:
> On Wed, Jan 18, 2023 at 4:37 PM Andres Freund <andres@anarazel.de> wrote:
> > I can, it should be just about trivial code-wise. A bit queasy about trying to
> > foresee the potential consequences.
>
> That's always going to be true, though.
>
> > A somewhat related issue is that pgstat_report_vacuum() sets dead_tuples to
> > what VACUUM itself observed, ignoring any concurrently reported dead
> > tuples. As far as I can tell, when vacuum takes a long time, that can lead to
> > severely under-accounting dead tuples.
>
> Did I not mention that one? There are so many that it can be hard to
> keep track! That's why I catalog them.

I don't recall you doing so, but there are a lot of emails and holes in my head.

> This creates an awkward but logical question, though: what if
> dead_tuples doesn't go down at all? What if VACUUM actually has to
> increase it, because VACUUM runs so slowly relative to the workload?

Sure, that can happen - but it's not made better by having wrong stats :)

> > I do think this is an argument for splitting up dead_tuples into separate
> > "components" that we track differently. I.e. tracking the number of dead
> > items, not-yet-removable rows, and the number of dead tuples reported from DML
> > statements via pgstats.
>
> Is it? Why?

We have reasonably sophisticated accounting in pgstats of what newly live/dead
rows a transaction "creates". So an obvious (and wrong) idea is to just
decrement reltuples by the number of tuples removed by autovacuum. But we
can't do that, because inserted/deleted tuples reported by backends can be
removed by on-access pruning and vacuumlazy doesn't know about all changes
made by its call to heap_page_prune().

But I think that if we add a pgstat_count_heap_prune(nredirected, ndead,
nunused) around heap_page_prune() and a pgstat_count_heap_vacuum(nunused) in
lazy_vacuum_heap_page(), we'd likely end up with a better approximation than
what vac_estimate_reltuples() does, in the "partially scanned" case.

> I'm all in favor of doing that, of course. I just don't particularly
> think that it's related to this other problem. One problem is that we
> count dead tuples incorrectly because we don't account for the fact
> that things change while VACUUM runs. The other problem is that the
> thing that is counted isn't broken down into distinct subcategories of
> things -- things are bunched together that shouldn't be.

If we only adjust the counters incrementally, as we go, we'd not update them
at the end of vacuum. I think it'd be a lot easier to only update the counters
incrementally if we split ->dead_tuples into sub-counters.

So I don't think it's entirely unrelated.

You probably could get close without splitting the counters, by just pushing
down the counting, and only counting redirected and unused during heap
pruning. But I think it's likely to be more accurate with the split counter.

> Oh wait, you were thinking of what I said before -- my "awkward but
> logical question". Is that it?

I'm not quite following? The "awkward but logical" bit is in the email I'm
just replying to, right?

Greetings,

Andres Freund
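
To make the split-counter idea above concrete, here is a minimal standalone sketch in C. It is not PostgreSQL code: the struct and every `sketch_*` function are hypothetical names made up for illustration, and the two "stand-in" functions only mimic the pgstat_count_heap_prune() and pgstat_count_heap_vacuum() calls proposed in the message, which do not exist in PostgreSQL. It also assumes that every line pointer pruning redirects, marks dead, or marks unused corresponds to one previously reported dead tuple, which may not hold exactly. The not-yet-removable component is declared but not updated here.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical split of the single dead_tuples counter into the three
 * components named in the message. Not PostgreSQL's actual stats struct.
 */
typedef struct SketchDeadCounts
{
    int64_t dml_dead;       /* dead tuples reported by DML via pgstats */
    int64_t dead_items;     /* dead line pointers left behind by pruning */
    int64_t not_removable;  /* dead rows still needed by some snapshot */
} SketchDeadCounts;

/* Subtract without letting an approximate counter go negative. */
static int64_t
sketch_sub_clamped(int64_t value, int64_t delta)
{
    assert(delta >= 0);
    return (value > delta) ? value - delta : 0;
}

/* A backend reports tuples it made dead. */
static void
sketch_count_dml_dead(SketchDeadCounts *c, int64_t n)
{
    assert(n >= 0);
    c->dml_dead += n;
}

/*
 * Stand-in for the proposed pgstat_count_heap_prune(). Assumption: each
 * redirected, dead, or unused line pointer accounts for one dead tuple
 * whose storage was reclaimed; dead line pointers still await vacuum.
 */
static void
sketch_count_heap_prune(SketchDeadCounts *c, int64_t nredirected,
                        int64_t ndead, int64_t nunused)
{
    c->dml_dead = sketch_sub_clamped(c->dml_dead,
                                     nredirected + ndead + nunused);
    c->dead_items += ndead;
}

/* Stand-in for the proposed pgstat_count_heap_vacuum(). */
static void
sketch_count_heap_vacuum(SketchDeadCounts *c, int64_t nunused)
{
    c->dead_items = sketch_sub_clamped(c->dead_items, nunused);
}

/*
 * For contrast: an end-of-vacuum overwrite with what VACUUM observed,
 * which discards anything reported concurrently.
 */
static void
sketch_report_vacuum_overwrite(SketchDeadCounts *c, int64_t observed_dead)
{
    c->dml_dead = observed_dead;
}
```

With incremental updates, dead tuples reported by DML while a long vacuum is running stay in `dml_dead`, whereas a call to `sketch_report_vacuum_overwrite()` at the end would replace the counter with vacuum's own observation and drop them.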