Re: Should we update the random_page_cost default value?
From: Tomas Vondra
Subject: Re: Should we update the random_page_cost default value?
Msg-id: 8346d349-92b7-4725-8c07-8eb68b9d58c8@vondra.me
In reply to: Re: Should we update the random_page_cost default value? (Robert Haas <robertmhaas@gmail.com>)
Replies: Re: Should we update the random_page_cost default value?
List: pgsql-hackers
On 10/8/25 19:23, Robert Haas wrote:
> On Wed, Oct 8, 2025 at 12:24 PM Tomas Vondra <tomas@vondra.me> wrote:
>> I don't think there's all that much disagreement, actually. This is a
>> pretty good illustration that we're using random_page_cost to account
>> for things other than "I/O cost" (like the expected cache hit ratios),
>> because we simply don't have a better knob for that.
>
> I agree with that conclusion.
>
>> Isn't this somewhat what effective_cache_size was meant to do? That
>> obviously does not know what fraction of individual tables is cached,
>> but it does impose a size limit.
>
> Not really, because effective_cache_size only models the fact that
> when you iterate the same index scan within the execution of a single
> query, it will probably hit some pages more than once. It doesn't have
> any idea that anything other than an index scan might hit the same
> pages more than once, and it doesn't have any idea that a query might
> find data in cache as a result of previous queries. Also, when it
> thinks the same page is accessed more than once, the cost of
> subsequent accesses is 0.
>
> I could be wrong, but I kind of doubt that there is any future in
> trying to generalize effective_cache_size. It's an extremely
> special-purpose mechanism, and what we need here is more of a general
> approach that can cut across the whole planner -- or alternatively we
> can decide that things are fine and that having rpc/spc implicitly
> model caching behavior is good enough.
>
>> I think in the past we mostly assumed we can't track cache size per
>> table, because we have no visibility into the page cache. But maybe
>> direct I/O would change this?
>
> I think it's probably going to work out really poorly to try to use
> cache contents for planning. The plan may easily last much longer than
> the cache contents.

Why wouldn't that trigger invalidations / replanning, just like other
types of stats? I imagine we'd regularly collect stats about what's
cached, etc., and we'd invalidate stale plans just like after ANALYZE.

Just a random idea, though. We'd need some sort of summary anyway; it's
not plausible that each backend would collect this info on its own.

regards

--
Tomas Vondra
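As a concrete illustration of the "rpc/spc implicitly model caching" point: lowering random_page_cost tells the planner that random reads cost almost the same as sequential ones, i.e. "assume the pages are cached", and that alone can flip a plan from a sequential or bitmap scan to an index scan. A minimal psql sketch; the table t, the row counts, and the predicate are hypothetical, not from this thread:

    -- hypothetical table, large enough that the plan choice matters
    CREATE TABLE t (id int PRIMARY KEY, payload text);
    INSERT INTO t SELECT g, repeat('x', 100)
      FROM generate_series(1, 1000000) g;
    ANALYZE t;

    -- the default random_page_cost = 4.0 assumes mostly-uncached random
    -- I/O, so a mid-selectivity range scan tends to get a seq/bitmap scan
    EXPLAIN SELECT * FROM t WHERE id BETWEEN 1000 AND 200000;

    -- claiming random reads are nearly as cheap as sequential ones
    -- (i.e. "assume cached") makes the index scan win more often
    SET random_page_cost = 1.1;
    EXPLAIN SELECT * FROM t WHERE id BETWEEN 1000 AND 200000;

    -- the same knob can be scoped per tablespace, which is how one
    -- approximates "these tables are cached / on fast storage" today
    ALTER TABLESPACE pg_default SET (random_page_cost = 1.1);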
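The invalidation machinery referred to above already exists for ordinary statistics: ANALYZE sends a relation-cache invalidation, and cached plans for prepared statements are rebuilt on their next execution. A sketch of that existing behavior, reusing the hypothetical table t:

    -- force a generic (cached) plan so the invalidation is observable;
    -- by default the first few executions use one-off custom plans
    SET plan_cache_mode = force_generic_plan;
    PREPARE q AS SELECT * FROM t WHERE id < $1;
    EXPLAIN EXECUTE q(100);   -- generic plan is built and cached
    ANALYZE t;                -- fresh stats invalidate the cached plan
    EXPLAIN EXECUTE q(100);   -- the plan is rebuilt on next use

The idea in the message is that periodically collected cache-residency stats could feed the same invalidation path that ANALYZE uses here.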