Re: Warm-up cache may have its virtue

Поиск
Список
Период
Сортировка
От Greg Stark
Тема Re: Warm-up cache may have its virtue
Дата
Msg-id 87k6dc325w.fsf@stark.xeocode.com
обсуждение исходный текст
Ответ на Re: Warm-up cache may have its virtue  ("Qingqing Zhou" <zhouqq@cs.toronto.edu>)
Ответы Re: Warm-up cache may have its virtue  (Qingqing Zhou <zhouqq@cs.toronto.edu>)
Список pgsql-hackers
"Qingqing Zhou" <zhouqq@cs.toronto.edu> writes:

> For b1, it actually doesn't matter much though. With bitmap we definitely 
> can give a better EXPLAIN numbers for seqscan only, but without the bitmap, 
> we seldom make wrong choice of choosing/not choosing sequential scan. 

I think you have a more severe problem than that. 

It's not sequential scans that we have trouble estimating. Most of their
blocks will be uncached and they'll be read sequentially. Both of these
factors make estimating their costs pretty straightforward.

It's the index scans that are the problem. Index scans look bad to the
optimizer because they're random access, but they often have very high cache
hit rates because they access relatively few blocks and often they're hot (the
DBA did after all feel compelled to create the index in the first place).
Moreover they're often inside Nested Loop plans which causes many of those
blocks to be accessed repeatedly within the loop.

And the cache hit rate matters *a lot* for index scans since a cache hit means
the block won't be affected by the random access penalty. That is, it the
cache speedup will help both sequential and index scans but skipping the seek
only helps the index scan. 

And that's true regardless of whether it's found in Postgres's buffer cache or
has to be read in from the filesystem cache. So you won't really be able to
tell how many seeks are avoided without knowing whether the block is in the
filesystem cache. 

In other words, the difference between being in Postgres's buffer cache and
being in the filesystem cache, while not insignificant, isn't really relevant
to the planner since it affects sequential scans and index scans equally. It's
the difference between being in either cache versus requiring disk i/o that
affects index scans disproportionately.

And worse, it doesn't really matter whether it's in the cache when the query
is planned. It matters whether it'll be in the cache when the access is made.
If the node is inside a Nested Loop then subsequent trips through the loop the
same blocks may end up being read and they may all be cached.

-- 
greg



В списке pgsql-hackers по дате отправления:

Предыдущее
От: "Qingqing Zhou"
Дата:
Сообщение: Re: Warm-up cache may have its virtue
Следующее
От: Qingqing Zhou
Дата:
Сообщение: Re: Warm-up cache may have its virtue