Re: index-only scans

Поиск
Список
Период
Сортировка
От Cédric Villemain
Тема Re: index-only scans
Дата
Msg-id CAF6yO=06JCc_-z8_Kh418-Xr3Hs9XU7g3hEw7tjjmHrA5m54-w@mail.gmail.com
обсуждение исходный текст
Ответ на Re: index-only scans  (Anssi Kääriäinen <anssi.kaariainen@thl.fi>)
Список pgsql-hackers
2011/8/16 Anssi Kääriäinen <anssi.kaariainen@thl.fi>:
> On 08/14/2011 12:31 AM, Heikki Linnakangas wrote:
>>>
>>> The same idea could of course be used to calculate the effective cache
>>> hit ratio for each table. Cache hit ratio would have the problem of feedback
>>> loops, though.
>>
>> Yeah, I'm not excited about making the planner and statistics more
>> dynamic. Feedback loops and plan instability are not fun.
>
> I might be a little out of my league here... But I was thinking about the
> cache hit ratio and feedback loops. I understand automatic tuning would be
> hard. But making automatic tuning easier (by using pg_tune for example)
> would be a big plus for most use cases.
>
> To make it easier to tune the page read costs automatically, it would be
> nice if there would be four variables instead of the current two:
>  - random_page_cost is the cost of reading a random page from storage.
> Currently it is not, it is the cost of accessing a random page, taking in
> account it might be in memory.
>  - seq_page_cost is the cost of reading pages sequentially from storage
>  - memory_page_cost is the cost of reading a page in memory
>  - cache_hit_ratio is the expected cache hit ratio
>
> memory_page_cost would be server global, random and seq page costs
> tablespace specific, and cache_hit_ratio relation specific. You would get
> the current behavior by tuning *_page_costs realistically, and setting
> cache_hit_ratio globally so that the expected random_page_cost /
> seq_page_cost stays the same as now.
>
> The biggest advantage of this would be that the correct values are much
> easier to detect automatically compared to current situation. This can be
> done using pg_statio_* views and IO speed testing. They should not be tuned
> automatically by PostgreSQL, at least not the cache_hit_ratio, as that leads
> to the possibility of feedback loops and plan instability. The variables
> would also be much easier to understand.
>
> There is the question if one should be allowed to tune the *_page_costs at
> all. If I am not missing something, it is possible to detect the correct
> values programmatically and they do not change if you do not change the
> hardware. Cache hit ratio is the real reason why they are currently so
> important for tuning.
>
> An example why the current random_page_cost and seq_page_cost tuning is not
> adequate is that you can only set random_page_cost per tablespace. That
> makes perfect sense if random_page_cost would be the cost of accessing a
> page in storage. But it is not, it is a combination of that and caching
> effects, so that it actually varies per relation (and over time). How do you
> set it correctly for a query where one relation is fully cached and another
> one not?
>
> Another problem is that if you use random_page_cost == seq_page_cost, you
> are effectively saying that everything is in cache. But if everything is in
> cache, the cost of page access relative to cpu_*_costs is way off. The more
> random_page_cost and seq_page_cost are different, the more they mean the
> storage access costs. When they are the same, they mean the memory page
> cost. There can be an order of magnitude in difference of a storage page
> cost and a memory page cost. So it is hard to tune the cpu_*_costs
> realistically for cases where sometimes data is in cache and sometimes not.
>
> Ok, enough hand waving for one post :) Sorry if this all is obvious /
> discussed before. My googling didn't turn out anything directly related,
> although these have some similarity:
>  - Per-table random_page_cost for tables that we know are always cached
> [http://archives.postgresql.org/pgsql-hackers/2008-04/msg01503.php]
>  - Script to compute random page cost
> [http://archives.postgresql.org/pgsql-hackers/2002-09/msg00503.php]
> -  The science of optimization in practical terms?
> [http://archives.postgresql.org/pgsql-hackers/2009-02/msg00718.php], getting
> really interesting starting from here:
> [http://archives.postgresql.org/pgsql-hackers/2009-02/msg00787.php]

late reply.
You can add this link to your list:
http://archives.postgresql.org/pgsql-hackers/2011-06/msg01140.php


--
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Magnus Hagander
Дата:
Сообщение: Re: Re: [BUGS] BUG #6189: libpq: sslmode=require verifies server certificate if root.crt is present
Следующее
От: Thom Brown
Дата:
Сообщение: Re: [pgsql-advocacy] Unlogged vs. In-Memory