Re: random_page_cost vs seq_page_cost

Поиск
Список
Период
Сортировка
От Greg Smith
Тема Re: random_page_cost vs seq_page_cost
Дата
Msg-id 4F31A05A.1060506@2ndQuadrant.com
обсуждение исходный текст
Ответ на Re: random_page_cost vs seq_page_cost  (Bruce Momjian <bruce@momjian.us>)
Ответы Re: random_page_cost vs seq_page_cost  (Bruce Momjian <bruce@momjian.us>)
Re: random_page_cost vs seq_page_cost  (Jeff Janes <jeff.janes@gmail.com>)
Список pgsql-hackers
On 02/07/2012 03:23 PM, Bruce Momjian wrote:
> Where did you see that there will be an improvement in the 9.2
> documentation?  I don't see an improvement.

I commented that I'm hoping for an improvement in the documentation of 
how much timing overhead impacts attempts to measure this area better.  
That's from the "add timing of buffer I/O requests" feature submission.  
I'm not sure if Bene read too much into that or not; I didn't mean to 
imply that the docs around random_page_cost have gotten better.

This particular complaint is extremely common though, seems to pop up on 
one of the lists a few times each year.  Your suggested doc fix is fine 
as a quick one, but I think it might be worth expanding further on this 
topic.  Something discussing SSDs seems due here too.  Here's a first 
draft of a longer discussion, to be inserted just after where it states 
the default value is 4.0:

True random access to mechanical disk storage will normally be more 
expensive than this default suggests.  The value used is lower to 
reflect caching effects.  Some common random accesses to disk, such as 
indexed reads, are considered likely to be in cache.  The default value 
can be thought of as modeling random access as 40 times as expensive as 
sequential, while expecting that 90% of random reads will actually be 
cached.

If you believe a high cache rate is an incorrect assumption for your 
workload, you might increase random_page_cost to closer reflect the true 
cost of random reads against your storage.  Correspondingly, if your 
data is likely to be completely cached, such as when the database is 
smaller than the total memory in the server, decreasing random_page_cost 
can be appropriate.  Storage where the true cost of random reads is low, 
such as solid-state drives and similar memory-based devices, might also 
find lower values of random_page_cost better reflect the real-world cost 
of that operation.

===

I think of the value as being more like 80 times as expensive and a 95% 
hit rate, but the above seems more likely to turn into understandable 
math to a first-time reader of this section.  I stopped just short of 
recommending a value for the completely cached case.  I normally use 
1.01 there; I know others prefer going fully to 1.0 instead.  That 
argument seems like it could rage on for some time.

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andrew Dunstan
Дата:
Сообщение: Re: [GENERAL] pg_dump -s dumps data?!
Следующее
От: Bruce Momjian
Дата:
Сообщение: Re: pgindent README correction