On Wed, Dec 2, 2015 at 8:42 PM, Bruce Momjian <bruce@momjian.us> wrote:
> No one mentioned the random page docs so I will quote it here:
>
> http://www.postgresql.org/docs/9.5/static/runtime-config-query.html#RUNTIME-CONFIG-QUERY-CONSTANTS
>
> Random access to mechanical disk storage is normally much more expensive
> than four times sequential access. However, a lower default is used
> (4.0) because the majority of random accesses to disk, such as indexed
> reads, are assumed to be in cache. The default value can be thought of
> as modeling random access as 40 times slower than sequential, while
> expecting 90% of random reads to be cached.
>
> If you believe a 90% cache rate is an incorrect assumption for your
> workload, you can increase random_page_cost to better reflect the true
> cost of random storage reads. Correspondingly, if your data is likely to
> be completely in cache, such as when the database is smaller than the
> total server memory, decreasing random_page_cost can be appropriate.
> Storage that has a low random read cost relative to sequential, e.g.
> solid-state drives, might also be better modeled with a lower value for
> random_page_cost.
>
> What we don't have is a way to know how much is in the cache, not only at
> planning time, but at execution time. (Those times are often
> different for prepared queries.) I think that is the crux of what has
> to be addressed here.
I think that paragraph is more of an apology for the system that we've
got than a description of what a good one would look like. If I have
a 1MB table and a 1TB table, they are not equally likely to be cached.
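For reference, the model quoted above can be sanity-checked with a bit of
arithmetic (a sketch, not anything the planner actually computes; the variable
names here are made up for illustration):

```python
# Sanity check of the 40x / 90% model described in the PostgreSQL docs:
# random I/O is modeled as ~40x slower than sequential, but 90% of random
# reads are assumed to hit cache (treated here as essentially free).
raw_random_vs_seq = 40.0   # assumed raw slowdown of random vs sequential I/O
cache_hit_rate = 0.90      # assumed fraction of random reads served from cache

# Effective cost relative to a sequential page read (seq_page_cost = 1.0):
effective_random_page_cost = (1 - cache_hit_rate) * raw_random_vs_seq
print(effective_random_page_cost)  # ~4.0, matching the default random_page_cost
```

The same arithmetic shows why the docs' tuning advice points both ways: a
lower assumed hit rate pushes the effective cost above 4.0, while a fully
cached workload pushes it toward 1.0.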
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company