On Wed, 2002-04-17 at 17:16, Tom Lane wrote:
>
> It's entirely possible that the default value of random_page_cost is too
> high, at least for many modern machines. The experiments I did to get
> the 4.0 figure were done a couple years ago, on hardware that wasn't
> exactly new at the time. I have not heard of anyone else trying to
> measure it though.
>
> I don't think I have the source code I used anymore, but the principle
> is simple enough:
>
> 1. Make a large file (several times the size of your machine's RAM, to
> ensure you swamp out kernel disk buffering effects). Fill with random
> data. (NB: do not fill with zeroes, some filesystems optimize this away.)
People running postgres often already have large files of random data
under the $PGDATA directory :)
> 2. Time reading the file sequentially, 8K per read request.
> Repeat enough to get a statistically trustworthy number.
>
> 3. Time reading randomly-chosen 8K pages from the file. Repeat
> enough to get a trustworthy number (the total volume of pages read
> should be several times the size of your RAM).
>
> 4. Divide.
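The four steps above can be sketched in a few lines of Python. To be clear, this is not Tom's original program, and the little temp file at the bottom is just a placeholder to make the sketch self-contained -- per step 1, a real measurement needs a file several times the size of your RAM, pre-filled with random data:

```python
# Rough sketch of the measurement described above -- NOT the original
# program.  For a real run, point `path` at a multi-gigabyte file of
# random data; the tiny temp file below only makes the sketch runnable.
import os, random, tempfile, time

BLOCK = 8192  # 8K, matching the Postgres page size

def avg_seq_read(path):
    """Average seconds per sequential 8K read over the whole file (step 2)."""
    with open(path, "rb") as f:
        n = 0
        t0 = time.perf_counter()
        while f.read(BLOCK):
            n += 1
        return (time.perf_counter() - t0) / n

def avg_rand_read(path, nreads):
    """Average seconds per randomly positioned 8K read (step 3)."""
    nblocks = os.path.getsize(path) // BLOCK
    with open(path, "rb") as f:
        t0 = time.perf_counter()
        for _ in range(nreads):
            f.seek(random.randrange(nblocks) * BLOCK)
            f.read(BLOCK)
        return (time.perf_counter() - t0) / nreads

# Toy demo file -- replace with your real test file (several x RAM).
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(os.urandom(BLOCK * 64))

ratio = avg_rand_read(path, 64) / avg_seq_read(path)  # step 4: divide
print("random_page_cost estimate: %.1f" % ratio)
os.remove(path)
```

On a toy file like this everything comes out of kernel cache, so the ratio will be near 1.0 and meaningless -- which is exactly Tom's point about needing a test large enough to swamp the cache.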
>
> The only tricky thing about this is making sure you are measuring disk
> access times and not being fooled by re-accessing pages the kernel still
> has cached from a previous access. (The PG planner does try to account
> for caching effects, but that's a separate estimate; the value of
> random_page_cost isn't supposed to include caching effects.) AFAIK the
> only good way to do that is to use a large test, which means it takes
> awhile to run; and you need enough spare disk space for a big test file.
If you have the machine all to yourself you can usually tell it to use
less RAM at boot time, so the test file doesn't have to be as big to
swamp the kernel cache.
On Linux that is the append="mem=32M" switch in lilo.conf, or just
mem=32M on the LILO boot command line.
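For instance, a minimal lilo.conf stanza (kernel path and label are the usual defaults, adjust to taste):

```
image=/boot/vmlinuz
    label=linux
    append="mem=32M"
```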
> It'd be interesting to get some numbers for this across a range of
> hardware, filesystems, etc ...
---------------
Hannu