Re: More thoughts about planner's cost estimates

Поиск
Список
Период
Сортировка
От Hannu Krosing
Тема Re: More thoughts about planner's cost estimates
Дата
Msg-id 1149362295.3889.29.camel@localhost.localdomain
обсуждение исходный текст
Ответ на Re: More thoughts about planner's cost estimates  (Greg Stark <gsstark@mit.edu>)
Ответы Re: More thoughts about planner's cost estimates
Список pgsql-hackers
Ühel kenal päeval, R, 2006-06-02 kell 16:23, kirjutas Greg Stark:

> And a 5% sample is a pretty big. In fact my tests earlier showed the i/o from
> 5% block sampling took just as long as reading all the blocks. Even if we
> figure out what's causing that (IMHO surprising) result and improve matters I
> would only expect it to be 3-4x faster than a full scan.

You should not be surprised by this once you visualise what happens at
the disk level with all those platters spinning and heads moving :) 

Disks can read at full rotation speed, so skipping (not reading) some
blocks will not make reading the remaining blocks from the same track
faster. And if there are more than 20 8k pages per track, you still have
a very high probablility you need to read all tracks..

You may be able to move to the next track a little earlier compared to
reading all blocks, but then you are likely to miss the block from next
track and have to wait a full rotation.

You will get some win from skipping pages only once your % falls so low
that you can also skip a significant number of tracks.

> http://archives.postgresql.org/pgsql-hackers/2006-01/msg00285.php

Your test program could have got a little better results, if you had
somehow managed to tell the system all the block numbers to read in one
go, not each time the next one after hetting the previous one. In
current version it is quite likely that it had to wait several disk
rotations for even the sectors from the same track, as for small steps
it may have missed the next sector. It does not apply for disks which
always read a full track in RAM cache, but even there all tracks are
actually read.

The fact that 5% was not slower than seqscan seems to indicate that
actually all track reads were cached inside the disk or controller.

-- 
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com




В списке pgsql-hackers по дате отправления:

Предыдущее
От: Greg Stark
Дата:
Сообщение: Re: COPY (query) TO file
Следующее
От: Mike Benoit
Дата:
Сообщение: Re: More thoughts about planner's cost estimates