Re. [HACKERS] Some notes on optimizer cost estimates

From: Xun Cheng
Subject: Re. [HACKERS] Some notes on optimizer cost estimates
Date:
Msg-id: 200001210219.SAA22377@xp10-06.dialup.commserv.ucsb.edu
Responses: Re: Re. [HACKERS] Some notes on optimizer cost estimates  (Don Baccus <dhogaza@pacifier.com>)
List: pgsql-hackers
I'm very glad you brought up this cost estimate issue.
Recent work in database research has argued that a more
detailed disk access cost model should be used for
large queries, especially joins.
Traditional cost estimation considers only the number of
disk pages accessed. A more detailed model would instead
consider three parameters: average seek, average latency,
and average page transfer time. For an older disk, typical
values are SEEK=9.5 ms, LATENCY=8.3 ms, TRANSFER=2.6 ms.
A sequential, continuous read of a table (assuming
1000 contiguous pages) would cost
SEEK + LATENCY + 1000*TRANSFER = 2617.8 ms, while quasi-randomly
reading 200 times at 2 contiguous pages per read would cost
SEEK + 200*LATENCY + 400*TRANSFER = 2709.5 ms.
Researchers at an IBM lab re-studied the traditional
ad hoc join algorithms (nested-loop, sort-merge, hash) using
this detailed cost model and found some interesting results.
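
To make the arithmetic concrete, here is a tiny sketch (my own
illustration, not from the paper) that plugs the example numbers
above into the three-parameter model:

    /* Sketch of the three-parameter disk cost model described above.
     * The constants are the example values quoted in this message,
     * not measurements of any particular drive. */
    #include <stdio.h>

    #define SEEK_MS     9.5   /* average seek time */
    #define LATENCY_MS  8.3   /* average rotational latency */
    #define TRANSFER_MS 2.6   /* average per-page transfer time */

    int main(void)
    {
        /* one seek, one latency, then 1000 pages streamed */
        double sequential = SEEK_MS + LATENCY_MS + 1000 * TRANSFER_MS;

        /* 200 quasi-random reads of 2 contiguous pages each: one
         * initial seek, a rotational latency per read, 400 transfers */
        double quasi_random = SEEK_MS + 200 * LATENCY_MS + 400 * TRANSFER_MS;

        printf("sequential scan of 1000 pages:   %.1f ms\n", sequential);
        printf("200 x 2-page quasi-random reads: %.1f ms\n", quasi_random);
        return 0;
    }

This prints 2617.8 ms and 2709.5 ms, matching the figures above.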

>I have been spending some time measuring actual runtimes for various
>sequential-scan and index-scan query plans, and have learned that the
>current Postgres optimizer's cost estimation equations are not very
>close to reality at all.

One interesting question I'd like to ask is whether this inaccuracy
really affects the plan choices of PostgreSQL's query optimizer,
and to what degree. My point is that if the optimizer estimates the
cost of a sequential scan at 10 and the cost of an index scan at 20
while the actual costs are 10 vs. 40, that should be OK, because
the optimizer would still choose the sequential scan, as it should.
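
To illustrate the point with a toy example (my own, not optimizer
code): an estimation error only matters if it inverts the relative
ordering of the candidate plans.

    #include <stdio.h>

    /* Hypothetical plan costs, purely for illustration. */
    struct plan
    {
        const char *name;
        double      estimated;  /* what the optimizer believes */
        double      actual;     /* what really happens at run time */
    };

    int main(void)
    {
        /* The case from the text: estimates 10 vs. 20, actuals 10 vs. 40.
         * The ranking is preserved, so the choice is still correct. */
        struct plan seq = { "seqscan", 10.0, 10.0 };
        struct plan idx = { "indexscan", 20.0, 40.0 };

        struct plan chosen = (seq.estimated <= idx.estimated) ? seq : idx;
        struct plan best = (seq.actual <= idx.actual) ? seq : idx;

        printf("optimizer picks %s, true best is %s\n",
               chosen.name, best.name);

        /* The estimate would only hurt if it flipped the ranking, e.g.
         * if the seqscan really cost 50 but was still estimated at 10. */
        return 0;
    }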

>1. Since main-table tuples are visited in index order, we'll be hopping
>around from page to page in the table.

I'm not sure about the implementation in PostgreSQL. One thing you
might be able to do is to first collect all the must-read page
addresses from the index scan and then sort them before doing the
actual page fetches in that order. That would at least avoid
reading the same page twice (not entirely true, depending on the
context (like in a join) and the algorithm).
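
A rough sketch of that idea, using hypothetical helpers (the
page-fetch callback below stands in for whatever the real routine
would be; none of these names are actual PostgreSQL functions):
gather the heap block numbers the index scan asks for, sort them,
skip duplicates, and fetch the pages in physical order.

    #include <stdio.h>
    #include <stdlib.h>

    /* Compare block numbers for qsort(). */
    static int cmp_block(const void *a, const void *b)
    {
        unsigned long x = *(const unsigned long *) a;
        unsigned long y = *(const unsigned long *) b;
        return (x > y) - (x < y);
    }

    /* Sort the block numbers collected from an index scan and fetch
     * each distinct page once, in physical order. */
    static void fetch_in_page_order(unsigned long *blocks, size_t n,
                                    void (*read_heap_page)(unsigned long))
    {
        size_t i;

        qsort(blocks, n, sizeof(unsigned long), cmp_block);
        for (i = 0; i < n; i++)
        {
            if (i > 0 && blocks[i] == blocks[i - 1])
                continue;           /* page already read */
            read_heap_page(blocks[i]);
        }
    }

    /* Toy page reader for demonstration. */
    static void print_page(unsigned long blk)
    {
        printf("reading block %lu\n", blk);
    }

    int main(void)
    {
        unsigned long blocks[] = { 42, 7, 42, 3, 7, 19 };

        fetch_in_page_order(blocks, sizeof(blocks) / sizeof(blocks[0]),
                            print_page);
        return 0;
    }

Of course, fetching in page order gives up the index ordering of
the tuples, so it only applies where that ordering is not needed.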

>The current cost estimation
>method essentially assumes that the buffer cache plus OS disk cache will
>be 100% efficient --- we will never have to read the same page of the
>main table twice in a scan, due to having discarded it between
>references.  This of course is unreasonably optimistic.  Worst case
>is that we'd fetch a main-table page for each selected tuple, but in
>most cases that'd be unreasonably pessimistic.

This is actually why I asked before whether PostgreSQL has a raw
disk facility. That way we would have much more control over this
caching issue. Of course, a raw disk facility will only be
worthwhile (besides the recoverability benefits) if we can provide
a caching algorithm better than the OS's, depending on the context
(like large joins).

Actually, I have another question for you, which is somewhat
related to this cost estimation issue. You know the difference
between OLTP and OLAP. My question is whether PostgreSQL targets
both kinds of applications or just OLTP. From what I know, OLTP
and OLAP differ greatly in query characteristics and therefore in
how they should be optimized. If PostgreSQL targets only OLTP,
the cost estimation issue above might not be that important. For
OLAP, however, large tables and large queries are common and
optimization is difficult.

xun


