Re: Tuning planner cost estimates

Поиск
Список
Период
Сортировка
От Jim C. Nasby
Тема Re: Tuning planner cost estimates
Дата
Msg-id 20050520224917.GN44623@decibel.org
обсуждение исходный текст
Ответ на Re: Tuning planner cost estimates  (Josh Berkus <josh@agliodbs.com>)
Список pgsql-performance
On Fri, May 20, 2005 at 03:23:16PM -0700, Josh Berkus wrote:
> Jim,
>
> > Well, that raises an interesting issue, because AFAIK none of the cost
> > estimate functions currently do that. Heck, AFAIK even the piggyback
> > seqscan code doesn't take other seqscans into account.
>
> Sure.   But you're striving for greater accuracy, no?
>
> Actually, all that's really needed in the way of concurrent activity is a
> calculated factor that lets us know how likely a particular object is to be
> cached, either in the fs cache or the pg cache (with different factors for
> each presumably) based on history.   Right now, that's based on
> estimated_cache_size, which is rather innacurate: a table which is queried
> once a month has the exact same cost factors as one which is queried every
> 2.1 seconds.  This would mean an extra column in pg_stats I suppose.

True, though that's a somewhat different issue that what the load on the
box is (see the reply I just posted). Load on the box (particuarly IO
load) will also play a factor for things; for example, it probably means
seqscans end up costing a lot more than random IO does, because the disk
heads are being sent all over the place anyway.

> > But ultimately, I'm not sure if this is really required or not, because
> > I don't see that we need to use explain when running queries. In fact,
> > it's possibly desireable that we don't, because of the overhead it
> > incurs. We would want to log an explain (maybe analyze) just to make
> > sure we knew what the optimizer was doing, but I think we shouldn't need
> > the info to produce cost estimates.
>
> Well, the problem is that you need to know how much time the index scan took
> vs. other query steps.   I don't see a way to do this other than an anayze.

True, but that can be done by a seperate seqscan step. I would argue
that doing it that way is actually more accurate, because the overhead
of explain analyze is huge and tends to swamp other factors out. As I
mentioned in my other email, my tests show explain analyze select * from
table is 5x slower than select count(*) from table.
--
Jim C. Nasby, Database Consultant               decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

В списке pgsql-performance по дате отправления:

Предыдущее
От: "Jim C. Nasby"
Дата:
Сообщение: Re: Tuning planner cost estimates
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Tuning planner cost estimates