Thread: AW: [HACKERS] Some notes on optimizer cost estimates


AW: [HACKERS] Some notes on optimizer cost estimates

From: Zeugswetter Andreas SB
Date:
> > Couldn't we test some of these parameters inside configure and set
> > them there?
> 
> If we could figure out a reasonably cheap way of estimating these
> numbers, it'd be worth setting up custom values at installation time.

Imho this whole idea is not so good. (Sorry)

My points are:
1. even if it is good for an optimizer to be smart, it is even more important that it is predictable
2. I compile on test machine, production is completely different (more processors, faster disks and controllers)

Andreas


Re: AW: [HACKERS] Some notes on optimizer cost estimates

From: Tom Lane
Date:
Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at> writes:
> My points are:
> 1. even if it is good for an optimizer to be smart,
>     it is even more important, that it is predictable

A good point indeed.  And unless we find that there is a huge range in
the ratios across different machines, we'd be wasting our time trying to
calibrate the numbers for a particular machine --- we could just as well
use an average value.  The optimizer has many other, far worse, sources
of error than that.

> 2. I compile on test machine, production is completely different
>     (more processors, faster disks and controllers)

In practice we'd do this at initdb time, not configure time, so I'm
not sure that that's really an issue.  But your other point is
well taken.
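
For illustration only, here is a minimal sketch of what such an initdb-time
probe might look like; the file name, block size, and block count are
arbitrary assumptions, and this is not actual initdb code:

/*
 * Hypothetical probe: estimate the random-vs-sequential page fetch ratio
 * by timing two passes over a pre-created test file ("probe.dat").
 */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <time.h>

#define BLCKSZ  8192
#define NBLOCKS 4096            /* 32 MB test file, assumed to exist */

static double
elapsed(struct timespec a, struct timespec b)
{
    return (b.tv_sec - a.tv_sec) + (b.tv_nsec - a.tv_nsec) / 1e9;
}

int
main(void)
{
    char        buf[BLCKSZ];
    struct timespec t0, t1;
    int         fd = open("probe.dat", O_RDONLY);

    if (fd < 0)
        return 1;

    /* Sequential pass: read every block in order. */
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < NBLOCKS; i++)
        pread(fd, buf, BLCKSZ, (off_t) i * BLCKSZ);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    double      seq = elapsed(t0, t1);

    /* Random pass: read the same number of blocks in scattered order. */
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < NBLOCKS; i++)
        pread(fd, buf, BLCKSZ, (off_t) (rand() % NBLOCKS) * BLCKSZ);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    double      rnd = elapsed(t0, t1);

    printf("estimated random/sequential ratio: %.2f\n", rnd / seq);
    close(fd);
    return 0;
}

The catch, of course, is that after the sequential pass the test file may sit
entirely in the OS cache, so an honest measurement needs a file much larger
than RAM, which is exactly what makes a "reasonably cheap" estimate hard.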
        regards, tom lane


Re: AW: [HACKERS] Some notes on optimizer cost estimates

From: Philip Warner
Date:
At 13:17 24/01/00 -0500, Tom Lane wrote:
>Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at> writes:
>> My points are:
>> 1. even if it is good for an optimizer to be smart,
>>     it is even more important, that it is predictable
>
>A good point indeed.  And unless we find that there is a huge range in
>the ratios across different machines, we'd be wasting our time trying to
>calibrate the numbers for a particular machine --- we could just as well
>use an average value.  The optimizer has many other, far worse, sources
>of error than that.
>
>> 2. I compile on test machine, production is completely different
>>     (more processors, faster disks and controllers)
>
>In practice we'd do this at initdb time, not configure time, so I'm
>not sure that that's really an issue.  But your other point is
>well taken.

I would guess it would become an issue if a server is upgraded (better/more
disks, faster CPU, etc.). This could be fixed by storing the optimizer
settings in a system table in the DB, and reading them the first time a
backend opens it. Just an idea.

If you *do* go with the 'store them in the DB' solution, then you also need
to provide a way of updating them (SQL, presumably), and a utility to
refresh them based on the current hardware.
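
As a sketch of what that refresh utility might amount to: assume a
hypothetical settings table along the lines of pg_cost_settings(name, value)
that backends read when they first open the database. None of these names
exist as such, and a real utility would measure the hardware rather than
hard-code values:

/*
 * Hypothetical refresh utility: emit plain SQL against an assumed
 * per-database settings table that backends would read on first open.
 */
#include <stdio.h>

static void
emit_setting(const char *name, double value)
{
    printf("UPDATE pg_cost_settings SET value = %g WHERE name = '%s';\n",
           value, name);
}

int
main(void)
{
    /* Placeholder values; a real tool would derive them from measurements. */
    emit_setting("random_page_cost", 4.0);
    emit_setting("cpu_tuple_cost", 0.01);
    emit_setting("cpu_operator_cost", 0.0025);
    return 0;
}

Its output could simply be piped into psql after a hardware upgrade.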

What this then amounts to is 'hand-tuning' of optimizer settings, which is
an old chestnut that I would like to see reconsidered - sometimes, for
specific important queries, it is very good to be able to tell the DB how
to go about satisfying the query (or at least, what join order to use and
which indices to scan). Is this so far removed from the above,
philosophically? Is it worth considering?


Bye for now,

Philip Warner.

----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.C.N. 008 659 498)             |          /(@)   ______---_
Tel: +61-03-5367 7422            |                 _________  \
Fax: +61-03-5367 7430            |                 ___________ |
Http://www.rhyme.com.au          |                /           \|
                                 |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/


Re: AW: [HACKERS] Some notes on optimizer cost estimates

From: Peter Eisentraut
Date:
On 2000-01-24, Zeugswetter Andreas SB mentioned:

> 
> > > Couldn't we test some of these parameters inside configure and set
> > > them there?
> > 
> > If we could figure out a reasonably cheap way of estimating these
> > numbers, it'd be worth setting up custom values at installation time.
> 
> Imho this whole idea is not so good. (Sorry)
> 
> My points are:
> 1. even if it is good for an optimizer to be smart,
>     it is even more important, that it is predictable

ISTM that by the nature of things the most important capability of an
optimizer is to yield optimal results. This, however, does not have to be
mutually exclusive with predictability. If you estimate some CPU and disk
parameters and write them into a config file, then you can always give
this config file to a bug fixer. It will still work on his machine, just
less than optimally.
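
One way to picture it: the backend keeps compiled-in average values and only
overrides them if a config file is present, so a missing or copied file
degrades gracefully rather than breaking anything. A sketch, with
illustrative parameter names and defaults (not actual PostgreSQL settings)
and a made-up file format of one "name value" pair per line:

/*
 * Sketch: per-installation cost parameters kept outside the binary.
 * Names, defaults, and the file format are assumptions for illustration.
 */
#include <stdio.h>
#include <string.h>

typedef struct CostParams
{
    double      random_page_cost;   /* random fetch relative to sequential */
    double      cpu_tuple_cost;     /* per-tuple CPU cost vs. a page fetch */
    double      cpu_operator_cost;  /* per-qual-operator CPU cost */
} CostParams;

/* Compiled-in averages, used when no config file is found. */
static CostParams params = {4.0, 0.01, 0.0025};

static void
load_cost_params(const char *path)
{
    FILE       *fp = fopen(path, "r");
    char        name[64];
    double      value;

    if (fp == NULL)
        return;                 /* keep the averages */

    while (fscanf(fp, "%63s %lf", name, &value) == 2)
    {
        if (strcmp(name, "random_page_cost") == 0)
            params.random_page_cost = value;
        else if (strcmp(name, "cpu_tuple_cost") == 0)
            params.cpu_tuple_cost = value;
        else if (strcmp(name, "cpu_operator_cost") == 0)
            params.cpu_operator_cost = value;
    }
    fclose(fp);
}

int
main(void)
{
    load_cost_params("optimizer_costs.conf");
    printf("random_page_cost = %g\n", params.random_page_cost);
    return 0;
}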

> 2. I compile on test machine, production is completely different
>     (more processors, faster disks and controllers)

You're completely right. This has no place in configure. It will have to
be a separate tool which you can run after building and installing.


-- 
Peter Eisentraut                  Sernanders väg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden



Re: AW: [HACKERS] Some notes on optimizer cost estimates

From: Don Baccus
Date:
At 12:16 AM 1/26/00 +0100, Peter Eisentraut wrote:
>On 2000-01-24, Zeugswetter Andreas SB mentioned:

>> My points are:
>> 1. even if it is good for an optimizer to be smart,
>>     it is even more important, that it is predictable
>
>ISTM that by the nature of things the most important capability of an
>optimizer is to yield optimal results.

One problem, though, is that the optimization problem's even more
intractable for database systems than it is for compilers.  Large and
growing database installations go through an evolutionary process of
adding memory, spindles, partitioning of data, etc.  Changing from the
"-B" default to "-B 2000", as I've done on my web site, causes the
database to live in shared memory even when other activity on the
system would tend to cause some of it to be flushed from the filesystem
cache.  This changes how long, on average, it takes to read a block.

And the first time a table's referenced is always going to take longer
than subsequent references if there's caching involved...so in 
theory the optimizer should take that into account if it makes presumptions
about cache hit ratios.

And so on and so forth.

I'm not disagreeing in the least with the spirit of your comment.  There
are all sorts of practical barriers...

Thankfully, for the particular problem of choosing between an index
vs. sequential scan, the optimizer only has two choices to make.  Thus
a rough-and-ready heuristic based on certain assumptions might work well,
which is essentially what the optimizer does today.  In fact, it does
work quite well, come to think of it!

Fiddling the numbers underlying the assumptions may be good enough for
this task.
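
For concreteness, a toy version of that kind of rough-and-ready comparison,
with made-up parameter names and a deliberately crude index-scan model (one
random page fetch per matching tuple, no credit for caching); it is not the
optimizer's actual cost function:

/*
 * Crude index-vs-sequential-scan heuristic.  All numbers and names are
 * illustrative assumptions, not PostgreSQL's real cost model.
 */
#include <stdbool.h>
#include <stdio.h>

static bool
index_scan_wins(double rel_pages,        /* pages in the table */
                double rel_tuples,       /* rows in the table */
                double selectivity,      /* fraction of rows matched */
                double random_page_cost, /* random fetch vs. sequential */
                double cpu_tuple_cost)   /* per-tuple CPU vs. a page fetch */
{
    /* Sequential scan: read every page once, examine every tuple. */
    double      seq_cost = rel_pages + cpu_tuple_cost * rel_tuples;

    /* Index scan, pessimistically: one random fetch per matching tuple. */
    double      matched = selectivity * rel_tuples;
    double      idx_cost = (random_page_cost + cpu_tuple_cost) * matched;

    return idx_cost < seq_cost;
}

int
main(void)
{
    /* A 10000-page, one-million-row table under the assumed parameters. */
    printf("0.1%% selectivity -> %s\n",
           index_scan_wins(10000, 1e6, 0.001, 4.0, 0.01) ? "index" : "seqscan");
    printf("10%% selectivity  -> %s\n",
           index_scan_wins(10000, 1e6, 0.10, 4.0, 0.01) ? "index" : "seqscan");
    return 0;
}

Under these assumed numbers the index scan only pays off for quite selective
quals, which is the flavour of tradeoff the built-in constants encode.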

> This, however, does not have to be
>mutually exclusive with predictability. If you estimate some CPU and disk
>parameters and write them into a config file, then you can always give
>this config file to a bug fixer. It will still work on his machine, just
>less than optimally.
>
>> 2. I compile on test machine, production is completely different
>>     (more processors, faster disks and controllers)
>
>You're completely right. This has no place in configure. It will have to
>be a separate tool which you can run after building and installing.

Based perhaps on statistics gathered while the system is running...



- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Service
  and other goodies at http://donb.photo.net.