Re: [SQL] optimizer not using an index...

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: [SQL] optimizer not using an index...
Дата
Msg-id 6737.935765619@sss.pgh.pa.us
обсуждение исходный текст
Ответ на optimizer not using an index...  (Howie <caffeine@toodarkpark.org>)
Ответы Re: [SQL] optimizer not using an index...  (Howie <caffeine@toodarkpark.org>)
Список pgsql-sql
Howie <caffeine@toodarkpark.org> writes:
> explain reveals that postgres ( 6.5.0 ) isnt using some of my indexes,
> opting instead for a complete table scan ( and drastically slowing things
> down ).

Well, mumble.  The optimizer certainly needs work, but what's your
evidence that this query would be faster if done another way?  Hashing
the smaller tables, as it's doing, ought to be a pretty good strategy.

One way to check is to start your client with environment variable
setting PGOPTIONS="-fh" ("forbid hashjoin") to discourage the optimizer
from using hashes, then check the generated plan for the same query and
see what its actual runtime is.  That's likely to be a suboptimal plan
however, since it'll turn off *all* hashing.  The hash on customers is
probably the thing that's bothering you.

How many result rows do you actually get from this query?  If you
eliminate the customers table from the query, and just do the same join
among the remaining tables, how many rows do you get?  I suspect the
optimizer is drastically off in its estimate of ~300k result rows, and
that's contributing to the problem.

> doing a complete table scan on a column thats indexed isnt really that
> nice, especially since there are 12,000 entries in it.

But it's estimating it's going to have to probe that table 300k times,
which makes the hashjoin look mighty attractive...

> interestingly, when querying on "pincodes.code" instead of
> "customers.name", postgres does NOT use a full table scan; it uses the
> proper indexes:

> Hash Join  (cost=23.78 rows=5 width=222)

Note the drastic difference in the estimated result-row count; that's
undoubtedly what's changing the optimizer's choice of what to do.  You
haven't given us enough detail to understand why this query would be
(or at least seem) more selective than the other, however.

Anyway, this looks to me like it is probably a symptom of poor
selectivity estimation leading to bogus estimates of output row counts
leading to a nonoptimal plan choice.  I have been working on improving
the selectivity estimation for 6.6, and am looking for test cases to
check out the logic on.  Is your database small enough/non proprietary
enough that you could send me a dump?  Or could you strip it down to
a test case that still exhibits the same misbehavior?  If you don't
like either of those, perhaps you could grab a current snapshot, install
your data in a test postmaster, and report back on whether it acts any
different...
        regards, tom lane


В списке pgsql-sql по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: [SQL] Inheritance and DELETE
Следующее
От: Howie
Дата:
Сообщение: Re: [SQL] optimizer not using an index...