Re: Help with unpredictable use of indexes on large tables...

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: Help with unpredictable use of indexes on large tables...
Дата
Msg-id 16624.1146180115@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Help with unpredictable use of indexes on large tables...  (Mike Leahy <mgleahy@alumni.uwaterloo.ca>)
Ответы Re: Help with unpredictable use of indexes on large tables...  (Mike Leahy <mgleahy@alumni.uwaterloo.ca>)
Re: Help with unpredictable use of indexes on large tables...  ("John D. Burger" <john@mitre.org>)
Список pgsql-general
Mike Leahy <mgleahy@alumni.uwaterloo.ca> writes:
> ... When I try to get the distinct number of
> years from these tables, it does a sequential scan to get two unique
> values from the "year" column in the *_edu table, but it uses an index
> scan to get a single unique value from the "year" column from the *_con
> table.  In both cases, I would have expected the index scan to be used.

You have a fundamental misunderstanding of what's going on here.  Both
plans fetch the entire table contents.  The difference is how the data
is brought into sorted order for the UNIQUE step --- either by an
explicit sort, or by scanning the table in index order.

A full-table index scan is usually pretty darn inefficient (too much
random access) and so it's often the case that the sort approach is
actually faster.

The two EXPLAINs you provided aren't compelling evidence of anything
wrong because they are for two different-sized tables ... but to the
extent that the results are comparable it appears that the planner is
actually biased in favor of the indexscan plan (cost divided by actual
time is way lower for the indexscan).

What you should look at is performance of the two approaches on the
*same* table (fool with enable_sort and/or enable_indexscan to force
the alternative choices) and then see whether it makes sense to tweak
the planner cost parameters for your installation.

Also I'd suggest trying

    select year from [table] group by year

which is capable of using a hash aggregation approach; that will likely
beat either of these plans.

            regards, tom lane

В списке pgsql-general по дате отправления:

Предыдущее
От: Mike Leahy
Дата:
Сообщение: Help with unpredictable use of indexes on large tables...
Следующее
От: Benjamin Smith
Дата:
Сообщение: Checking for Foreign Keys constraining a record?