Re: [SQL] Search optimisation

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: [SQL] Search optimisation
Дата
Msg-id 17438.945241027@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Search optimisation  (Olwen Williams <olwen@ihug.co.nz>)
Список pgsql-sql
Olwen Williams <olwen@ihug.co.nz> writes:
> [ why is query A so much faster than query B? ]

Hmm.  The system is correctly estimating that query A is much
faster than B:

> Nested Loop  (cost=54.91 rows=657 width=340)               ^^^^^^^^^^
vs
>  Hash Join  (cost=11830.17 rows=981357 width=340)              ^^^^^^^^^^^^^

so the question is why it thought this hashjoin plan was the best
available option for query B.  It looks to me like the problem has
to do with the estimated selectivities of the two restrictions on the
biblioitems table.  Query A recognizes that biblionumber='109' selects
just one item from biblioitems:

-> Index Scan using bibitbnoidx on biblioitems  (cost=2.03 rows=1 width=102)
             ^^^^^^
 

so it produces a good plan, even though a doubly nested loop would
be an extremely horrible plan if there were a lot of rows involved.
(Actually, this is the first complaint about bad plans I've ever seen
in which doubly-nested-loop was the *right* choice ;-) ...)

Meanwhile, query B thinks that isbn='031051911X' is going to accept
several thousand rows from biblioitems:

-> Index Scan using isbnidx on biblioitems (cost=203.35 rows=2988 width=102)
          ^^^^^^^^^
 

and because of that, it's choosing a plan that would be well adapted for
a lot of matching rows, yet is overkill (read inefficient) for a few
rows.

Now, I'm going to read between the lines and guess that the restriction
on isbn should yield only one biblioitems row; if this guess is wrong,
you can ignore all that follows.

If my guess is correct, then the blame lies with misestimation of the
selectivity of the isbn='XXX' clause.  You can double-check this by
doing 
explain select * from biblioitems where isbn='031051911X';

and seeing whether the estimated row count is close to what you
actually get from doing this query.

The first question has to be "have you done a VACUUM ANALYZE lately?".
If not, the planner is working in the dark about the distribution of
isbn values, and you can't fairly blame it for guessing conservatively
that it's going to have to deal with a lot of rows.  However, if you
have done a VACUUM ANALYZE and still get a ridiculous estimate, then
that's a bug that I'd like to try to fix.  Please contact me off-list
and we can pursue the details of why this is happening.
        regards, tom lane


В списке pgsql-sql по дате отправления:

Предыдущее
От: Olwen Williams
Дата:
Сообщение: Search optimisation
Следующее
От: UEBAYASHI Masao
Дата:
Сообщение: Re: [SQL] numbered table?