Re: Inefficient queryplan for query with intersectable

Поиск

Список

Период

Сортировка

От	Tom Lane
Тема	Re: Inefficient queryplan for query with intersectable
Дата	26 августа 2005 г. 20:07:08
Msg-id	26436.1125096976@sss.pgh.pa.us обсуждение исходный текст
Ответ на	Re: Inefficient queryplan for query with intersectable (Arjen van der Meijden <acmmailing@tweakers.net>)
Ответы	Re: Inefficient queryplan for query with intersectable
Список	pgsql-performance

Дерево обсуждения

Arjen van der Meijden <acmmailing@tweakers.net> writes:
> As said, it chooses sequential scans or "the wrong index plans" over a
> perfectly good plan that is just not selected when the parameters are
> "too well tuned" or sequential scanning of the table is allowed.

I think some part of the problem comes from using inconsistent
datatypes.  For instance, it seems very odd that the thing is not
using a hash or something to handle

 t_0.Cat2 IN (SELECT 545 UNION SELECT ID FROM cat WHERE ParentID = 545)

seeing that it correctly guesses there are only going to be about 8 rows
in the union.  Part of the reason is that cat2 is smallint, whereas the
output of the union must be at least int, maybe wider depending on the
datatype of cat.id (which you did not show us); so the comparison isn't
hashable.  Even a smallint vs int comparison would be mergejoinable,
though, so I'm really wondering what cat.id is.

Another big part of the problem comes from poor result size estimation.
I'm not sure you can eliminate that entirely given the multiple
conditions on different columns (which'd require cross-column statistics
to really do well, which we do not have).  But you could avoid
constructs like

    WHERE ... t_1.recordtimestamp >=
      (SELECT max_date - 60 FROM last_dates WHERE table_name = 'pricetracker')

The planner is basically going to throw up its hands and make a default
guess on the selectivity of this; it's not smart enough to decide that
the sub-select probably represents a constant.  What I'd do with this
is to define a function marked STABLE for the sub-select result, perhaps
something like

create function get_last_date(tabname text, offsetdays int)
returns timestamp as $$
SELECT max_date - $2 FROM last_dates WHERE table_name = $1
$$ language sql strict stable;

(I'm guessing as to datatypes and the amount of parameterization you
need.)  Then write the query like

    WHERE ... t_1.recordtimestamp >= get_last_date('pricetracker', 60)

In this formulation the planner will be able to make a reasonable guess
about how many rows will match ... at least if your statistics are up
to date ...

            regards, tom lane

В списке pgsql-performance по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Inefficient queryplan for query with intersectable