Обсуждение: is it possible to for the planner to optimize this form?

Поиск
Список
Период
Сортировка

is it possible to for the planner to optimize this form?

От
"Merlin Moncure"
Дата:
Right now, I am having trouble getting the planner to optimize queries
in the form of

select t.key, t.field from t a
    where
    (
        select count(*) from t b
        where b.field > a.field
    ) = k

The subplan (either index or seq. scan) executes once for each row in t,
which of course takes forever.

This query is a way of achieving LIMIT type results (substitute n-1
desired rows for k) using standard SQL, which is desirable in some
circumstances.  Is it theoretically possible for this to be optimized?

Merlin


Re: is it possible to for the planner to optimize this form?

От
Josh Berkus
Дата:
Merlin,

> select t.key, t.field from t a
>     where
>     (
>         select count(*) from t b
>         where b.field > a.field
>     ) = k
>
> The subplan (either index or seq. scan) executes once for each row in t,
> which of course takes forever.
>
> This query is a way of achieving LIMIT type results (substitute n-1
> desired rows for k) using standard SQL, which is desirable in some
> circumstances.  Is it theoretically possible for this to be optimized?

I don't think so, no.   PostgreSQL does have some issues using indexes for
count() queires which makes the situation worse.   However, with the query
you presented, I don't see any way around the planner executing the subquery
once for every row in t.

Except, of course, for some kind of scheme involving materialized views, if
you don't need up-to-the minute data.   In that case, you could store in a
table the count(*)s of t for each threshold value of b.field.  But,
dynamically, that would be even slower.

--
Josh Berkus
Aglio Database Solutions
San Francisco