Re: Two "equivalent" WITH RECURSIVE queries, one of them slow.

Поиск
Список
Период
Сортировка
От Robert Haas
Тема Re: Two "equivalent" WITH RECURSIVE queries, one of them slow.
Дата
Msg-id AANLkTinE_BZ4QSo-uZcQBKzPTSPcYKQ0wOS81v0Jrkwd@mail.gmail.com
обсуждение исходный текст
Ответ на Two "equivalent" WITH RECURSIVE queries, one of them slow.  ("Octavio Alvarez" <alvarezp@alvarezp.ods.org>)
Список pgsql-performance
On Mon, Jul 5, 2010 at 2:07 AM, Octavio Alvarez
<alvarezp@alvarezp.ods.org> wrote:
> Hello.
>
> I have a tree-like table with a three-field PK (name, date, id) and one
> parent field.
> It has 5k to 6k records as of now, but it will hold about 1 million records.
>
> I am trying the following WITH RECURSIVE query:
>
> WITH RECURSIVE t AS (
>                 SELECT par.id AS tid, par.name, par.date, par.id, par.text,
> par.h_title, par.h_name, par.parent
>                   FROM _books.par
>        UNION
>                 SELECT t.tid AS pid, p.name, p.date, p.id, p.text,
> p.h_title, p.h_name, p.parent
>                   FROM t, _books.par p
>                  WHERE p.name = t.name AND p.date = t.date AND t.id =
> p.parent
>        )
>  SELECT t.tid, t.name, t.date, t.id, t.text, t.h_title, t.h_name, t.parent
>   FROM t WHERE name = 'cfx' AND date = '2009-08-19' AND tid = '28340';
>
> ... which takes 2547.503 ms
>
> However, if I try the same query but adding the same WHERE clause to the
> non-recursive term, I get much better results.
>
>
> WITH RECURSIVE t AS (
>                 SELECT par.id AS tid, par.name, par.date, par.id, par.text,
> par.h_title, par.h_name, par.parent
>                   FROM _books.par WHERE name = 'cfx' AND date = '2009-08-19'
> AND par.id = '28340'
>        UNION
>                 SELECT t.tid AS pid, p.name, p.date, p.id, p.text,
> p.h_title, p.h_name, p.parent
>                   FROM t, _books.par p
>                  WHERE p.name = t.name AND p.date = t.date AND t.id =
> p.parent
>        )
>  SELECT t.tid, t.name, t.date, t.id, t.text, t.h_title, t.h_name, t.parent
>   FROM t WHERE name = 'cfx' AND date = '2009-08-19' AND tid = '28340';
>
> ... which takes 0.221 ms
>
> I am being forced to use the slow query because I want to define it as a
> view, leaving the WHERE clause to the application.

I think this is just a limitation of the optimizer.  Recursive queries
are a relatively new feature and the optimizer doesn't know a whole
lot about how to deal with them.  That may get improved at some point,
but the optimization you're hoping for here is pretty tricky.  In
order to push the WHERE clauses down into the non-recursive term, the
optimizer would need to prove that this doesn't change the final
results.  I think that's possible here because it so happens that your
recursive term only generates results that have the same name, date,
and tid as some existing result, but with a slightly different
recursive query that wouldn't be true, so you'd need to make the code
pretty smart to work this one out.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

В списке pgsql-performance по дате отправления:

Предыдущее
От: Robert Haas
Дата:
Сообщение: Re: Highly Efficient Custom Sorting
Следующее
От: Robert Haas
Дата:
Сообщение: Re: Question about partitioned query behavior