Re: Patch to support SEMI and ANTI join removal

Поиск
Список
Период
Сортировка
От Simon Riggs
Тема Re: Patch to support SEMI and ANTI join removal
Дата
Msg-id CA+U5nM+S5uwN_xG5U4e049J3JB6ZhEhSm143f1SLreEnvunmAg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Patch to support SEMI and ANTI join removal  (David Rowley <dgrowleyml@gmail.com>)
Ответы Re: Patch to support SEMI and ANTI join removal  (David Rowley <dgrowleyml@gmail.com>)
Список pgsql-hackers
On 15 October 2014 11:03, David Rowley <dgrowleyml@gmail.com> wrote:

> The explain analyze from the above query looks like:
> test=# explain (analyze, costs off, timing off) select count(*) from t1
> inner join t2 on t1.t2_id=t2.id;
>                             QUERY PLAN
> ------------------------------------------------------------------
>  Aggregate (actual rows=1 loops=1)
>    ->  Nested Loop (actual rows=1000000 loops=1)
>          ->  Seq Scan on t1 (actual rows=1000000 loops=1)
>          ->  Index Only Scan using t2_pkey on t2 (never executed)
>                Index Cond: (id = t1.t2_id)
>                Heap Fetches: 0
>  Execution time: 124.990 ms
> (7 rows)
>
> As you can see the scan on t2 never occurred.

Very good, happy to see this happening (yay FKs!) and with
PostgreSQL-style rigour.

I've reviewed the patch from cold and I have a few comments.

The plan you end up with here works quite differently from left outer
join removal, where the join is simply absent. That inconsistency
causes most of the other problems I see.

I propose that we keep track of whether there are any potentially
skippable joins at the top of the plan. When we begin execution we do
a single if test to see if there is run-time work to do. If we pass
the run-time tests we then descend the tree and prune the plan to
completely remove unnecessary nodes. We end with an EXPLAIN and
EXPLAIN ANALYZE that looks like this

>                             QUERY PLAN
> ------------------------------------------------------------------
>  Aggregate (actual rows=1 loops=1)
>          ->  Seq Scan on t1 (actual rows=1000000 loops=1)

Doing that removes all the overheads and complexity; it also matches
how join removal currently works.

The alternative is accepting some pretty horrible additional code in
most join types, plus a small regression on nested loop joins which I
would have to call out as regrettably unacceptable. (Horrible in this
sense that we don't want that code, not that David's code is poor).

The tests on the patch are pretty poor. If we should use EXPLAINs to
show a join removal that works and a join removal that fails. With a
few of the main permutations.

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Fabrízio de Royes Mello
Дата:
Сообщение: Re: [GSoC2014] Patch ALTER TABLE ... SET LOGGED
Следующее
От: Simon Riggs
Дата:
Сообщение: Re: 9.5: Better memory accounting, towards memory-bounded HashAgg