Re: Allowing join removals for more join types

Поиск
Список
Период
Сортировка
От David Rowley
Тема Re: Allowing join removals for more join types
Дата
Msg-id CAApHDvp6j4Umgw7eywKbBwcR+1s=r+df6EoukSPQ63NhFJzBkg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Allowing join removals for more join types  (Dilip kumar <dilip.kumar@huawei.com>)
Ответы Re: Allowing join removals for more join types
Список pgsql-hackers

On Mon, May 19, 2014 at 5:47 PM, Dilip kumar <dilip.kumar@huawei.com> wrote:

On 18 May 2014 16:38 David Rowley Wrote

 

Sound like a good idea to me..

 

I have one doubt regarding the implementation, consider the below query

 

Create table t1 (a int, b int);

Create table t2 (a int, b int);

 

Create unique index on t2(b);

 

select x.a from t1 x left join (select distinct t2.a a1, t2.b b1 from t2) as y on x.a=y.b1;  (because of distinct clause subquery will not be pulled up)

 

In this case, Distinct clause is used on t2.a, but t2.b is used for left Join (t2.b have unique index so this left join can be removed).

 

So I think now when you are considering this join removal for subqueries then this can consider other case also like unique index inside subquery,

because in attached patch  unique index is considered only if its RTE_RELATION

 

+          if (innerrel->rtekind == RTE_RELATION &&

+                      relation_has_unique_index_for(root, innerrel, clause_list, NIL, NIL))

                       return true;

 

 

Correct me if I am missing something..


I think you are right here, it would be correct to remove that join, but I also think that the query in question could be quite easily be written as:

select t1.a from t1 left join t2 on t1.a=t2.b;

Where the join WILL be removed. The distinct clause here technically is a no-op due to all the columns of a unique index being present in the clause. Can you think of a use case for this where the sub query couldn't have been written out as a direct join to the relation?

What would be the reason to make it a sub query with the distinct? or have I gotten something wrong here?

I'm also thinking here that if we made the join removal code remove these joins, then the join removal code would end up smarter than the rest of the code as the current code seems not to remove the distinct clause for single table queries where a subset of the columns of a distinct clause match all the columns of a unique index.

create table pktest (id int primary key);
explain (costs off) select distinct id from pktest;
        QUERY PLAN
--------------------------
 HashAggregate
   Group Key: id
   ->  Seq Scan on pktest
 
This could have been rewritten to become: select id from pktest

I feel if we did that sort of optimisation to the join removals, then I'd guess we'd better put it into other parts of the code too, perhaps something that could do this should be in the re-writer then once the join removal code gets to it, the join could be removed.

Can you think of a similar example where the subquery could not have been written as a direct join to the relation?

Regards

David Rowley


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Dilip kumar
Дата:
Сообщение: Re: Allowing join removals for more join types
Следующее
От: Fujii Masao
Дата:
Сообщение: Re: Re: [GENERAL] Is it typo in pg_stat_replication column name in PG 9.4 ?