Re: [PATCH] Keeps tracking the uniqueness with UniqueKey

Поиск
Список
Период
Сортировка
От Andy Fan
Тема Re: [PATCH] Keeps tracking the uniqueness with UniqueKey
Дата
Msg-id CAKU4AWpXSryGjHO6Mgs5GcJVQyE725SN9SJujQb4VsLhVRTAQA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [PATCH] Keeps tracking the uniqueness with UniqueKey  (David Rowley <dgrowleyml@gmail.com>)
Ответы Re: [PATCH] Keeps tracking the uniqueness with UniqueKey
Список pgsql-hackers
Hi David:

Thanks for your time. 
 
1. Out of date comment in join.sql

-- join removal is not possible when the GROUP BY contains a column that is
-- not in the join condition.  (Note: as of 9.6, we notice that b.id is a
-- primary key and so drop b.c_id from the GROUP BY of the resulting plan;
-- but this happens too late for join removal in the outer plan level.)
explain (costs off)
select d.* from d left join (select d, c_id from b group by b.d, b.c_id) s
  on d.a = s.d;

You've changed the GROUP BY clause so it does not include b.id, so the
Note in the comment is now misleading.

Thanks, I will fix this one in the following patch. 
 

2. I think 0002 is overly restrictive in its demands that
parse->hasAggs must be false. We should be able to just use a Group
Aggregate with unsorted input when the input_rel is unique on the
GROUP BY clause.  This will save on hashing and sorting.  Basically
similar to what we do for when a query contains aggregates without any
GROUP BY.


Yes,  This will be a perfect result,  the difficult is the current aggregation function
execution is highly coupled with Agg node(ExecInitAgg) which is removed in the
unique case.  I ever make the sum (w/o finalfn) and avg(with finalfn)
works in a hack way, but still many stuffs is not handled.  Let me prepare the code
for this purpose in 1~2  days to see if I'm going with the right direction. 

Ashutosh also has an idea[1] that if the relation underlying an Agg node is 
known to be unique for given groupByClause, we could safely use 
AGG_SORTED strategy. Though the input is not ordered, it's sorted thus for every row Agg
node will combine/finalize the aggregate result.  

I will target the perfect result first and see how many effort do we need, if not,
I will try Ashutosh's suggestion. 

 
3. I don't quite understand why you changed this to a right join:

 -- Test case where t1 can be optimized but not t2
 explain (costs off) select t1.*,t2.x,t2.z
-from t1 inner join t2 on t1.a = t2.x and t1.b = t2.y
+from t1 right join t2 on t1.a = t2.x and t1.b = t2.y

Perhaps this change is left over from some previous version of the patch?

This is on purpose.   the original test case is used to test we can short
the group key for t1 but not t2 for aggregation, but if I keep the inner join, the 
aggnode will be removed totally, so I have to change it to right join in order
to keep the aggnode.  The full test case is:

-- Test case where t1 can be optimized but not t2

explain (costs off) select t1.*,t2.x,t2.z

from t1 inner join t2 on t1.a = t2.x and t1.b = t2.y

group by t1.a,t1.b,t1.c,t1.d,t2.x,t2.z;


where (a, b) is the primary key of t1. 



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: wrong relkind error messages
Следующее
От: Mark Dilger
Дата:
Сообщение: Re: Perl modules for testing/viewing/corrupting/repairing your heapfiles