Re: Implement predicate propagation for non-equivalence clauses

From Richard Guo
Subject Re: Implement predicate propagation for non-equivalence clauses
Date
Msg-id CAN_9JTx-A0JkRBzuDprLbfft0-gFSwLwnO6j02HPjdct7hfi1A@mail.gmail.com
In response to Re: Implement predicate propagation for non-equivalence clauses  (Heikki Linnakangas <hlinnaka@iki.fi>)
List pgsql-hackers

On Wed, Sep 5, 2018 at 2:55 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
On 05/09/18 09:34, Richard Guo wrote:
Hi,

As we know, the current planner generates additional restriction clauses from
equivalence clauses. This generally lowers the total cost because some
tuples may be filtered out before joins.

In this patch, we try to make a similar deduction from non-equivalence
clauses: A=B AND f(A) implies A=B AND f(A) AND f(B), under some
restrictions on f.
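
To make the deduction concrete, here is a minimal Python sketch (a toy model, not planner code). The tables `a` and `b`, the join key `x`, and the predicate `f` are all hypothetical. It checks that filtering both join inputs with `f` gives the same rows as joining first and filtering afterwards, provided `f` depends only on the joined value.

```python
# Toy model: each row is a dict, and the join is an equi-join on key "x".
def join(left, right):
    return [(l, r) for l in left for r in right if l["x"] == r["x"]]

def f(v):
    # Hypothetical restriction: a plain comparison on the join key.
    return v < 10

a = [{"x": v} for v in (1, 5, 12, 20)]
b = [{"x": v} for v in (5, 5, 12, 30)]

# Original form: A = B AND f(A), with f applied after the join.
baseline = [(l, r) for (l, r) in join(a, b) if f(l["x"])]

# Propagated form: f(A) and the implied f(B) are applied before the join,
# so the join sees fewer rows on both sides.
pushed = join([l for l in a if f(l["x"])],
              [r for r in b if f(r["x"])])

same_result = (baseline == pushed)
```

In this toy data the filtered inputs have two rows each instead of four, which is the effect the patch hopes to get from the implied qual.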

I haven't read the patch in detail, but that really only applies under special circumstances. Tom caught me making that assumption just recently (https://www.postgresql.org/message-id/8003.1527092720%40sss.pgh.pa.us). I think the restriction here is that f(x) must be an operator that's in the same operator family as the = operator. In a quick read-through, it's not clear to me what conditions are in the patch now. Please have a comment somewhere to list them explicitly.

Right. Above all, the operator in f(x) must be in the same opfamily as the equivalence class. We neglected that in this patch, and it would result in wrong plans. In addition, f(x) must not contain volatile functions or subplans. I will address this in v2 and list the conditions in a comment. Thanks!
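
Why the restriction matters can be shown with a small Python analogy (this is not PostgreSQL code, and the predicate `f` is hypothetical). Two values can compare equal while still being distinguishable by a function that looks at something the equality operator ignores, so A=B alone does not guarantee f(A) = f(B).

```python
from decimal import Decimal

a = Decimal("1.0")
b = Decimal("1.00")

def f(v):
    # Hypothetical predicate that inspects the textual form, which the
    # equality comparison does not take into account.
    return str(v) == "1.0"

def g(v):
    # Hypothetical predicate built only on an ordering comparison that is
    # consistent with the equality above.
    return v < 2

equal = (a == b)           # the values compare equal
f_a, f_b = f(a), f(b)      # but f distinguishes them
g_a, g_b = g(a), g(b)      # while g treats them alike
```

Propagating `f` from one side of the equality to the other would wrongly filter rows here, whereas propagating `g` would be safe in this toy setting.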



This patch will introduce extra cost for the relation scan, due to the
cost of evaluating the new implied quals. Meanwhile, since the extra
filter may reduce the number of tuples returned by the scan, it may
lower the cost of subsequent joins. So whether we get a better
plan depends on the selectivity of the implied quals.

Perhaps we should evaluate the selectivity of the clause, and only add it if it seems helpful, based on the cost vs. selectivity?
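
The trade-off can be sketched as a simple break-even check. This is an illustrative toy formula, not the planner's cost model: the function name, its parameters, and the numbers in the usage lines are all assumptions made for the example.

```python
def implied_qual_is_worthwhile(rows, qual_cost, selectivity, downstream_cost):
    """Toy break-even test for adding an implied qual to a scan.

    rows            -- tuples the scan would return without the new qual
    qual_cost       -- assumed cost of evaluating the qual on one tuple
    selectivity     -- estimated fraction of tuples that pass the qual
    downstream_cost -- assumed cost each surviving tuple adds to later joins
    """
    extra = rows * qual_cost                              # paid for every tuple
    saved = rows * (1.0 - selectivity) * downstream_cost  # tuples filtered out
    return saved > extra

# A selective qual pays for itself; a qual that filters nothing never does.
selective = implied_qual_is_worthwhile(1000, 0.0025, 0.1, 0.01)
useless = implied_qual_is_worthwhile(1000, 0.0025, 1.0, 0.01)
weak = implied_qual_is_worthwhile(1000, 0.0025, 0.9, 0.01)
```

With these made-up numbers only the first case would add the qual. A real implementation would have to take the selectivity and cost estimates from the planner itself.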

At least in this case from the regression tests:

 explain (costs off)
   select * from ec0 a, ec1 b
   where a.ff = b.ff and a.ff = 43::bigint::int8alias1;
-                 QUERY PLAN                  
----------------------------------------------
+                              QUERY PLAN                              
+----------------------------------------------------------------------
  Nested Loop
    ->  Index Scan using ec0_pkey on ec0 a
          Index Cond: (ff = '43'::int8alias1)
    ->  Index Scan using ec1_pkey on ec1 b
          Index Cond: (ff = a.ff)
-         Filter: (f1 < '5'::int8alias1)
+         Filter: ((f1 < '5'::int8alias1) AND (ff = '43'::int8alias1))
 (6 rows)

the new qual is redundant with the Index Condition. If we could avoid generating such redundant quals, that would be good.
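
One possible way to detect this kind of redundancy, sketched in Python under strong assumptions: treat each enforced equality as an edge, and skip a derived equality whose two sides are already connected. The union-find helpers and the string encoding of expressions are hypothetical, and the sketch assumes all equalities use compatible operators, which a real check would have to verify.

```python
def find(parent, x):
    # Return the representative of x's group, creating a singleton if needed.
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving
        x = parent[x]
    return x

def union(parent, x, y):
    # Merge the groups containing x and y.
    parent[find(parent, x)] = find(parent, y)

def is_redundant(enforced, candidate):
    """True if the candidate equality already follows from enforced ones."""
    parent = {}
    for lhs, rhs in enforced:
        union(parent, lhs, rhs)
    lhs, rhs = candidate
    return find(parent, lhs) == find(parent, rhs)

# Equalities already enforced in the plan above: the outer index condition
# (a.ff = 43) and the inner index condition (b.ff = a.ff).
enforced = [("a.ff", "43"), ("b.ff", "a.ff")]

redundant = is_redundant(enforced, ("b.ff", "43"))      # derived filter qual
not_redundant = is_redundant(enforced, ("b.f1", "43"))  # unrelated column
```

Here the derived `b.ff = 43` is flagged as redundant because it follows from the two index conditions, while a qual on an unrelated column is kept.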

Good point. I am not sure how complex it would be to evaluate the selectivity of the new qual before applying it, but it deserves a try.



- Heikki
