[PATCH] Equivalence Class Filters

Поиск
Список
Период
Сортировка
От David Rowley
Тема [PATCH] Equivalence Class Filters
Дата
Msg-id CAKJS1f9FK_X_5HKcPcSeimy16Owe3EmPmmGsGWLcKkj_rW9s6A@mail.gmail.com
обсуждение исходный текст
Ответы Re: [PATCH] Equivalence Class Filters  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
Hi,

I'd like to gather up what the community interest might be for this patch. Let me explain:

As of today Equivalence Classes are used in the query planner to allow the planner to have more flexibility into the join order by collecting information to describe which expressions are equal to each other. These Equivalence classes, when they contain a constant value also allow predicate push down. For example:

# explain select * from t1 inner join t2 on t1.id = t2.id where t1.id=1;
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Nested Loop  (cost=0.56..12.61 rows=1 width=12)
   ->  Index Scan using t1_pkey on t1  (cost=0.29..8.30 rows=1 width=8)
         Index Cond: (id = 1)
   ->  Index Only Scan using t2_pkey on t2  (cost=0.28..4.29 rows=1 width=4)
         Index Cond: (id = 1)
(5 rows) 

We can see that a qual was added to filter t2.id=1.

As of today these Equivalence Classes only incorporate expressions which use the equality operator, but what if the above query had been written as:

select * from t1 inner join t2 on t1.id = t2.id where t1.id <= 10;

Should we not be able to assume that t2.id is also <= 10? Currently we don't, but the attached patch expands to add what I've called Equivalence Filters to Equivalence Classes.

This allows the above query to produce a plan like:

                                  QUERY PLAN
------------------------------------------------------------------------------
 Merge Join  (cost=0.56..5.68 rows=1 width=12)
   Merge Cond: (t1.id = t2.id)
   ->  Index Scan using t1_pkey on t1  (cost=0.29..8.44 rows=9 width=8)
         Index Cond: (id <= 10)
   ->  Index Only Scan using t2_pkey on t2  (cost=0.28..4.45 rows=10 width=4)
         Index Cond: (id <= 10)

Now, it may not be that common to perform range filters on an id column, but it can be fairly common for date values to be join conditions and have date range filters too. For example in [1] Alexandre claimed a 1100 to 2200 performance improvement after manually pushing the date filter into the subquery. This patch allows this to become automatic.

This all works by simply just collecting OpExprs during building the EquivalanceClasses which have previously been rejected for the eqclass because they don't use an equality operator. OpExprs in the form of "Expr op Const" and "Const op Expr" are collected, and then once the EquivalanceClasses have been build the "Expr" is searched for in the built classes to see if we can find that Expr as a member of a class, if so this then becomes an EquivalenceFilter and gets tagged onto to the EquivalenceClass.

Patch Status - I'm a bit uncertain as to how far we can go with this and if we deem this as a good idea, then we'd need to be careful not to cause any performance regression. For example if the filter was "somestaticfunc(col) < 1234", then we might not want to push that down as somestaticfunc() might be so costly, that it might be better to perform the join with all the rows instead.  For this reason I've limited the current patch to only using Operators which are members of a btree op class. Perhaps we could do more, but likely there's less of a win to be had due to less chance of that qual being useful for an index scan.

Writing this patch has been on my "interesting" list for a while now. I got some time last weekend to finally write it, but due to that happening at the weekend rather than in work time it's a low priority for me to. However there's no sense in it gathering dust, so I'm posting it today.

Is this something that we might want?



--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Pavel Stehule
Дата:
Сообщение: Re: [patch] Proposal for \rotate in psql
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: Re: In-core regression tests for replication, cascading, archiving, PITR, etc.