is there any way we can push join predicate into inner table

Поиск
Список
Период
Сортировка
От 挨踢人
Тема is there any way we can push join predicate into inner table
Дата
Msg-id tencent_38B35CB76BAA9EC85B7BCF20@qq.com
обсуждение исходный текст
Ответы Re: is there any way we can push join predicate into inner table
Список pgsql-performance
hi,We are trying to use PG FDW to read data from external tables.
We’d like to use PG as a SQL parser and push predicates into backend NoSQL databases.
It works well for single table queries. PG FDW is able to push all required predicates into backend database.
However, things go odd after introducing join operation. For example, if we use TPCH as a test case, with normal local tables and foreign tables:
tpch=# \d                                                                                                                                                                                                                                                                                                                                                                                                                              
               List of relations                                                                                                                                                                                                                                                                                                                                                                                                       
 Schema |    Name    |     Type      |  Owner                                                                                                                                                                                                                                                                                                                                                                                          
--------+------------+---------------+----------                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
 public | lineitem   | foreign table | sdbadmin                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
 public | orders     | foreign table | sdbadmin                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
 public | t_lineitem | table         | sdbadmin                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
 public | t_orders   | table         | sdbadmin                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           
(4 rows)                                                                                                                                                                                                                                                                                                                                                                                                                              
 
The access plan for local table access works great. We can see that it use NL Join and pushed join predicate into inner table since there’s index exist (Index Cond: (l_orderkey = t_orders.o_orderkey)):                                                                                                                                                                                                                                                                                                                                                                                                                                       
tpch=#                                                                                                                                                                                                                                                                                                                                                                                                                                 
Explain analyze select count(*) from( select o_orderkey from orders where o_custkey=28547 ) AS T, lineitem as l where T.o_orderkey=l.l_orderkey;                                                                                                                                                                                                                                                                                       
                                                                      QUERY PLAN                                                                                                                                                                                                                                                                                                                                                       
------------------------------------------------------------------------------------------------------------------------------------------------------                                                                                                                                                                                                                                                                                 
 Aggregate  (cost=5575229.23..5575229.24 rows=1 width=0) (actual time=67169.270..67169.271 rows=1 loops=1)                                                                                                                                                                                                                                                                                                                             
   ->  Merge Join  (cost=1621891.36..5012615.33 rows=225045562 width=0) (actual time=64543.730..67169.247 rows=31 loops=1)                                                                                                                                                                                                                                                                                                             
         Merge Cond: (orders.o_orderkey = l.l_orderkey)                                                                                                                                                                                                                                                                                                                                                                                
         ->  Sort  (cost=79230.48..79249.23 rows=7500 width=4) (actual time=0.427..0.433 rows=7 loops=1)                                                                                                                                                                                                                                                                                                                               
               Sort Key: orders.o_orderkey                                                                                                                                                                                                                                                                                                                                                                                             
               Sort Method: quicksort  Memory: 25kB                                                                                                                                                                                                                                                                                                                                                                                    
               ->  Foreign Scan on orders  (cost=0.00..78747.75 rows=7500 width=4) (actual time=0.217..0.364 rows=7 loops=1)                                                                                                                                                                                                                                                                                                           
                     Filter: (o_custkey = 28547)                                                                                                                                                                                                                                                                                                                                                                                       
                     Foreign Namespace: tpch.orders                                                                                                                                                                                                                                                                                                                                                                                    
         ->  Materialize  (cost=1542660.89..1572666.96 rows=6001215 width=4) (actual time=64543.249..66281.325 rows=4028971 loops=1)                                                                                                                                                                                                                                                                                                   
               ->  Sort  (cost=1542660.89..1557663.92 rows=6001215 width=4) (actual time=64543.234..65404.116 rows=4028971 loops=1)                                                                                                                                                                                                                                                                                                    
                     Sort Key: l.l_orderkey                                                                                                                                                                                                                                                                                                                                                                                            
                     Sort Method: external sort  Disk: 82136kB                                                                                                                                                                                                                                                                                                                                                                         
                     ->  Foreign Scan on lineitem l  (cost=0.00..620867.90 rows=6001215 width=4) (actual time=35.438..50376.884 rows=6001215 loops=1)                                                                                                                                                                                                                                                                                  
                           Foreign Namespace: tpch.lineitem                                                                                                                                                                                                                                                                                                                                                                            
 Total runtime: 67191.867 ms                                                                                                                                                                                                                                                                                                                                                                                                           
(16 rows)                                                                                                                                                                                                                                                                                                                                                                                                                              
However, if we use foreign tables, since PG doesn’t know whether index exist in the inner table, it has to perform merge join and fetch everything from foreign table, which takes forever.                                                                                                                                                                                                                                                                                                                                                                                                                                       
tpch=#                                                                                                                                                                                                                                                                                                                                                                                                                                 
Explain analyze select count(*) from( select o_orderkey from t_orders where o_custkey=28547 ) AS T, t_lineitem as l where T.o_orderkey=l.l_orderkey;                                                                                                                                                                                                                                                                                   
                                                                   QUERY PLAN                                                                                                                                                                                                                                                                                                                                                          
------------------------------------------------------------------------------------------------------------------------------------------------                                                                                                                                                                                                                                                                                       
 Aggregate  (cost=223.15..223.16 rows=1 width=0) (actual time=255.385..255.385 rows=1 loops=1)                                                                                                                                                                                                                                                                                                                                         
   ->  Nested Loop  (cost=4.99..222.98 rows=68 width=0) (actual time=48.397..255.356 rows=31 loops=1)                                                                                                                                                                                                                                                                                                                                  
         ->  Bitmap Heap Scan on t_orders  (cost=4.56..71.47 rows=17 width=4) (actual time=24.567..54.906 rows=7 loops=1)                                                                                                                                                                                                                                                                                                              
               Recheck Cond: (o_custkey = 28547)                                                                                                                                                                                                                                                                                                                                                                                       
               ->  Bitmap Index Scan on fk_t_orders  (cost=0.00..4.56 rows=17 width=0) (actual time=24.551..24.551 rows=7 loops=1)                                                                                                                                                                                                                                                                                                     
                     Index Cond: (o_custkey = 28547)                                                                                                                                                                                                                                                                                                                                                                                   
         ->  Index Only Scan using pk_t_lineitem on t_lineitem l  (cost=0.43..8.75 rows=16 width=4) (actual time=28.618..28.624 rows=4 loops=7)                                                                                                                                                                                                                                                                                        
               Index Cond: (l_orderkey = t_orders.o_orderkey)                                                                                                                                                                                                                                                                                                                                                                          
               Heap Fetches: 31                                                                                                                                                                                                                                                                                                                                                                                                        
 Total runtime: 255.489 ms                                                                                                                                                                                                                                                                                                                                                                                                             
(10 rows)                                                                                                                                                                                                                                                                                                                                                                                                                              
So, the question is, is there any way we can push join predicate into inner table ( we can disable merge join and hash join to get NL Join, but join predicate is not able to push into inner table )?
Thanks                                                                                                                                                                                                                                                                                                                                                                                                             

В списке pgsql-performance по дате отправления:

Предыдущее
От: Henrik Thostrup Jensen
Дата:
Сообщение: Re: Gist indexing performance with cidr types
Следующее
От: Tom Lane
Дата:
Сообщение: Re: is there any way we can push join predicate into inner table