Re: Enforcing Parameterised Nested Loop Join Order for Foreign Table Joins

Поиск
Список
Период
Сортировка
От Adam Zegelin
Тема Re: Enforcing Parameterised Nested Loop Join Order for Foreign Table Joins
Дата
Msg-id F3D61D18-DFC7-44ED-A370-92A8730FBAD3@relational.io
обсуждение исходный текст
Ответ на Re: Enforcing Parameterised Nested Loop Join Order for Foreign Table Joins  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: Enforcing Parameterised Nested Loop Join Order for Foreign Table Joins
Список pgsql-general
Tom,

Thank you for your prompt reply. Your advice has pointed me in the right direction.

I now have the wrapper identifying columns that are inputs to the web service, and thus parameterisable. The
ec_classes,left_join_clauses and right_join_clauses trees are scanned for Var exprs that match these attributes. If
theyare present, the relid is added to the required list of outer rels for the path -- this is done as an extension to
thelogic I posted previously. 

In all cases this seems to work, except one. A join between 3 tables. The foreign table has 2 parameterised columns,
eachgiven a restriction based on one of the other two tables: 

    adam=# explain select * from l1, l2, foreign1 where foreign1.a = l1.a and foreign1.b = l2.a;
                                                 QUERY PLAN
    ----------------------------------------------------------------------------------------------------
     Merge Join  (cost=5000704.96..5001278.44 rows=37822 width=168)
       Merge Cond: (l2.a = foreign1.b)
       ->  Sort  (cost=85.43..88.50 rows=1230 width=36)
             Sort Key: l2.a
             ->  Seq Scan on l2  (cost=0.00..22.30 rows=1230 width=36)
       ->  Sort  (cost=5000619.54..5000634.91 rows=6150 width=132)
             Sort Key: foreign1.b
             ->  Merge Join  (cost=5000135.26..5000232.51 rows=6150 width=132)
                   Merge Cond: (foreign1.a = l1.a)
                   ->  Sort  (cost=5000049.83..5000052.33 rows=1000 width=96)
                         Sort Key: foreign1.a
                         ->  Foreign Scan on foreign1  (cost=5000000.00..5000000.00 rows=1000 width=96)
                   ->  Sort  (cost=85.43..88.50 rows=1230 width=36)
                         Sort Key: l1.a
                         ->  Seq Scan on l1  (cost=0.00..22.30 rows=1230 width=36)

My path generation logic seems to work:

baserel->cheapest_parameterized_paths = (
   {FOREIGNPATH
   :pathtype 120
   :parent_relids (b 3)
   :required_outer (b 1 2)
   :rows 500
   :startup_cost 0.00
   :total_cost 0.00
   :pathkeys <>
   :fdw_private <>
   }
   {FOREIGNPATH
   :pathtype 120
   :parent_relids (b 3)
   :required_outer (b)
   :rows 1000
   :startup_cost 5000000.00
   :total_cost 5000000.00
   :pathkeys <>
   :fdw_private <>
   }
)

Yet the planner picks the non-parameterised path:

ForeignPath* best_path = {FOREIGNPATH
   :pathtype 120
   :parent_relids (b 3)
   :required_outer (b)
   :rows 1000
   :startup_cost 5000000.00
   :total_cost 5000000.00
   :pathkeys <>
   :fdw_private <>
   }

I’ve tried adjusting planner tuneables to disable all join types except nested loop, and setting `join_collapse_limit`
to1 with no desirable outcome. 

Yet, adding a restriction clause between the other two tables forces them to be scanned first:

    adam=# explain select * from l1, l2, foreign1 where foreign1.a = l1.a and foreign1.b = l2.a and l1.b > l2.b;
                                   QUERY PLAN
    -------------------------------------------------------------------------
     Nested Loop  (cost=0.00..2544241.17 rows=12608 width=168)
       ->  Nested Loop  (cost=0.00..22741.17 rows=504300 width=72)
             Join Filter: (l1.b > l2.b)
             ->  Seq Scan on l1  (cost=0.00..22.30 rows=1230 width=36)
             ->  Materialize  (cost=0.00..28.45 rows=1230 width=36)
                   ->  Seq Scan on l2  (cost=0.00..22.30 rows=1230 width=36)
       ->  Foreign Scan on foreign1  (cost=0.00..0.00 rows=500 width=96)
             Filter: ((a = l1.a) AND (b = l2.a))


ForeignPath* best_path = {FOREIGNPATH
   :pathtype 120
   :parent_relids (b 3)
   :required_outer (b 1 2)
   :rows 500
   :startup_cost 0.00
   :total_cost 0.00
   :pathkeys <>
   :fdw_private <>
   }


On 18/03/2013, at 4:09 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Adam Zegelin <adam@relational.io> writes:
>> Some service endpoints have no concept of unqualified queries. In the example above, a ‘sequence scan’ of Bing is a
notpossible. 
>
> In that case, you shouldn't be generating such a path.  But keep in mind
> that this may lead to failure to produce any plan at all for some
> queries.  If the foreign data source is really so broken that it can't
> do that, then you have little choice ... but you shouldn't be thinking
> of that as anything but a broken design decision on their part.

I tried adding a condition that would prevent the non-parameterised path from being generated if the service only
supportedparameterised scans. Postgres refuses to generate a plan: "ERROR:  could not devise a query plan for the given
query".I did a bit of digging and this error is generated by pathnode.c:set_cheapest . As there is no non-parameterised
`cheapest_total_path`the error is raised (line 253). 

For now, I just add an expensive non-pramerterised path and let the FDW throw an error if no qual is found involving
therequired columns. 

Regards,
Adam

Вложения

В списке pgsql-general по дате отправления:

Предыдущее
От: Christophe Pettus
Дата:
Сообщение: .backup file documentation
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Roadmap for Postgres on AIX