PSQL does not remove obvious useless joins

Поиск
Список
Период
Сортировка
От Sfiligoi, Igor
Тема PSQL does not remove obvious useless joins
Дата
Msg-id 11675d0d21d14c2898d79736eb07279e@ASGEXCPWP06.ga.com
обсуждение исходный текст
Ответы Re: PSQL does not remove obvious useless joins  (Merlin Moncure <mmoncure@gmail.com>)
Re: PSQL does not remove obvious useless joins  (Kevin Grittner <kgrittn@gmail.com>)
Список pgsql-general

Hello.

 

We have a view that is very generic, and we noticed that PostgreSQL is not very good at removing useless joins, which makes our queries very slow.

We could change our code to avoid the view and write ad-hoc queries to the underlying tables, but would prefer not to, if there is a way around it.

(BTW: We are currently using psql 9.4)

 

Here is a simplified implementation:

# create table a (id int primary key, name varchar(128));

# create table b (id int primary key, name varchar(128));

# create table c (id int primary key, a_id int references a(id), b1_id int references b(id), b2_id int references b(id), b3_id int references b(id));

# create view v as select c.id, c.a_id, c.b1_id, c.b2_id , c.b3_id, a.name a_name, b1.name b1_name, b2.name b2_name, b3.name b3_name from c,  a, b b1, b b2, b b3 where c.a_id=a.id and c.b1_id=b1.id and c.b2_id=b2.id and c.b3_id=b3.id;

 

When I try to get just info from tables c and b1:

# select id, b1_name from v

it still does all the joins (see below).

 

I would expect just one join (due to the request of columns from the two tables),

since all joins are on foreign constrains referencing primary keys,

there are no filters on the other tables,
so it is guaranteed that the useless joins will always return exactly one answer.

 

Is there a way to tweak the PostgreSQL optimizer to do the proper join removal during the planning?

Perhaps tweaking somehow either our schema or our queries (while still keeping a generic view)?

 

Thank you,

  Igor Sfiligoi

 

 

 

# explain select id, b1_name from v;

                                   QUERY PLAN                                  

--------------------------------------------------------------------------------

Nested Loop  (cost=1.02..5.45 rows=1 width=6)

   Join Filter: (c.b3_id = b3.id)

   ->  Nested Loop  (cost=1.02..4.32 rows=1 width=10)

         Join Filter: (c.a_id = a.id)

         ->  Nested Loop  (cost=1.02..3.25 rows=1 width=14)

               Join Filter: (c.b2_id = b2.id)

               ->  Hash Join  (cost=1.02..2.12 rows=1 width=18)

                     Hash Cond: (b1.id = c.b1_id)

                     ->  Seq Scan on b b1  (cost=0.00..1.06 rows=6 width=6)

                     ->  Hash  (cost=1.01..1.01 rows=1 width=20)

                           ->  Seq Scan on c  (cost=0.00..1.01 rows=1 width=20)

               ->  Seq Scan on b b2  (cost=0.00..1.06 rows=6 width=4)

         ->  Seq Scan on a  (cost=0.00..1.03 rows=3 width=4)

   ->  Seq Scan on b b3  (cost=0.00..1.06 rows=6 width=4)

(14 rows)

 

PS: The tables were very small in this example, but are quite big in the production environment.

 

В списке pgsql-general по дате отправления:

Предыдущее
От: Chris Lewis
Дата:
Сообщение: Re: Log archiving failing. Seems to be wrong timeline
Следующее
От: shankha
Дата:
Сообщение: Re: Update multiple rows in a table with different values