Обсуждение: Throwing unnecessary joins away

Поиск
Список
Период
Сортировка

Throwing unnecessary joins away

От
Ottó Havasvölgyi
Дата:
Hi all,
 
Is PostgreSQL able to throw unnecessary joins?
For example I have two tables, and I join then with their primary keys, say type of bigint . In this case if I don't reference to one of the tables anywhere except the join condition, then the join can be eliminated.
Or if I do a "table1 left join table2 (table1.referer=table2.id)"  (N : 1 relationship), and I don't reference table2 anywhere else, then it is unnecessary.
Primary key - primary key joins are often generated by O/R mappers. These generated queries could be optimized even more by not joining if not necessary.
 
You may say that I should not write such queries. The truth is that the O/R mapper is generating queries on views, and it does not use every field every time, but even so the query of the view is executed with the same plan by PostgreSQL, although some joins are unnecessary.
 
So basically this all is relevant only with views.
 
Best Regards,
Otto

Re: Throwing unnecessary joins away

От
Alessandro Baretta
Дата:
Ottó Havasvölgyi wrote:
> Hi all,
>
> Is PostgreSQL able to throw unnecessary joins?
> For example I have two tables, and I join then with their primary keys,
> say type of bigint . In this case if I don't reference to one of the
> tables anywhere except the join condition, then the join can be eliminated.
> Or if I do a "table1 left join table2 (table1.referer=table2.id)"  (N :
> 1 relationship), and I don't reference table2 anywhere else, then it is
> unnecessary.

It cannot possibly remove "unnecessary joins", simply because the join
influences whether a tuple in the referenced table gets selected and how many times.

Alex


--
*********************************************************************
http://www.barettadeit.com/
Baretta DE&IT
A division of Baretta SRL

tel. +39 02 370 111 55
fax. +39 02 370 111 54

Our technology:

The Application System/Xcaml (AS/Xcaml)
<http://www.asxcaml.org/>

The FreerP Project
<http://www.freerp.org/>

Re: Throwing unnecessary joins away

От
Ottó Havasvölgyi
Дата:
Hi,
As far as I know SQL Server has some similar feature. It does not join
if not necessary, more exactly: if the result would be the same if it
joined the table.
Here is another example:
http://www.ianywhere.com/developer/product_manuals/sqlanywhere/0902/en/html/dbugen9/00000468.htm
This would be a fantastic feature.
Best Regards,
Otto


2006/1/12, Alessandro Baretta <a.baretta@barettadeit.com>:
> Ottó Havasvölgyi wrote:
> > Hi all,
> >
> > Is PostgreSQL able to throw unnecessary joins?
> > For example I have two tables, and I join then with their primary keys,
> > say type of bigint . In this case if I don't reference to one of the
> > tables anywhere except the join condition, then the join can be eliminated.
> > Or if I do a "table1 left join table2 (table1.referer=table2.id)"  (N :
> > 1 relationship), and I don't reference table2 anywhere else, then it is
> > unnecessary.
>
> It cannot possibly remove "unnecessary joins", simply because the join
> influences whether a tuple in the referenced table gets selected and how many times.
>
> Alex
>
>
> --
> *********************************************************************
> http://www.barettadeit.com/
> Baretta DE&IT
> A division of Baretta SRL
>
> tel. +39 02 370 111 55
> fax. +39 02 370 111 54
>
> Our technology:
>
> The Application System/Xcaml (AS/Xcaml)
> <http://www.asxcaml.org/>
>
> The FreerP Project
> <http://www.freerp.org/>
>

Re: Throwing unnecessary joins away

От
Tom Lane
Дата:
=?ISO-8859-1?Q?Ott=F3_Havasv=F6lgyi?= <havasvolgyi.otto@gmail.com> writes:
> As far as I know SQL Server has some similar feature. It does not join
> if not necessary, more exactly: if the result would be the same if it
> joined the table.

I find it really really hard to believe that such cases arise often
enough to justify having the planner spend cycles checking for them.

            regards, tom lane

Re: Throwing unnecessary joins away

От
Szűcs Gábor
Дата:
Dear Tom,

Not sure about Otto's exact problem, but he did mention views, and I'd feel
more comfortable if you told me that view-based queries are re-planned based
on actual conditions etc. Are they?

Also, if you find it unlikely (or very rare) then it might be a configurable
parameter. If someone finds it drastically improving (some of) their
queries, it'd be possible to enable this feature in expense of extra planner
cycles (on all queries).

What I'd be concerned about, is whether the developers' time spent on this
feature would worth it. :)

--
G.


On 2006.01.12. 16:53, Tom Lane wrote:
> =?ISO-8859-1?Q?Ott=F3_Havasv=F6lgyi?= <havasvolgyi.otto@gmail.com> writes:
>> As far as I know SQL Server has some similar feature. It does not join
>> if not necessary, more exactly: if the result would be the same if it
>> joined the table.
>
> I find it really really hard to believe that such cases arise often
> enough to justify having the planner spend cycles checking for them.
>
>             regards, tom lane


Re: Throwing unnecessary joins away

От
Ottó Havasvölgyi
Дата:
Hi,

I think it would be sufficient only for views. In other cases the
programmer can optimize himself. But a view can be a join of other
tables, and it is not sure that all of them are always needed. It all
depends on what I select from the view.
This information could even be calculted at view creation time. Of
cource this requires that views are handled in a bit more special way,
not just a view definition that is substituted into the original query
(as far as I know the current implementation is similar to this. Sorry
if not).
What do you think about this idea? Of course it is not trivial to
implement, but the result is really great.

Postgres could determine at creation time, if this kind of
optimization is possible at all or not. It can happan though that not
all information is available (I mean unique index or foreign key) at
that time. So this optimiztaion info could be refreshed later by a
command, "ALTER VIEW <viewname> ANALYZE" or "ANALYZE <view name>"
simply.
Postgres could also establish at creation time that for a given column
in the selection list which source table(s) are required. This is
probably not sufficient, but I haven't thought is through thouroughly
yet. And I am not that familiar with the current optimizer internals.
And one should be able to turn off this optimization, so that view
creation takes not longer than now. If the optimizer finds no
optimization info in the catalog, it behaves like now.
I hope you see this worth.
This all is analogue to statistics collection.

Thanks for reading,
Otto


2006/1/12, Tom Lane <tgl@sss.pgh.pa.us>:
> =?ISO-8859-1?Q?Ott=F3_Havasv=F6lgyi?= <havasvolgyi.otto@gmail.com> writes:
> > As far as I know SQL Server has some similar feature. It does not join
> > if not necessary, more exactly: if the result would be the same if it
> > joined the table.
>
> I find it really really hard to believe that such cases arise often
> enough to justify having the planner spend cycles checking for them.
>
>                        regards, tom lane
>

Re: Throwing unnecessary joins away

От
Scott Marlowe
Дата:
On Thu, 2006-01-12 at 11:00, Ottó Havasvölgyi wrote:
> Hi,
>
> I think it would be sufficient only for views. In other cases the
> programmer can optimize himself. But a view can be a join of other
> tables, and it is not sure that all of them are always needed. It all
> depends on what I select from the view.

The idea that you could throw away joins only works for outer joins.
I.e. if you did:

select a.x, a.y, a.z from a left join b (on a.id=b.aid)

then you could throw away the join to b.  But if it was a regular inner
join then you couldn't know whether or not you needed to join to b
without actually joining to b...

Re: Throwing unnecessary joins away

От
Ottó Havasvölgyi
Дата:
Hi,

If the join is to a primary key or notnull unique column(s), then
inner join is also ok. But of course left join is the simpler case.
An example:

create table person (id serial primary key, name varchar not null);
create table pet (id serial primary key, name varchar not null,
person_id int not null references person(id));
create view v_pet_person as select pet.id as pet_id, pet.name as
pet_name, person_id as person_id, person.name as person_name from pet
join person (pet.person_id=person.id);

At this point we know that optimization may be possible because of the
primary key on person. The optimization depends on the primary key
constraint. Kindof internal dependency.
We can find out that which "from-element" is a given field's source as
far they are simple references. This can be stored.
Then query the view:

select pet_name, person_id from v_pet_person where person_id=2;

In this case we don't need the join.
These queries are usually dynamically generated, the selection list
and the where condition is the dynamic part.

Best Regards,
Otto


2006/1/12, Scott Marlowe <smarlowe@g2switchworks.com>:
> On Thu, 2006-01-12 at 11:00, Ottó Havasvölgyi wrote:
> > Hi,
> >
> > I think it would be sufficient only for views. In other cases the
> > programmer can optimize himself. But a view can be a join of other
> > tables, and it is not sure that all of them are always needed. It all
> > depends on what I select from the view.
>
> The idea that you could throw away joins only works for outer joins.
> I.e. if you did:
>
> select a.x, a.y, a.z from a left join b (on a.id=b.aid)
>
> then you could throw away the join to b.  But if it was a regular inner
> join then you couldn't know whether or not you needed to join to b
> without actually joining to b...
>

Re: Throwing unnecessary joins away

От
"Jim C. Nasby"
Дата:
On Thu, Jan 12, 2006 at 01:35:07PM +0100, Alessandro Baretta wrote:
> Ott? Havasv?lgyi wrote:
> >Hi all,
> >
> >Is PostgreSQL able to throw unnecessary joins?
> >For example I have two tables, and I join then with their primary keys,
> >say type of bigint . In this case if I don't reference to one of the
> >tables anywhere except the join condition, then the join can be eliminated.
> >Or if I do a "table1 left join table2 (table1.referer=table2.id)"  (N :
> >1 relationship), and I don't reference table2 anywhere else, then it is
> >unnecessary.
>
> It cannot possibly remove "unnecessary joins", simply because the join
> influences whether a tuple in the referenced table gets selected and how
> many times.

It can remove them if it's an appropriate outer join, or if there is
appropriate RI that proves that the join won't change what data is
selected.

A really common example of this is creating views that pull in tables
that have text names to go with id's, ie:

CREATE TABLE bug_status(
    bug_status_id       serial  PRIMARY KEY
    , bug_status_name   text    NOT NULL UNIQUE
);

CREATE TABLE bug(
    ...
    , bug_status_id     int     REFERENCES bug_status(bug_status_id)
);

CREATE VIEW bug_v AS
    SELECT b.*, bs.bug_status_name FROM bug b JOIN bug_status NATURAL
;

If you have a bunch of cases like that and start building views on views
it's very easy to end up in situations where you don't have any need of
bug_status_name at all. And because of the RI, you know that removing
the join can't possibly change the bug.* portion of that view.
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: Throwing unnecessary joins away

От
"Jim C. Nasby"
Дата:
On Thu, Jan 12, 2006 at 07:51:22PM +0100, Ott? Havasv?lgyi wrote:
> Hi,
>
> If the join is to a primary key or notnull unique column(s), then
> inner join is also ok. But of course left join is the simpler case.
> An example:

Actually, you need both the unique/pk constraint, and RI (a fact I
missed in the email I just sent). Nullability is another consideration
as well. But there certainly are some pretty common cases that can be
optimized for.
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461