Обсуждение: JOIN vs. LEFT JOIN

Поиск
Список
Период
Сортировка

JOIN vs. LEFT JOIN

От
"Nico Callewaert"
Дата:
Hi !
 
I heard that it is always better to use a full JOIN rather than a LEFT JOIN, for performance reasons.  Is that correct ?
But it's barely possible to use full JOINS all the time, since most of the lookup fields are not required.
Example : in the customer table, the customer type is a looup field to a customer_type table.  But that is not required to post the record.  So I was thinking to create a record in the customer_type table with ID = -1.  And in case the customer type is NULL, to assign it the value -1.  That way, it will be always possible to do a full JOIN.  I was wondering if that is good practice or rather nonsense ?
 
Many thanks in advance,
Nico

Re: JOIN vs. LEFT JOIN

От
Andreas Wenk
Дата:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Nico Callewaert schrieb:
> Hi !
>
> I heard that it is always better to use a full JOIN rather than a LEFT
> JOIN, for performance reasons.  Is that correct ?

please note that a JOIN and a LEFT JOIN are tow different things. Let's assume you have
two tables (without any sens):

books:

 headline |   content
- ----------+-------------
 politics | russia
 politics | middle east
 politics | germany

interests:

 content
- ---------
 germany

Then fire two select statements like this:

#1:
SELECT a.headline,a.content,b.content as contentb
FROM books a
JOIN interests b ON a.content = b.content;

  headline | content | contentb
- ----------+---------+----------
 politics | germany | germany
(1 row)

#2:
SELECT a.headline,a.content,b.content as contentb
FROM books a
LEFT JOIN interests b on a.content = b.content;
 headline |   content   | contentb
- ----------+-------------+----------
 politics | russia      |
 politics | middle east |
 politics | germany     | germany
(3 rows)

> But it's barely possible to use full JOINS all the time, since most of
> the lookup fields are not required.
> Example : in the customer table, the customer type is a looup field to a
> customer_type table.  But that is not required to post the record.  So I
> was thinking to create a record in the customer_type table with ID =
> -1.  And in case the customer type is NULL, to assign it the value -1.
> That way, it will be always possible to do a full JOIN.  I was wondering
> if that is good practice or rather nonsense ?

Hm concerning the functionality of LEFT JOIN I do not really understand what you wanna do
here. You created the customer_type table to have the possibility to give more types to
one customer (1:n). But therefore you do not need a LEFT JOIN. The statement could be like:

SELECT name a
FROM customer a, cutomer_type b
WHERE a.id = b.id
AND b.type = 'super customer'

Or not?

Cheers

Andy
- --
St.Pauli - Hamburg - Germany

Andreas Wenk
> Many thanks in advance,
> Nico
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFJgD/9Va7znmSP9AwRAsyDAKC8utO2Agy0ONULuy7nIgz9pG/7rgCfa/li
o98EaJSKGqkv2brcd0RcI04=
=de2X
-----END PGP SIGNATURE-----

Re: JOIN vs. LEFT JOIN

От
"Nico Callewaert"
Дата:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Nico Callewaert schrieb:
>> Hi !
>>
>> I heard that it is always better to use a full JOIN rather than a LEFT
>> JOIN, for performance reasons.  Is that correct ?
>
> please note that a JOIN and a LEFT JOIN are tow different things. Let's
> assume you have
> two tables (without any sens):
>
> books:
>
> headline |   content
> - ----------+-------------
> politics | russia
> politics | middle east
> politics | germany
>
> interests:
>
> content
> - ---------
> germany
>
> Then fire two select statements like this:
>
> #1:
> SELECT a.headline,a.content,b.content as contentb
> FROM books a
> JOIN interests b ON a.content = b.content;
>
>  headline | content | contentb
> - ----------+---------+----------
> politics | germany | germany
> (1 row)
>
> #2:
> SELECT a.headline,a.content,b.content as contentb
> FROM books a
> LEFT JOIN interests b on a.content = b.content;
> headline |   content   | contentb
> - ----------+-------------+----------
> politics | russia      |
> politics | middle east |
> politics | germany     | germany
> (3 rows)
>
>> But it's barely possible to use full JOINS all the time, since most of
>> the lookup fields are not required.
>> Example : in the customer table, the customer type is a looup field to a
>> customer_type table.  But that is not required to post the record.  So I
>> was thinking to create a record in the customer_type table with ID =
>> -1.  And in case the customer type is NULL, to assign it the value -1.
>> That way, it will be always possible to do a full JOIN.  I was wondering
>> if that is good practice or rather nonsense ?
>
> Hm concerning the functionality of LEFT JOIN I do not really understand
> what you wanna do
> here. You created the customer_type table to have the possibility to give
> more types to
> one customer (1:n). But therefore you do not need a LEFT JOIN. The
> statement could be like:
>
> SELECT name a
> FROM customer a, cutomer_type b
> WHERE a.id = b.id
> AND b.type = 'super customer'
>
> Or not?
>
> Cheers
>
> Andy


Hi Andreas,

Thanks for the reply.  I know the difference between JOIN and LEFT JOIN.
The question was just if there is a performance hit between the 2 ?

I never use constructs like this : SELECT name a
> FROM customer a, cutomer_type b
> WHERE a.id = b.id
> AND b.type = 'super customer'

Always Joins.
I will try to clarify.  The current situation is : the customer_type has no
value, so = NULL, no problem with LEFT JOIN.
But I'm wondering with tables that has thousands of records, if LEFT JOINS
are performing well ?

Thanks again, Nico



Re: JOIN vs. LEFT JOIN

От
Andreas Wenk
Дата:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Nico Callewaert schrieb:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> Nico Callewaert schrieb:
>>> Hi !
>>>
>>> I heard that it is always better to use a full JOIN rather than a LEFT
>>> JOIN, for performance reasons.  Is that correct ?
>>
>> please note that a JOIN and a LEFT JOIN are tow different things.
>> Let's assume you have
>> two tables (without any sens):
>>
>> books:
>>
>> headline |   content
>> - ----------+-------------
>> politics | russia
>> politics | middle east
>> politics | germany
>>
>> interests:
>>
>> content
>> - ---------
>> germany
>>
>> Then fire two select statements like this:
>>
>> #1:
>> SELECT a.headline,a.content,b.content as contentb
>> FROM books a
>> JOIN interests b ON a.content = b.content;
>>
>>  headline | content | contentb
>> - ----------+---------+----------
>> politics | germany | germany
>> (1 row)
>>
>> #2:
>> SELECT a.headline,a.content,b.content as contentb
>> FROM books a
>> LEFT JOIN interests b on a.content = b.content;
>> headline |   content   | contentb
>> - ----------+-------------+----------
>> politics | russia      |
>> politics | middle east |
>> politics | germany     | germany
>> (3 rows)
>>
>>> But it's barely possible to use full JOINS all the time, since most of
>>> the lookup fields are not required.
>>> Example : in the customer table, the customer type is a looup field to a
>>> customer_type table.  But that is not required to post the record.  So I
>>> was thinking to create a record in the customer_type table with ID =
>>> -1.  And in case the customer type is NULL, to assign it the value -1.
>>> That way, it will be always possible to do a full JOIN.  I was wondering
>>> if that is good practice or rather nonsense ?
>>
>> Hm concerning the functionality of LEFT JOIN I do not really
>> understand what you wanna do
>> here. You created the customer_type table to have the possibility to
>> give more types to
>> one customer (1:n). But therefore you do not need a LEFT JOIN. The
>> statement could be like:
>>
>> SELECT name a
>> FROM customer a, cutomer_type b
>> WHERE a.id = b.id
>> AND b.type = 'super customer'
>>
>> Or not?
>>
>> Cheers
>>
>> Andy
>
>
> Hi Andreas,
>
> Thanks for the reply.  I know the difference between JOIN and LEFT JOIN.
> The question was just if there is a performance hit between the 2 ?
>
> I never use constructs like this : SELECT name a
>> FROM customer a, cutomer_type b
>> WHERE a.id = b.id
>> AND b.type = 'super customer'
>
> Always Joins.
> I will try to clarify.  The current situation is : the customer_type has
> no value, so = NULL, no problem with LEFT JOIN.

Why do you need an empty entry in this table? Your application should take care, that this
does not happen ...

> But I'm wondering with tables that has thousands of records, if LEFT
> JOINS are performing well ?

EXPLAIN ANALYZE with my examples:

#1
test=# explain analyze select a.headline,a.content,b.content as contentb from books a join
interests b on a.content = b.content;
                                                    QUERY PLAN

-
------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=10.88..23.67 rows=70 width=1548) (actual time=0.051..0.058 rows=1 loops=1)
   Hash Cond: (("outer".content)::text = ("inner".content)::text)
   ->  Seq Scan on interests b  (cost=0.00..11.40 rows=140 width=516) (actual
time=0.007..0.010 rows=1 loops=1)
   ->  Hash  (cost=10.70..10.70 rows=70 width=1032) (actual time=0.026..0.026 rows=3 loops=1)
         ->  Seq Scan on books a  (cost=0.00..10.70 rows=70 width=1032) (actual
time=0.004..0.013 rows=3 loops=1)
 Total runtime: 0.094 ms
(6 rows)


#2
test=# explain analyze select a.headline,a.content,b.content as contentb  from books a
left join interests b on a.content = b.content;
                                                      QUERY PLAN

-
----------------------------------------------------------------------------------------------------------------------
 Hash Left Join  (cost=11.75..23.72 rows=70 width=1548) (actual time=0.038..0.062 rows=3
loops=1)
   Hash Cond: (("outer".content)::text = ("inner".content)::text)
   ->  Seq Scan on books a  (cost=0.00..10.70 rows=70 width=1032) (actual
time=0.007..0.014 rows=3 loops=1)
   ->  Hash  (cost=11.40..11.40 rows=140 width=516) (actual time=0.015..0.015 rows=1 loops=1)
         ->  Seq Scan on interests b  (cost=0.00..11.40 rows=140 width=516) (actual
time=0.004..0.007 rows=1 loops=1)
 Total runtime: 0.102 ms
(6 rows)

I fired each statement ten times. But seriously - this is more or less just a hint that a
LEFT JOIN is less performant tahn a JOIN. To get that exactly it depend on which execution
plan the planner is using. So the best way ist to use EXPALIN ANALYZE with your statements.

Sorry for the big post with much output ...

Cheers

Andy

> Thanks again, Nico
>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFJgEWwVa7znmSP9AwRAi6NAKDdU4+b+CRzEUg8WsfN2M2P9CVl0wCg3iJn
14HzK+4/IHfdcVfqY4Jbf1g=
=pNKm
-----END PGP SIGNATURE-----

Re: JOIN vs. LEFT JOIN

От
"Nico Callewaert"
Дата:
>>> Nico Callewaert schrieb:
>>>> Hi !
>>>>
>>>> I heard that it is always better to use a full JOIN rather than a LEFT
>>>> JOIN, for performance reasons.  Is that correct ?
>>>
>>> please note that a JOIN and a LEFT JOIN are tow different things.
>>> Let's assume you have
>>> two tables (without any sens):
>>>
>>> books:
>>>
>>> headline |   content
>>> - ----------+-------------
>>> politics | russia
>>> politics | middle east
>>> politics | germany
>>>
>>> interests:
>>>
>>> content
>>> - ---------
>>> germany
>>>
>>> Then fire two select statements like this:
>>>
>>> #1:
>>> SELECT a.headline,a.content,b.content as contentb
>>> FROM books a
>>> JOIN interests b ON a.content = b.content;
>>>
>>>  headline | content | contentb
>>> - ----------+---------+----------
>>> politics | germany | germany
>>> (1 row)
>>>
>>> #2:
>>> SELECT a.headline,a.content,b.content as contentb
>>> FROM books a
>>> LEFT JOIN interests b on a.content = b.content;
>>> headline |   content   | contentb
>>> - ----------+-------------+----------
>>> politics | russia      |
>>> politics | middle east |
>>> politics | germany     | germany
>>> (3 rows)
>>>
>>>> But it's barely possible to use full JOINS all the time, since most of
>>>> the lookup fields are not required.
>>>> Example : in the customer table, the customer type is a looup field to
>>>> a
>>>> customer_type table.  But that is not required to post the record.  So
>>>> I
>>>> was thinking to create a record in the customer_type table with ID =
>>>> -1.  And in case the customer type is NULL, to assign it the value -1.
>>>> That way, it will be always possible to do a full JOIN.  I was
>>>> wondering
>>>> if that is good practice or rather nonsense ?
>>>
>>> Hm concerning the functionality of LEFT JOIN I do not really
>>> understand what you wanna do
>>> here. You created the customer_type table to have the possibility to
>>> give more types to
>>> one customer (1:n). But therefore you do not need a LEFT JOIN. The
>>> statement could be like:
>>>
>>> SELECT name a
>>> FROM customer a, cutomer_type b
>>> WHERE a.id = b.id
>>> AND b.type = 'super customer'
>>>
>>> Or not?
>>>
>>> Cheers
>>>
>>> Andy
>>
>>
>> Hi Andreas,
>>
>> Thanks for the reply.  I know the difference between JOIN and LEFT JOIN.
>> The question was just if there is a performance hit between the 2 ?
>>
>> I never use constructs like this : SELECT name a
>>> FROM customer a, cutomer_type b
>>> WHERE a.id = b.id
>>> AND b.type = 'super customer'
>>
>> Always Joins.
>> I will try to clarify.  The current situation is : the customer_type has
>> no value, so = NULL, no problem with LEFT JOIN.
>
> Why do you need an empty entry in this table? Your application should take
> care, that this
> does not happen ...
>
>> But I'm wondering with tables that has thousands of records, if LEFT
>> JOINS are performing well ?
>
> EXPLAIN ANALYZE with my examples:
>
> #1
> test=# explain analyze select a.headline,a.content,b.content as contentb
> from books a join
> interests b on a.content = b.content;
>                                                    QUERY PLAN
>
> -
> ------------------------------------------------------------------------------------------------------------------
> Hash Join  (cost=10.88..23.67 rows=70 width=1548) (actual
> time=0.051..0.058 rows=1 loops=1)
>   Hash Cond: (("outer".content)::text = ("inner".content)::text)
>   ->  Seq Scan on interests b  (cost=0.00..11.40 rows=140 width=516)
> (actual
> time=0.007..0.010 rows=1 loops=1)
>   ->  Hash  (cost=10.70..10.70 rows=70 width=1032) (actual
> time=0.026..0.026 rows=3 loops=1)
>         ->  Seq Scan on books a  (cost=0.00..10.70 rows=70 width=1032)
> (actual
> time=0.004..0.013 rows=3 loops=1)
> Total runtime: 0.094 ms
> (6 rows)
>
>
> #2
> test=# explain analyze select a.headline,a.content,b.content as contentb
> from books a
> left join interests b on a.content = b.content;
>                                                      QUERY PLAN
>
> -
>
----------------------------------------------------------------------------------------------------------------------
> Hash Left Join  (cost=11.75..23.72 rows=70 width=1548) (actual
> time=0.038..0.062 rows=3
> loops=1)
>   Hash Cond: (("outer".content)::text = ("inner".content)::text)
>   ->  Seq Scan on books a  (cost=0.00..10.70 rows=70 width=1032) (actual
> time=0.007..0.014 rows=3 loops=1)
>   ->  Hash  (cost=11.40..11.40 rows=140 width=516) (actual
> time=0.015..0.015 rows=1 loops=1)
>         ->  Seq Scan on interests b  (cost=0.00..11.40 rows=140 width=516)
> (actual
> time=0.004..0.007 rows=1 loops=1)
> Total runtime: 0.102 ms
> (6 rows)
>
> I fired each statement ten times. But seriously - this is more or less
> just a hint that a
> LEFT JOIN is less performant tahn a JOIN. To get that exactly it depend on
> which execution
> plan the planner is using. So the best way ist to use EXPALIN ANALYZE with
> your statements.
>
> Sorry for the big post with much output ...
>
> Cheers
>
> Andy
>


Hi Andy,

Thanks again for the fast reply !

> Why do you need an empty entry in this table? Your application should take
> care, that this
> does not happen ...

That is the whole point of my doubts.  When a user creates a new customer,
it's not an obligation to enter a customer_type, and still many other
fields, like tax, payment, yellow pages, payment method, etc... related
things.  So I always have lots of LEFT JOIN's in my queries, because of the
NULL fields.  You said the application should handle it.  So you mean, even
the field is not obligatory, you would put a value on it ?  All lookup
tables are of course linked through foreign keys to enforce data integrity.
So what value would you enter ?  Like 0, or -1 ??  And make sure there is a
0, or -1 record in the lookup table ?  Otherwise the foreign key will
complain.
I think I almost get your point, just need a last little push :-)

Thanks a lot !
Nico



Re: JOIN vs. LEFT JOIN

От
Andreas Wenk
Дата:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Nico Callewaert schrieb:
>>>> Nico Callewaert schrieb:
>>>>> Hi !
>>>>>
>>>>> I heard that it is always better to use a full JOIN rather than a LEFT
>>>>> JOIN, for performance reasons.  Is that correct ?
>>>>
>>>> please note that a JOIN and a LEFT JOIN are tow different things.
>>>> Let's assume you have
>>>> two tables (without any sens):
>>>>
>>>> books:
>>>>
>>>> headline |   content
>>>> - ----------+-------------
>>>> politics | russia
>>>> politics | middle east
>>>> politics | germany
>>>>
>>>> interests:
>>>>
>>>> content
>>>> - ---------
>>>> germany
>>>>
>>>> Then fire two select statements like this:
>>>>
>>>> #1:
>>>> SELECT a.headline,a.content,b.content as contentb
>>>> FROM books a
>>>> JOIN interests b ON a.content = b.content;
>>>>
>>>>  headline | content | contentb
>>>> - ----------+---------+----------
>>>> politics | germany | germany
>>>> (1 row)
>>>>
>>>> #2:
>>>> SELECT a.headline,a.content,b.content as contentb
>>>> FROM books a
>>>> LEFT JOIN interests b on a.content = b.content;
>>>> headline |   content   | contentb
>>>> - ----------+-------------+----------
>>>> politics | russia      |
>>>> politics | middle east |
>>>> politics | germany     | germany
>>>> (3 rows)
>>>>
>>>>> But it's barely possible to use full JOINS all the time, since most of
>>>>> the lookup fields are not required.
>>>>> Example : in the customer table, the customer type is a looup field
>>>>> to a
>>>>> customer_type table.  But that is not required to post the record.
>>>>> So I
>>>>> was thinking to create a record in the customer_type table with ID =
>>>>> -1.  And in case the customer type is NULL, to assign it the value -1.
>>>>> That way, it will be always possible to do a full JOIN.  I was
>>>>> wondering
>>>>> if that is good practice or rather nonsense ?
>>>>
>>>> Hm concerning the functionality of LEFT JOIN I do not really
>>>> understand what you wanna do
>>>> here. You created the customer_type table to have the possibility to
>>>> give more types to
>>>> one customer (1:n). But therefore you do not need a LEFT JOIN. The
>>>> statement could be like:
>>>>
>>>> SELECT name a
>>>> FROM customer a, cutomer_type b
>>>> WHERE a.id = b.id
>>>> AND b.type = 'super customer'
>>>>
>>>> Or not?
>>>>
>>>> Cheers
>>>>
>>>> Andy
>>>
>>>
>>> Hi Andreas,
>>>
>>> Thanks for the reply.  I know the difference between JOIN and LEFT JOIN.
>>> The question was just if there is a performance hit between the 2 ?
>>>
>>> I never use constructs like this : SELECT name a
>>>> FROM customer a, cutomer_type b
>>>> WHERE a.id = b.id
>>>> AND b.type = 'super customer'
>>>
>>> Always Joins.
>>> I will try to clarify.  The current situation is : the customer_type has
>>> no value, so = NULL, no problem with LEFT JOIN.
>>
>> Why do you need an empty entry in this table? Your application should
>> take care, that this
>> does not happen ...
>>
>>> But I'm wondering with tables that has thousands of records, if LEFT
>>> JOINS are performing well ?
>>
>> EXPLAIN ANALYZE with my examples:
>>
>> #1
>> test=# explain analyze select a.headline,a.content,b.content as
>> contentb from books a join
>> interests b on a.content = b.content;
>>                                                    QUERY PLAN
>>
>> -
>> ------------------------------------------------------------------------------------------------------------------
>>
>> Hash Join  (cost=10.88..23.67 rows=70 width=1548) (actual
>> time=0.051..0.058 rows=1 loops=1)
>>   Hash Cond: (("outer".content)::text = ("inner".content)::text)
>>   ->  Seq Scan on interests b  (cost=0.00..11.40 rows=140 width=516)
>> (actual
>> time=0.007..0.010 rows=1 loops=1)
>>   ->  Hash  (cost=10.70..10.70 rows=70 width=1032) (actual
>> time=0.026..0.026 rows=3 loops=1)
>>         ->  Seq Scan on books a  (cost=0.00..10.70 rows=70 width=1032)
>> (actual
>> time=0.004..0.013 rows=3 loops=1)
>> Total runtime: 0.094 ms
>> (6 rows)
>>
>>
>> #2
>> test=# explain analyze select a.headline,a.content,b.content as
>> contentb from books a
>> left join interests b on a.content = b.content;
>>                                                      QUERY PLAN
>>
>> -
>>
----------------------------------------------------------------------------------------------------------------------
>>
>> Hash Left Join  (cost=11.75..23.72 rows=70 width=1548) (actual
>> time=0.038..0.062 rows=3
>> loops=1)
>>   Hash Cond: (("outer".content)::text = ("inner".content)::text)
>>   ->  Seq Scan on books a  (cost=0.00..10.70 rows=70 width=1032) (actual
>> time=0.007..0.014 rows=3 loops=1)
>>   ->  Hash  (cost=11.40..11.40 rows=140 width=516) (actual
>> time=0.015..0.015 rows=1 loops=1)
>>         ->  Seq Scan on interests b  (cost=0.00..11.40 rows=140
>> width=516) (actual
>> time=0.004..0.007 rows=1 loops=1)
>> Total runtime: 0.102 ms
>> (6 rows)
>>
>> I fired each statement ten times. But seriously - this is more or less
>> just a hint that a
>> LEFT JOIN is less performant tahn a JOIN. To get that exactly it
>> depend on which execution
>> plan the planner is using. So the best way ist to use EXPALIN ANALYZE
>> with your statements.
>>
>> Sorry for the big post with much output ...
>>
>> Cheers
>>
>> Andy
>>
>
>
> Hi Andy,
>
> Thanks again for the fast reply !
>
>> Why do you need an empty entry in this table? Your application should
>> take care, that this
>> does not happen ...
>
> That is the whole point of my doubts.  When a user creates a new
> customer, it's not an obligation to enter a customer_type, and still
> many other fields, like tax, payment, yellow pages, payment method,
> etc... related things.  So I always have lots of LEFT JOIN's in my
> queries, because of the NULL fields.  You said the application should
> handle it.  So you mean, even the field is not obligatory, you would put
> a value on it ?  All lookup tables are of course linked through foreign
> keys to enforce data integrity. So what value would you enter ?  Like 0,
> or -1 ??  And make sure there is a 0, or -1 record in the lookup table
> ?  Otherwise the foreign key will complain.
> I think I almost get your point, just need a last little push :-)

Ah ok - now we are more in database/application design ;-). Here's another question: are
you really sure that the databse design is correct? Think about having all the info like
customer_type, tax, payment, yellow pages, payment method in the customer table also.
Menas - do you really need for these info more than one entry for one customer? So are you
sure you need the 1:n relation in all cases? I think not, so that would mean you put the
info about e.g tax, payment method and so on in the customer table as well and not in one
(or more ...? you were writing "all referenced tables") referenced tables. As a result you
will have less LEFT JOINS ...

Cheers

Andy

> Thanks a lot !
> Nico
>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFJgEvGVa7znmSP9AwRAtaZAJ4tRyx5hQ+/f6a1itFJjQygShXAGwCfai7a
7NpGhS3tegajf/i+NZ6VxF8=
=Dtb9
-----END PGP SIGNATURE-----

Re: JOIN vs. LEFT JOIN

От
"Nico Callewaert"
Дата:
>>>>> Nico Callewaert schrieb:
>>>>>> Hi !
>>>>>>
>>>>>> I heard that it is always better to use a full JOIN rather than a
>>>>>> LEFT
>>>>>> JOIN, for performance reasons.  Is that correct ?
>>>>>
>>>>> please note that a JOIN and a LEFT JOIN are tow different things.
>>>>> Let's assume you have
>>>>> two tables (without any sens):
>>>>>
>>>>> books:
>>>>>
>>>>> headline |   content
>>>>> - ----------+-------------
>>>>> politics | russia
>>>>> politics | middle east
>>>>> politics | germany
>>>>>
>>>>> interests:
>>>>>
>>>>> content
>>>>> - ---------
>>>>> germany
>>>>>
>>>>> Then fire two select statements like this:
>>>>>
>>>>> #1:
>>>>> SELECT a.headline,a.content,b.content as contentb
>>>>> FROM books a
>>>>> JOIN interests b ON a.content = b.content;
>>>>>
>>>>>  headline | content | contentb
>>>>> - ----------+---------+----------
>>>>> politics | germany | germany
>>>>> (1 row)
>>>>>
>>>>> #2:
>>>>> SELECT a.headline,a.content,b.content as contentb
>>>>> FROM books a
>>>>> LEFT JOIN interests b on a.content = b.content;
>>>>> headline |   content   | contentb
>>>>> - ----------+-------------+----------
>>>>> politics | russia      |
>>>>> politics | middle east |
>>>>> politics | germany     | germany
>>>>> (3 rows)
>>>>>
>>>>>> But it's barely possible to use full JOINS all the time, since most
>>>>>> of
>>>>>> the lookup fields are not required.
>>>>>> Example : in the customer table, the customer type is a looup field
>>>>>> to a
>>>>>> customer_type table.  But that is not required to post the record.
>>>>>> So I
>>>>>> was thinking to create a record in the customer_type table with ID =
>>>>>> -1.  And in case the customer type is NULL, to assign it the
>>>>>> value -1.
>>>>>> That way, it will be always possible to do a full JOIN.  I was
>>>>>> wondering
>>>>>> if that is good practice or rather nonsense ?
>>>>>
>>>>> Hm concerning the functionality of LEFT JOIN I do not really
>>>>> understand what you wanna do
>>>>> here. You created the customer_type table to have the possibility to
>>>>> give more types to
>>>>> one customer (1:n). But therefore you do not need a LEFT JOIN. The
>>>>> statement could be like:
>>>>>
>>>>> SELECT name a
>>>>> FROM customer a, cutomer_type b
>>>>> WHERE a.id = b.id
>>>>> AND b.type = 'super customer'
>>>>>
>>>>> Or not?
>>>>>
>>>>> Cheers
>>>>>
>>>>> Andy
>>>>
>>>>
>>>> Hi Andreas,
>>>>
>>>> Thanks for the reply.  I know the difference between JOIN and LEFT
>>>> JOIN.
>>>> The question was just if there is a performance hit between the 2 ?
>>>>
>>>> I never use constructs like this : SELECT name a
>>>>> FROM customer a, cutomer_type b
>>>>> WHERE a.id = b.id
>>>>> AND b.type = 'super customer'
>>>>
>>>> Always Joins.
>>>> I will try to clarify.  The current situation is : the customer_type
>>>> has
>>>> no value, so = NULL, no problem with LEFT JOIN.
>>>
>>> Why do you need an empty entry in this table? Your application should
>>> take care, that this
>>> does not happen ...
>>>
>>>> But I'm wondering with tables that has thousands of records, if LEFT
>>>> JOINS are performing well ?
>>>
>>> EXPLAIN ANALYZE with my examples:
>>>
>>> #1
>>> test=# explain analyze select a.headline,a.content,b.content as
>>> contentb from books a join
>>> interests b on a.content = b.content;
>>>                                                    QUERY PLAN
>>>
>>> -
>>> ------------------------------------------------------------------------------------------------------------------
>>>
>>> Hash Join  (cost=10.88..23.67 rows=70 width=1548) (actual
>>> time=0.051..0.058 rows=1 loops=1)
>>>   Hash Cond: (("outer".content)::text = ("inner".content)::text)
>>>   ->  Seq Scan on interests b  (cost=0.00..11.40 rows=140 width=516)
>>> (actual
>>> time=0.007..0.010 rows=1 loops=1)
>>>   ->  Hash  (cost=10.70..10.70 rows=70 width=1032) (actual
>>> time=0.026..0.026 rows=3 loops=1)
>>>         ->  Seq Scan on books a  (cost=0.00..10.70 rows=70 width=1032)
>>> (actual
>>> time=0.004..0.013 rows=3 loops=1)
>>> Total runtime: 0.094 ms
>>> (6 rows)
>>>
>>>
>>> #2
>>> test=# explain analyze select a.headline,a.content,b.content as
>>> contentb from books a
>>> left join interests b on a.content = b.content;
>>>                                                      QUERY PLAN
>>>
>>> -
>>>
----------------------------------------------------------------------------------------------------------------------
>>>
>>> Hash Left Join  (cost=11.75..23.72 rows=70 width=1548) (actual
>>> time=0.038..0.062 rows=3
>>> loops=1)
>>>   Hash Cond: (("outer".content)::text = ("inner".content)::text)
>>>   ->  Seq Scan on books a  (cost=0.00..10.70 rows=70 width=1032) (actual
>>> time=0.007..0.014 rows=3 loops=1)
>>>   ->  Hash  (cost=11.40..11.40 rows=140 width=516) (actual
>>> time=0.015..0.015 rows=1 loops=1)
>>>         ->  Seq Scan on interests b  (cost=0.00..11.40 rows=140
>>> width=516) (actual
>>> time=0.004..0.007 rows=1 loops=1)
>>> Total runtime: 0.102 ms
>>> (6 rows)
>>>
>>> I fired each statement ten times. But seriously - this is more or less
>>> just a hint that a
>>> LEFT JOIN is less performant tahn a JOIN. To get that exactly it
>>> depend on which execution
>>> plan the planner is using. So the best way ist to use EXPALIN ANALYZE
>>> with your statements.
>>>
>>> Sorry for the big post with much output ...
>>>
>>> Cheers
>>>
>>> Andy
>>>
>>
>>
>> Hi Andy,
>>
>> Thanks again for the fast reply !
>>
>>> Why do you need an empty entry in this table? Your application should
>>> take care, that this
>>> does not happen ...
>>
>> That is the whole point of my doubts.  When a user creates a new
>> customer, it's not an obligation to enter a customer_type, and still
>> many other fields, like tax, payment, yellow pages, payment method,
>> etc... related things.  So I always have lots of LEFT JOIN's in my
>> queries, because of the NULL fields.  You said the application should
>> handle it.  So you mean, even the field is not obligatory, you would put
>> a value on it ?  All lookup tables are of course linked through foreign
>> keys to enforce data integrity. So what value would you enter ?  Like 0,
>> or -1 ??  And make sure there is a 0, or -1 record in the lookup table
>> ?  Otherwise the foreign key will complain.
>> I think I almost get your point, just need a last little push :-)
>
> Ah ok - now we are more in database/application design ;-). Here's another
> question: are
> you really sure that the databse design is correct? Think about having all
> the info like
> customer_type, tax, payment, yellow pages, payment method in the customer
> table also.
> Menas - do you really need for these info more than one entry for one
> customer? So are you
> sure you need the 1:n relation in all cases? I think not, so that would
> mean you put the
> info about e.g tax, payment method and so on in the customer table as well
> and not in one
> (or more ...? you were writing "all referenced tables") referenced tables.
> As a result you
> will have less LEFT JOINS ...
>
> Cheers
>
> Andy
>

Hi Andy,

I don't know what are the group policies here ? In case I have to cut a part
of the message, just tell me, in case it gets too long.
And in case it gets off topic, just send me a warning :-)

Well, yes, we need all those 1:n relationships, because customer type can be
prospect, customer, not active.  Payment methods can be visa, cash, wire
transfer, etc...  Yellow pages are the category numbers.  VAT rules can be
21%, 6%, etc...  All of these fields are feeded through drop downs in the
application, so the customer can easily select a value.

Nico



Re: JOIN vs. LEFT JOIN

От
"Obe, Regina"
Дата:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Nico Callewaert schrieb:
>> Hi !
>>
>> I heard that it is always better to use a full JOIN rather than a
LEFT
>> JOIN, for performance reasons.  Is that correct ?
>
> please note that a JOIN and a LEFT JOIN are tow different things.
Let's
> assume you have
> two tables (without any sens):
>

This is a bit of the nitpick, but please don't call an INNER JOIN a FULL
JOIN.

Those are 2 separate animals altogether.

JOIN and INNER JOIN are the same, but I tend to put in the word INNER
for clarity
even though some may consider it redundant.

So in terms of performance

[INNER] JOIN -- fastest
LEFT JOIN -- generally slower (but there really is no alternative if you
don't want to leave out records  without matches
            and the performance hit is not worth the effort
of changing your data model to put in junk data that
            should be rightfully NULL, plus there are other
nifty tricks you can perform with LEFTS as I have here)

http://www.bostongis.com/blog/index.php?/archives/37-Explain-Analyze-Geo
metry-Relation-Operators-and-Joins-Except-Where.html

FULL JOIN -- rarely used and slower than INNER, LEFT, RIGHT, but it
comes in handy at times.

Okay enough of my preaching.

Go forth and prosper.

Thanks,
Regina
-----------------------------------------
The substance of this message, including any attachments, may be
confidential, legally privileged and/or exempt from disclosure
pursuant to Massachusetts law. It is intended
solely for the addressee. If you received this in error, please
contact the sender and delete the material from any computer.

Re: JOIN vs. LEFT JOIN

От
Andreas Wenk
Дата:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Nico Callewaert schrieb:
>>>>>> Nico Callewaert schrieb:
>>>>>>> Hi !
>>>>>>>
>>>>>>> I heard that it is always better to use a full JOIN rather than a
>>>>>>> LEFT
>>>>>>> JOIN, for performance reasons.  Is that correct ?
>>>>>>
>>>>>> please note that a JOIN and a LEFT JOIN are tow different things.
>>>>>> Let's assume you have
>>>>>> two tables (without any sens):
>>>>>>
>>>>>> books:
>>>>>>
>>>>>> headline |   content
>>>>>> - ----------+-------------
>>>>>> politics | russia
>>>>>> politics | middle east
>>>>>> politics | germany
>>>>>>
>>>>>> interests:
>>>>>>
>>>>>> content
>>>>>> - ---------
>>>>>> germany
>>>>>>
>>>>>> Then fire two select statements like this:
>>>>>>
>>>>>> #1:
>>>>>> SELECT a.headline,a.content,b.content as contentb
>>>>>> FROM books a
>>>>>> JOIN interests b ON a.content = b.content;
>>>>>>
>>>>>>  headline | content | contentb
>>>>>> - ----------+---------+----------
>>>>>> politics | germany | germany
>>>>>> (1 row)
>>>>>>
>>>>>> #2:
>>>>>> SELECT a.headline,a.content,b.content as contentb
>>>>>> FROM books a
>>>>>> LEFT JOIN interests b on a.content = b.content;
>>>>>> headline |   content   | contentb
>>>>>> - ----------+-------------+----------
>>>>>> politics | russia      |
>>>>>> politics | middle east |
>>>>>> politics | germany     | germany
>>>>>> (3 rows)
>>>>>>
>>>>>>> But it's barely possible to use full JOINS all the time, since
>>>>>>> most of
>>>>>>> the lookup fields are not required.
>>>>>>> Example : in the customer table, the customer type is a looup field
>>>>>>> to a
>>>>>>> customer_type table.  But that is not required to post the record.
>>>>>>> So I
>>>>>>> was thinking to create a record in the customer_type table with ID =
>>>>>>> -1.  And in case the customer type is NULL, to assign it the
>>>>>>> value -1.
>>>>>>> That way, it will be always possible to do a full JOIN.  I was
>>>>>>> wondering
>>>>>>> if that is good practice or rather nonsense ?
>>>>>>
>>>>>> Hm concerning the functionality of LEFT JOIN I do not really
>>>>>> understand what you wanna do
>>>>>> here. You created the customer_type table to have the possibility to
>>>>>> give more types to
>>>>>> one customer (1:n). But therefore you do not need a LEFT JOIN. The
>>>>>> statement could be like:
>>>>>>
>>>>>> SELECT name a
>>>>>> FROM customer a, cutomer_type b
>>>>>> WHERE a.id = b.id
>>>>>> AND b.type = 'super customer'
>>>>>>
>>>>>> Or not?
>>>>>>
>>>>>> Cheers
>>>>>>
>>>>>> Andy
>>>>>
>>>>>
>>>>> Hi Andreas,
>>>>>
>>>>> Thanks for the reply.  I know the difference between JOIN and LEFT
>>>>> JOIN.
>>>>> The question was just if there is a performance hit between the 2 ?
>>>>>
>>>>> I never use constructs like this : SELECT name a
>>>>>> FROM customer a, cutomer_type b
>>>>>> WHERE a.id = b.id
>>>>>> AND b.type = 'super customer'
>>>>>
>>>>> Always Joins.
>>>>> I will try to clarify.  The current situation is : the
>>>>> customer_type has
>>>>> no value, so = NULL, no problem with LEFT JOIN.
>>>>
>>>> Why do you need an empty entry in this table? Your application should
>>>> take care, that this
>>>> does not happen ...
>>>>
>>>>> But I'm wondering with tables that has thousands of records, if LEFT
>>>>> JOINS are performing well ?
>>>>
>>>> EXPLAIN ANALYZE with my examples:
>>>>
>>>> #1
>>>> test=# explain analyze select a.headline,a.content,b.content as
>>>> contentb from books a join
>>>> interests b on a.content = b.content;
>>>>                                                    QUERY PLAN
>>>>
>>>> -
>>>> ------------------------------------------------------------------------------------------------------------------
>>>>
>>>>
>>>> Hash Join  (cost=10.88..23.67 rows=70 width=1548) (actual
>>>> time=0.051..0.058 rows=1 loops=1)
>>>>   Hash Cond: (("outer".content)::text = ("inner".content)::text)
>>>>   ->  Seq Scan on interests b  (cost=0.00..11.40 rows=140 width=516)
>>>> (actual
>>>> time=0.007..0.010 rows=1 loops=1)
>>>>   ->  Hash  (cost=10.70..10.70 rows=70 width=1032) (actual
>>>> time=0.026..0.026 rows=3 loops=1)
>>>>         ->  Seq Scan on books a  (cost=0.00..10.70 rows=70 width=1032)
>>>> (actual
>>>> time=0.004..0.013 rows=3 loops=1)
>>>> Total runtime: 0.094 ms
>>>> (6 rows)
>>>>
>>>>
>>>> #2
>>>> test=# explain analyze select a.headline,a.content,b.content as
>>>> contentb from books a
>>>> left join interests b on a.content = b.content;
>>>>                                                      QUERY PLAN
>>>>
>>>> -
>>>>
----------------------------------------------------------------------------------------------------------------------
>>>>
>>>>
>>>> Hash Left Join  (cost=11.75..23.72 rows=70 width=1548) (actual
>>>> time=0.038..0.062 rows=3
>>>> loops=1)
>>>>   Hash Cond: (("outer".content)::text = ("inner".content)::text)
>>>>   ->  Seq Scan on books a  (cost=0.00..10.70 rows=70 width=1032)
>>>> (actual
>>>> time=0.007..0.014 rows=3 loops=1)
>>>>   ->  Hash  (cost=11.40..11.40 rows=140 width=516) (actual
>>>> time=0.015..0.015 rows=1 loops=1)
>>>>         ->  Seq Scan on interests b  (cost=0.00..11.40 rows=140
>>>> width=516) (actual
>>>> time=0.004..0.007 rows=1 loops=1)
>>>> Total runtime: 0.102 ms
>>>> (6 rows)
>>>>
>>>> I fired each statement ten times. But seriously - this is more or less
>>>> just a hint that a
>>>> LEFT JOIN is less performant tahn a JOIN. To get that exactly it
>>>> depend on which execution
>>>> plan the planner is using. So the best way ist to use EXPALIN ANALYZE
>>>> with your statements.
>>>>
>>>> Sorry for the big post with much output ...
>>>>
>>>> Cheers
>>>>
>>>> Andy
>>>>
>>>
>>>
>>> Hi Andy,
>>>
>>> Thanks again for the fast reply !
>>>
>>>> Why do you need an empty entry in this table? Your application should
>>>> take care, that this
>>>> does not happen ...
>>>
>>> That is the whole point of my doubts.  When a user creates a new
>>> customer, it's not an obligation to enter a customer_type, and still
>>> many other fields, like tax, payment, yellow pages, payment method,
>>> etc... related things.  So I always have lots of LEFT JOIN's in my
>>> queries, because of the NULL fields.  You said the application should
>>> handle it.  So you mean, even the field is not obligatory, you would put
>>> a value on it ?  All lookup tables are of course linked through foreign
>>> keys to enforce data integrity. So what value would you enter ?  Like 0,
>>> or -1 ??  And make sure there is a 0, or -1 record in the lookup table
>>> ?  Otherwise the foreign key will complain.
>>> I think I almost get your point, just need a last little push :-)
>>
>> Ah ok - now we are more in database/application design ;-). Here's
>> another question: are
>> you really sure that the databse design is correct? Think about having
>> all the info like
>> customer_type, tax, payment, yellow pages, payment method in the
>> customer table also.
>> Menas - do you really need for these info more than one entry for one
>> customer? So are you
>> sure you need the 1:n relation in all cases? I think not, so that
>> would mean you put the
>> info about e.g tax, payment method and so on in the customer table as
>> well and not in one
>> (or more ...? you were writing "all referenced tables") referenced
>> tables. As a result you
>> will have less LEFT JOINS ...
>>
>> Cheers
>>
>> Andy
>>
>
> Hi Andy,
>
> I don't know what are the group policies here ? In case I have to cut a
> part of the message, just tell me, in case it gets too long.
> And in case it gets off topic, just send me a warning :-)

don't worry ;-)

> Well, yes, we need all those 1:n relationships, because customer type
> can be prospect, customer, not active.  Payment methods can be visa,
> cash, wire transfer, etc...  Yellow pages are the category numbers.  VAT
> rules can be 21%, 6%, etc...  All of these fields are feeded through

Ok please don't get me wrong for asking again: one customer can have two tax rates? Why?
Because he's living in the USA and has a taxrate of 25% and lives also in Germany and has
19%? Hm ... strange ... ;-) . And also the custumer is 'prospect' OR 'not active' ... but
not both ... ;-)

> drop downs in the application, so the customer can easily select a value.

So all these dropdowns are multiselect ...?

I am not sure if I can help you with this. My idea is really to think about the design of
the application and / or the database ... again - please don't get me wrong ...
> Nico
>
>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFJgFKLVa7znmSP9AwRAlGkAJ9imdvkfk3iTw146Eru3BFB1llPZQCgqPXb
fYhH9zDowHAtjAYjfAHlhcM=
=Kpf5
-----END PGP SIGNATURE-----

Re: JOIN vs. LEFT JOIN

От
"Nico Callewaert"
Дата:
> Ok please don't get me wrong for asking again: one customer can have two
> tax rates? Why?
> Because he's living in the USA and has a taxrate of 25% and lives also in
> Germany and has
> 19%? Hm ... strange ... ;-) . And also the custumer is 'prospect' OR 'not
> active' ... but
> not both ... ;-)
>
>> drop downs in the application, so the customer can easily select a value.
>
> So all these dropdowns are multiselect ...?
>
> I am not sure if I can help you with this. My idea is really to think
> about the design of
> the application and / or the database ... again - please don't get me
> wrong ...


Hi Andreas,

I think we are not understanding each other :-)
It's al the time just 1 value, 1 value for tax, 1 value for customer type.
No multiselect drop downs.  But all these fields are of course referencing a
lookup table by ID.  But like Regina also said, the performance hit of the
LEFT JOIN seems not worth considering putting junk data in the database that
should be NULL (if the drop down is left blanc)...
I don't know if I made myself clear ?

Thanks !
Nico



Re: JOIN vs. LEFT JOIN

От
Andreas Wenk
Дата:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Obe, Regina schrieb:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> Nico Callewaert schrieb:
>>> Hi !
>>>
>>> I heard that it is always better to use a full JOIN rather than a
> LEFT
>>> JOIN, for performance reasons.  Is that correct ?
>> please note that a JOIN and a LEFT JOIN are tow different things.
> Let's
>> assume you have
>> two tables (without any sens):
>>
>
> This is a bit of the nitpick, but please don't call an INNER JOIN a FULL
> JOIN.
>
> Those are 2 separate animals altogether.
>
> JOIN and INNER JOIN are the same, but I tend to put in the word INNER
> for clarity
> even though some may consider it redundant.
>
> So in terms of performance
>
> [INNER] JOIN -- fastest
> LEFT JOIN -- generally slower (but there really is no alternative if you
> don't want to leave out records  without matches
>             and the performance hit is not worth the effort
> of changing your data model to put in junk data that
>             should be rightfully NULL, plus there are other
> nifty tricks you can perform with LEFTS as I have here)
>
> http://www.bostongis.com/blog/index.php?/archives/37-Explain-Analyze-Geo
> metry-Relation-Operators-and-Joins-Except-Where.html
>
> FULL JOIN -- rarely used and slower than INNER, LEFT, RIGHT, but it
> comes in handy at times.
>
> Okay enough of my preaching.

Cool - this is a good "preaching" to be totally correct and academic ;-)

Thank you !

Cheers

Andy

- --
St.Pauli - Hamburg - Germany

Andreas Wenk

>
> Go forth and prosper.
>
> Thanks,
> Regina
> -----------------------------------------
> The substance of this message, including any attachments, may be
> confidential, legally privileged and/or exempt from disclosure
> pursuant to Massachusetts law. It is intended
> solely for the addressee. If you received this in error, please
> contact the sender and delete the material from any computer.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFJgFMvVa7znmSP9AwRAgqIAKCk8z1rJLXaS8GnK7FsWdOjl+LjawCgnGvm
WXob+X3CZt0VRSgu8fM9ZaY=
=3UTC
-----END PGP SIGNATURE-----

Re: JOIN vs. LEFT JOIN

От
Andreas Wenk
Дата:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Nico Callewaert schrieb:
>
>> Ok please don't get me wrong for asking again: one customer can have
>> two tax rates? Why?
>> Because he's living in the USA and has a taxrate of 25% and lives also
>> in Germany and has
>> 19%? Hm ... strange ... ;-) . And also the custumer is 'prospect' OR
>> 'not active' ... but
>> not both ... ;-)
>>
>>> drop downs in the application, so the customer can easily select a
>>> value.
>>
>> So all these dropdowns are multiselect ...?
>>
>> I am not sure if I can help you with this. My idea is really to think
>> about the design of
>> the application and / or the database ... again - please don't get me
>> wrong ...
>
>
> Hi Andreas,
>
> I think we are not understanding each other :-)

sometimes I am good with that ;-)

> It's al the time just 1 value, 1 value for tax, 1 value for customer
> type. No multiselect drop downs.  But all these fields are of course
> referencing a lookup table by ID.  But like Regina also said, the
> performance hit of the LEFT JOIN seems not worth considering putting
> junk data in the database that should be NULL (if the drop down is left
> blanc)...
> I don't know if I made myself clear ?

yes you did - don't worry. I have just another idea how I would do that. But anyway I
think the basic question was if a LEFT JOIN has less performance comparing with a JOIN. So
now we have the answer for your case and that hopefully helps you to get your software
work well.

In all cases EXPLAIN ANALYZE is a good idea to use ...

> Thanks !
> Nico
>
Cheers

Andy

- --
St.Pauli - Hamburg - Germany

Andreas Wenk
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFJgFbqVa7znmSP9AwRAiqfAJ9TG2B+oqjvItHz9eIEk26oQkdBRQCgnpNc
NlJbJKUmidRrHk7b8k624DM=
=uliw
-----END PGP SIGNATURE-----

Re: JOIN vs. LEFT JOIN

От
Tom Lane
Дата:
"Obe, Regina" <robe.dnd@cityofboston.gov> writes:
> So in terms of performance

> [INNER] JOIN -- fastest
> LEFT JOIN -- generally slower (but there really is no alternative if you
> don't want to leave out records  without matches

This is nonsense.  A left join is not inherently slower than an inner
join.

What *is* true is that a left join constrains the optimizer more than an
inner join, ie some join reorderings are allowed for inner joins but
would change the answers if an outer join is involved.  So in the
context of a specific query you might get a slower plan if you use a
left join.  But you can't say that as a blanket statement.  In a lot
of cases there won't be any difference at all (particularly with more
recent PG versions --- our optimizer has gotten smarter about outer
joins over time).

> FULL JOIN -- rarely used and slower than INNER, LEFT, RIGHT, but it
> comes in handy at times.

Likewise, a full join isn't necessarily slow in itself, but it
constrains the possible plans quite a lot.

            regards, tom lane

Re: JOIN vs. LEFT JOIN

От
"Nico Callewaert"
Дата:
Thanks a lot Tom for explaining !


----- Original Message -----
From: "Tom Lane" <tgl@sss.pgh.pa.us>
To: "Obe, Regina" <robe.dnd@cityofboston.gov>
Cc: "Nico Callewaert" <callewaert.nico@telenet.be>; "Andreas Wenk"
<a.wenk@netzmeister-st-pauli.de>; <pgsql-novice@postgresql.org>
Sent: Wednesday, January 28, 2009 5:49 PM
Subject: Re: [NOVICE] JOIN vs. LEFT JOIN


> "Obe, Regina" <robe.dnd@cityofboston.gov> writes:
>> So in terms of performance
>
>> [INNER] JOIN -- fastest
>> LEFT JOIN -- generally slower (but there really is no alternative if you
>> don't want to leave out records  without matches
>
> This is nonsense.  A left join is not inherently slower than an inner
> join.
>
> What *is* true is that a left join constrains the optimizer more than an
> inner join, ie some join reorderings are allowed for inner joins but
> would change the answers if an outer join is involved.  So in the
> context of a specific query you might get a slower plan if you use a
> left join.  But you can't say that as a blanket statement.  In a lot
> of cases there won't be any difference at all (particularly with more
> recent PG versions --- our optimizer has gotten smarter about outer
> joins over time).
>
>> FULL JOIN -- rarely used and slower than INNER, LEFT, RIGHT, but it
>> comes in handy at times.
>
> Likewise, a full join isn't necessarily slow in itself, but it
> constrains the possible plans quite a lot.
>
> regards, tom lane
>
>