Обсуждение: RIGHT JOIN Table Ordering Question

Поиск
Список
Период
Сортировка

RIGHT JOIN Table Ordering Question

От
Tara Piorkowski
Дата:
Hello.

I have a question about the ordering of joining of tables using the 
RIGHT JOIN syntax in the FROM clause of a query. Specifically, I've 
noticed that if I have one table on which I attempt to right join to two 
other tables, the ordering of the join conditions is important in the 
absense of those tables being noted in the WHERE clause. To illustrate, 
I've prepared the following script...

<script>

--
-- make fresh tables
--

DROP TABLE goals;
DROP TABLE goal_progress_notes;
DROP TABLE progress_ratings;

DROP SEQUENCE goals_goal_id_seq;
DROP SEQUENCE progress_ratings_rating_id_seq;
DROP SEQUENCE goal_progress_goal_prog_not_seq;

CREATE TABLE goals(goal_id serial NOT NULL PRIMARY KEY, statement text NOT NULL);

CREATE TABLE progress_ratings(rating_id serial NOT NULL PRIMARY KEY, description varchar(25) NOT NULL);

CREATE TABLE goal_progress_notes(goal_prog_note_id serial NOT NULL PRIMARY KEY, goal_id int NOT NULL REFERENCES goals,
rating_idint NOT NULL REFERENCES progress_ratings, statement text NOT NULL);
 

--
-- put in some test data
--

INSERT INTO goals (statement)VALUES ('Will write a correct SQL92 LEFT JOIN.');

INSERT INTO goals (statement)VALUES ('Will write a correct SQL92 RIGHT JOIN.');

INSERT INTO progress_ratings (description)VALUES ('Achieved');

INSERT INTO progress_ratings (description)VALUES ('Not Achieved');

INSERT INTO goal_progress_notes (goal_id, rating_id, statement)VALUES (1, 1, 'Tara is doing well.');

--
-- and run the queries in question
--

SELECT a.goal_prog_note_id,c.goal_id,c.statement,a.rating_id,b.description,a.statement
FROM goal_progress_notes aRIGHT JOIN progress_ratings b USING (rating_id)RIGHT JOIN goals c USING (goal_id)
WHERE c.goal_id = 1;

SELECT a.goal_prog_note_id,c.goal_id,c.statement,a.rating_id,b.description,a.statement
FROM goal_progress_notes aRIGHT JOIN progress_ratings b USING (rating_id)RIGHT JOIN goals c USING (goal_id)
WHERE c.goal_id = 2;

SELECT a.goal_prog_note_id,c.goal_id,c.statement,a.rating_id,b.description,a.statement
FROM goal_progress_notes aRIGHT JOIN goals c USING (goal_id)RIGHT JOIN progress_ratings b USING (rating_id)
WHERE c.goal_id = 1;

SELECT a.goal_prog_note_id,c.goal_id,c.statement,a.rating_id,b.description,a.statement
FROM goal_progress_notes aRIGHT JOIN goals c USING (goal_id)RIGHT JOIN progress_ratings b USING (rating_id)
WHERE c.goal_id = 2;

</script>

There are two sets of two queries at the end of the script which 
illustrate my question. In each set of queries there is one query in 
which a goal_progress_note (the a table) has an existing row and one in 
which it does not. In the first set, I get my desired result in both 
queries, the second one being key. That is, everything in tables b and c 
gets right joined (and the a data is null, of course).

However, when I switch the ordering of the two RIGHT JOIN 
specifications, the second query no longer produces any rows.

Obviously, I can get what I need by writing the query the first way. 
However, I couldn't explain why I got differences, despite doing some 
unsuccessful research into why this was. Thus, in the interest of 
understanding fully the RIGHT JOIN/LEFT JOIN sequence (I originally come 
from an Oracle SQL background, which is different) I would like to 
understand why the ordering makes a difference.

Thanks for any explanations anyone can give.

- Tara

-- 
Tara Piorkowski
System Administrator, vilaj.com
<http://www.vilaj.com/>



Re: RIGHT JOIN Table Ordering Question

От
Tom Lane
Дата:
Tara Piorkowski <tara@vilaj.com> writes:
> I have a question about the ordering of joining of tables using the 
> RIGHT JOIN syntax in the FROM clause of a query. Specifically, I've 
> noticed that if I have one table on which I attempt to right join to two 
> other tables, the ordering of the join conditions is important

Yup, it sure is.  Outer joins are not associative (nor commutative,
obviously).  When you write (a RIGHT JOIN b) RIGHT JOIN c, you first
have the result of the A/B join, which will be guaranteed to produce all
the joined rows an inner join would produce, plus a row with null A
values for each otherwise unmatched B row.  Then you take this and join
it to C, again adding rows for each unmatched C row.  This is not
associative because what's matched or not in the second step may depend
on what got added in the first step.  In your example, try looking at
the whole join results without any WHERE filter:

regression=# select * from goal_progress_notes a
regression-# RIGHT JOIN goals c USING (goal_id);goal_id | goal_prog_note_id | rating_id |      statement      |
     statement
 
---------+-------------------+-----------+---------------------+----------------------------------------      1 |
         1 |         1 | Tara is doing well. | Will write a correct SQL92 LEFT JOIN.      2 |                   |
   |                     | Will write a correct SQL92 RIGHT JOIN.
 
(2 rows)

regression=# select * from goal_progress_notes a
regression-# RIGHT JOIN goals c USING (goal_id)
regression-# RIGHT JOIN progress_ratings b USING (rating_id);rating_id | goal_id | goal_prog_note_id |      statement
  |               statement               | description
 
-----------+---------+-------------------+---------------------+---------------------------------------+--------------
     1 |       1 |                 1 | Tara is doing well. | Will write a correct SQL92 LEFT JOIN. | Achieved        2
|        |                   |                     |                                       | Not Achieved
 
(2 rows)

I would imagine that what you really want in this example is(goals c LEFT JOIN goal_progress_notes a USING (goal_id))
LEFT JOIN progress_ratings b USING (rating_id)
 
since you definitely want an output row for every goals row whether
there are notes or not, and you don't really want output rows for
progress ratings that happen to not be used currently (do you)?
Depending on your viewpoint about goal progress notes that don't
match any goal, perhaps the first join should be a FULL join.

> understanding fully the RIGHT JOIN/LEFT JOIN sequence (I originally come 
> from an Oracle SQL background, which is different)

AFAIK Oracle has the same semantics for left/right joins, just an
obscure syntax.
        regards, tom lane