Обсуждение: 7.2. Table Expressions FULL join is only supported with merge-joinable join conditions

Поиск
Список
Период
Сортировка

7.2. Table Expressions FULL join is only supported with merge-joinable join conditions

От
Grzegorz Szpetkowski
Дата:
http://www.postgresql.org/docs/9.0/static/queries-table-expressions.html

"The join condition specified with ON can also contain conditions that
do not relate directly to the join. This can prove useful for some
queries but needs to be thought out carefully. For example:

=> SELECT * FROM t1 LEFT JOIN t2 ON t1.num = t2.num AND t2.value = 'xxx';"

Can you add information about FULL JOIN in doc there ? I am trying to
run such query:

SELECT * FROM t1 FULL JOIN t2 ON t1.num = t2.num AND t2.value = 'xxx';
ERROR:  FULL JOIN is only supported with merge-joinable join conditions

I mean add something like: "Note that you can't use such conditions
with FULL JOIN, only equality of selected columns is supported with
this type".

Thanks,
Grzegorz Szpetkowski

Re: 7.2. Table Expressions FULL join is only supported with merge-joinable join conditions

От
Robert Haas
Дата:
On Mon, May 16, 2011 at 7:32 PM, Grzegorz Szpetkowski
<gszpetkowski@gmail.com> wrote:
> http://www.postgresql.org/docs/9.0/static/queries-table-expressions.html
>
> "The join condition specified with ON can also contain conditions that
> do not relate directly to the join. This can prove useful for some
> queries but needs to be thought out carefully. For example:
>
> => SELECT * FROM t1 LEFT JOIN t2 ON t1.num = t2.num AND t2.value = 'xxx';"
>
> Can you add information about FULL JOIN in doc there ? I am trying to
> run such query:
>
> SELECT * FROM t1 FULL JOIN t2 ON t1.num = t2.num AND t2.value = 'xxx';
> ERROR:  FULL JOIN is only supported with merge-joinable join conditions
>
> I mean add something like: "Note that you can't use such conditions
> with FULL JOIN, only equality of selected columns is supported with
> this type".

This works in 9.1.  A possibly relevant point is that the semantics
are not what you might think:

rhaas=# create table t1 (num int, value text);
CREATE TABLE
rhaas=# create table t2 (num int, value text);
CREATE TABLE
rhaas=# insert into t1 values (1, 'yyy'), (2, 'yyy');
INSERT 0 2
rhaas=# insert into t2 values (2, 'yyy'), (3, 'yyy');
INSERT 0 2
rhaas=# select * from t1 full join t2 on t1.num = t2.num and t2.value = 'xxx';
 num | value | num | value
-----+-------+-----+-------
   1 | yyy   |     |
   2 | yyy   |     |
     |       |   2 | yyy
     |       |   3 | yyy
(4 rows)

It's very possible that a user who is writing this meant one of the following:

select * from t1 full join (select * from t2 where t2.value = 'xxx')
t2 on t1.num = t2.num;
select * from t1 full join t2 on t1.num = t2.num WHERE t2.value = 'xxx';

...which are not equivalent to each other, or to the original query.
It'd be nice to document this better, but I don't have a clear feeling
for exactly what is needed.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: 7.2. Table Expressions FULL join is only supported with merge-joinable join conditions

От
"Kevin Grittner"
Дата:
Robert Haas <robertmhaas@gmail.com> wrote:
> Grzegorz Szpetkowski  <gszpetkowski@gmail.com> wrote:

>> "The join condition specified with ON can also contain conditions
>> that do not relate directly to the join."

I think the trouble starts with that sentence, which I believe to be
completely false and misleading.

Simplifying a real-life instance of such confusion among our
programmers:

SELECT <Party columns>, Demographic.dob
  FROM Party LEFT JOIN Demographic
  ON (<Party.pkcols = Demographic.pkcols>
        AND Demographic.dod is NULL);

Which makes absence of date of death part of the outer join
criteria.  So you get all the parties, dead or alive; and only show
date of birth for those not known to be dead.  What they really
wanted to do was exclude parties known to be dead, and for those
parties listed, show date of birth if available.  So they wanted:

SELECT <Party columns>, Demographic.dob
  FROM Party LEFT JOIN Demographic
  ON (<Party.pkcols = Demographic.pkcols>)
  WHERE Demographic.dod is NULL;

Conditions in the ON clause *do* relate to the JOIN -- it's just
that the join might be on conditions other than primary key
equality.

Let's not contribute to muddy thinking by making incorrect
statements like that.

> I don't have a clear feeling for exactly what is needed.

I think the thing which is most likely to surprise people is that
the result can contain rows which are not in the Cartesian product
of joining the two relations.  We might want to point that out,
mention that it's an OUTER JOIN in *both* directions, and maybe give
an example which is half-way plausible as a use-case.  Maybe
something similar to:

test=# create table n_en (n int, word text);
CREATE TABLE
test=# create table n_de (n int, wort text);
CREATE TABLE
test=# insert into n_en values (1,'one'),(2,'two');
INSERT 0 2
test=# insert into n_de values (2, 'zwei'),(3,'drei');
INSERT 0 2
test=# select * from n_en full join n_de using (n);
 n | word | wort
---+------+------
 1 | one  |
 2 | two  | zwei
 3 |      | drei
(3 rows)

And that works to show the difference between:

test=# select * from n_en full join n_de
test-# on (n_en.n = n_de.n and n_de.n > 2);
 n | word | n | wort
---+------+---+------
 1 | one  |   |
 2 | two  |   |
   |      | 2 | zwei
   |      | 3 | drei
(4 rows)

and:

test=# select * from n_en full join n_de
test-# on (n_en.n = n_de.n) where n_de.n > 2;
 n | word | n | wort
---+------+---+------
   |      | 3 | drei
(1 row)

-Kevin