Обсуждение: How to handle results with column names clash

Поиск
Список
Период
Сортировка

How to handle results with column names clash

От
Bartlomiej Korupczynski
Дата:
Hi,

I'm curious how do you handle results from multiple tables with
repeated column names. For example:

# CREATE TABLE c1 (id integer PRIMARY KEY, address inet);
# CREATE TABLE c2 (id integer PRIMARY KEY, address text);
# SELECT * FROM c1 JOIN c2 USING (id);
 id | address | address
----+---------+---------
(0 rows)
or:
# SELECT * FROM c1, c2 WHERE c1.id=c2.id;
 id | address | id | address
----+---------+----+---------
(0 rows)

Now lets say we want access results from PHP/perl/etc using column
names. We have "address" from c1, and the same from c2. We can't even
distinguish which one is from which table.

I see two available possibilities:
1. rename one or each column (eg. prefix with table name), but it's not
always acceptable and makes JOIN ... USING syntax useless (and it's
messy to change to JOIN .. ON for many columns), it would also not work
if we join on the same table twice or more,
2. select each column explicitly:
  SELECT c1.id, c1.address AS c1_address, c2.address AS c2.address
but this is nightmare for tables with many columns, especially if the
schema changes frequently.

Someone could say, that if we JOIN on some column, then it's the same
value, but it does not need to be always true -- we can join on
different columns in different queries.

Any other ideas?


3. Suggestion, but it would be probably hard to implement: to make SQL
engine prefix each returned column with table alias. Of course it would
not be a default behavior, but it would be enabled by some session wide
setting.

# SELECT * FROM c1, c2 WHERE c1.id=c2.id;
 c1.id | c1.address | c2.id | c2.address
[...]
# SELECT * FROM c1 JOIN c2 USING (id);
 ??id | c1.address | c2.address

As JOIN returns only one copy of id, it would be hard to decide about
results (could return one copy for each alias like above).

4. Probably also hard to implement, something like:
# SELECT c1.* AS c1_*, c2.* AS c2_* FROM ...


Or maybe 3 or 4 are already there?


Regards,
BK

Re: How to handle results with column names clash

От
Raymond O'Donnell
Дата:
On 28/09/2010 23:53, Bartlomiej Korupczynski wrote:
> Hi,
>
> I'm curious how do you handle results from multiple tables with
> repeated column names. For example:
>
> # CREATE TABLE c1 (id integer PRIMARY KEY, address inet);
> # CREATE TABLE c2 (id integer PRIMARY KEY, address text);
> # SELECT * FROM c1 JOIN c2 USING (id);
>   id | address | address
> ----+---------+---------
> (0 rows)
> or:
> # SELECT * FROM c1, c2 WHERE c1.id=c2.id;
>   id | address | id | address
> ----+---------+----+---------
> (0 rows)
>
> Now lets say we want access results from PHP/perl/etc using column
> names. We have "address" from c1, and the same from c2. We can't even
> distinguish which one is from which table.
>
> I see two available possibilities:
> 1. rename one or each column (eg. prefix with table name), but it's not
> always acceptable and makes JOIN ... USING syntax useless (and it's
> messy to change to JOIN .. ON for many columns), it would also not work
> if we join on the same table twice or more,
> 2. select each column explicitly:
>    SELECT c1.id, c1.address AS c1_address, c2.address AS c2.address
> but this is nightmare for tables with many columns, especially if the
> schema changes frequently.


In PHP you can access columns by index, using pg_fetch_array().

However, I think it's better to embrace the pain and use aliases for the
columns with duplicated names - makes your code much easier to read.

You could also create a view which defines the aliases for the columns,
presenting a consistent interface to the PHP code.

Ray.


--
Raymond O'Donnell :: Galway :: Ireland
rod@iol.ie

Re: How to handle results with column names clash

От
Darren Duncan
Дата:
Bartlomiej Korupczynski wrote:
> I'm curious how do you handle results from multiple tables with
> repeated column names. For example:
>
> # CREATE TABLE c1 (id integer PRIMARY KEY, address inet);
> # CREATE TABLE c2 (id integer PRIMARY KEY, address text);
> # SELECT * FROM c1 JOIN c2 USING (id);
>  id | address | address
> ----+---------+---------
> (0 rows)
> or:
> # SELECT * FROM c1, c2 WHERE c1.id=c2.id;
>  id | address | id | address
> ----+---------+----+---------
> (0 rows)
>
> Now lets say we want access results from PHP/perl/etc using column
> names. We have "address" from c1, and the same from c2. We can't even
> distinguish which one is from which table.

The only proper solution is for every resultset column to have a distinct
unqualified name, full-stop.

If you are joining tables that use the same name for different things, then you
have two good options:

1.  Rename the table columns to be unique, such as using "inet_addr" and
"street_addr".

2.  Use "AS" in your query to give the result columns unique names.

Similarly, id columns should be more descriptive to say what they are the id of
(eg, artist_id, track_id, etc), and use the same name for columns containing the
same data, and different names for different data, so approach #1; the main time
to deviate from this is if you have several columns with the same kind of data,
and then you use #2.

> I see two available possibilities:
> 1. rename one or each column (eg. prefix with table name), but it's not
> always acceptable and makes JOIN ... USING syntax useless (and it's
> messy to change to JOIN .. ON for many columns), it would also not work
> if we join on the same table twice or more,

Don't prefix with the table name if that doesn't make sense.  In your case, you
could call the field "c_id" in both tables for example.

Generally speaking, you *do* want a situation that lets you use "JOIN ... USING"
wherever possible.

> 2. select each column explicitly:
>   SELECT c1.id, c1.address AS c1_address, c2.address AS c2.address
> but this is nightmare for tables with many columns, especially if the
> schema changes frequently.

If you give the table columns good names, you generally won't have to do that.

> Someone could say, that if we JOIN on some column, then it's the same
> value, but it does not need to be always true -- we can join on
> different columns in different queries.

Yes you can, but with a well designed schema you would be joining on same-named
columns most of the time, and for the rest, you can use AS.

> Any other ideas?

I've given mine.

> 3. Suggestion, but it would be probably hard to implement: to make SQL
> engine prefix each returned column with table alias. Of course it would
> not be a default behavior, but it would be enabled by some session wide
> setting.
>
> # SELECT * FROM c1, c2 WHERE c1.id=c2.id;
>  c1.id | c1.address | c2.id | c2.address
> [...]
> # SELECT * FROM c1 JOIN c2 USING (id);
>  ??id | c1.address | c2.address
>
> As JOIN returns only one copy of id, it would be hard to decide about
> results (could return one copy for each alias like above).
>
> 4. Probably also hard to implement, something like:
> # SELECT c1.* AS c1_*, c2.* AS c2_* FROM ...

Some DBMSs already do this, and is a *bad* idea.

The fact that SQL lets you have a rowset with column names either duplicated or
missing is a horrible misfeature and one shouldn't rely on it.

-- Darren Duncan

Re: How to handle results with column names clash

От
Darren Duncan
Дата:
Darren Duncan wrote:
>> 3. Suggestion, but it would be probably hard to implement: to make SQL
>> engine prefix each returned column with table alias. Of course it would
>> not be a default behavior, but it would be enabled by some session wide
>> setting.
>>
>> # SELECT * FROM c1, c2 WHERE c1.id=c2.id;
>>  c1.id | c1.address | c2.id | c2.address
>> [...]
>> # SELECT * FROM c1 JOIN c2 USING (id);
>>  ??id | c1.address | c2.address
>>
>> As JOIN returns only one copy of id, it would be hard to decide about
>> results (could return one copy for each alias like above).
>>
>> 4. Probably also hard to implement, something like:
>> # SELECT c1.* AS c1_*, c2.* AS c2_* FROM ...
>
> Some DBMSs already do this, and is a *bad* idea.

Actually, I should clarify that it is the top 2 examples that some DBMSs already
do, and that's a bad idea.

What you proposed in #4 looks unique and might actually be useful, that just
being a shorthand for mass regular AS renames.

But what would be *more* useful in general is if SQL supported an all-but
syntax, where you explicitly named the columns you don't want when that is a
shorter list.  I know I've proposed this before.

-- Darren Duncan