Обсуждение: typoed column name, but postgres didn't grump

Поиск
Список
Период
Сортировка

typoed column name, but postgres didn't grump

От
Jon Nelson
Дата:
I've been having trouble with a query.
The query is a cross join between two tables.
Initially, I mis-typed the query, and one of the columns specified in
the query doesn't exist, however the query ran nonetheless.

The actual query:
select gid from t2, t3 where t2.name = t3.name and t3.scope = 'city'
and t3.hierarchy = 'STANDARD' and t2.adiv = t3.adiv limit 1 ;
However, there *is* no column 'name' in table 't2'.
When I ran the query, it took a *really* long time to run (670 seconds).
When I corrected the query to use the right column name (city_name),
the query ran in 28ms.

The question, then, is why didn't the postgres grump about the
non-existent column name?

The version is 8.4.5 on x86_64, openSUSE 11.3

 PostgreSQL 8.4.5 on x86_64-unknown-linux-gnu, compiled by GCC gcc
(SUSE Linux) 4.5.0 20100604 [gcc-4_5-branch revision 160292], 64-bit


--
Jon

Re: typoed column name, but postgres didn't grump

От
Tom Lane
Дата:
Jon Nelson <jnelson+pgsql@jamponi.net> writes:
> Initially, I mis-typed the query, and one of the columns specified in
> the query doesn't exist, however the query ran nonetheless.

> The actual query:
> select gid from t2, t3 where t2.name = t3.name and t3.scope = 'city'
> and t3.hierarchy = 'STANDARD' and t2.adiv = t3.adiv limit 1 ;
> However, there *is* no column 'name' in table 't2'.

This is the old automatic-cast-from-record-to-text-string issue,
ie it treats this like "(t2.*)::name".

We've been over this a few times before, but it's not clear that
we can make this throw an error without introducing unpleasant
asymmetry into the casting behavior, as in you couldn't get the
cast when you did want it.

BTW this seems pretty far off-topic for pgsql-performance.

            regards, tom lane

Re: typoed column name, but postgres didn't grump

От
"Kevin Grittner"
Дата:
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> BTW this seems pretty far off-topic for pgsql-performance.

It is once you understand what's happening.  It was probably the 11+
minutes for the mistyped query run, versus the 28 ms without the
typo, that led them to this list.

I remembered this as an issued that has come up before, but couldn't
come up with good search criteria for finding the old thread before
you posted.  If you happen to have a reference or search criteria
for a previous thread, could you post it?  Otherwise, a brief
explanation of why this is considered a feature worth keeping would
be good.  I know it has been explained before, but it just looks
wrong, on the face of it.

Playing around with it a little, it seems like a rather annoying
foot-gun which could confuse people and burn a lot of development
time:

test=# create domain make text;
CREATE DOMAIN
test=# create domain model text;
CREATE DOMAIN
test=# create table vehicle (id int primary key, make make);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index
"vehicle_pkey" for table "vehicle"
CREATE TABLE
test=# insert into vehicle values (1,
'Toyota'),(2,'Ford'),(3,'Rambler');
INSERT 0 3
test=# select v.make, v.model from vehicle v;
  make   |    model
---------+-------------
 Toyota  | (1,Toyota)
 Ford    | (2,Ford)
 Rambler | (3,Rambler)
(3 rows)

If someone incorrectly thinks they've added a column, and the
purported column name happens to match any character-based type or
domain name, they can get a query which behaves in a rather
unexpected way. In this simple query it's pretty easy to spot, but
it could surface in a much more complex query.  If a mistyped query
runs for 11 days instead of 11 minutes, they may have a hard time
spotting the problem.

A typo like this could be particularly hazardous in a DELETE or
UPDATE statement.

-Kevin

Re: typoed column name, but postgres didn't grump

От
Tom Lane
Дата:
[ please continue any further discussion in pgsql-bugs only ]

"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> BTW this seems pretty far off-topic for pgsql-performance.

> It is once you understand what's happening.  It was probably the 11+
> minutes for the mistyped query run, versus the 28 ms without the
> typo, that led them to this list.

> I remembered this as an issued that has come up before, but couldn't
> come up with good search criteria for finding the old thread before
> you posted.  If you happen to have a reference or search criteria
> for a previous thread, could you post it?  Otherwise, a brief
> explanation of why this is considered a feature worth keeping would
> be good.  I know it has been explained before, but it just looks
> wrong, on the face of it.

What's going on here is an unpleasant interaction of several different
features:

1. The notations a.b and b(a) are equivalent: either one can mean the
column b of a table a, or an invocation of a function b() that takes
a's composite type as parameter.  This is an ancient PostQUEL-ism,
but we've preserved it because it is helpful for things like
emulating computed columns via functions.

2. The notation t(x) will be taken to mean x::t if there's no function
t() taking x's type, but there is a cast from x's type to t.  This is
just as ancient as #1.  It doesn't really add any functionality, but
I believe we would break a whole lot of users' code if we took it away.
Because of #1, this also means that x.t could mean x::t.

3. As of 8.4 or so, there are built-in casts available from pretty much
any type (including composites) to all the built-in string types, viz
text, varchar, bpchar, name.

Upshot is that t.name is a cast to type "name" if there's no column or
user-defined function that can match the call.  We've seen bug reports
on this with respect to both the "name" and "text" cases, though I'm
too lazy to trawl the archives for them just now.

So, if you want to throw an error for this, you have to choose which
of these other things you want to break.  I think if I had to pick a
proposal, I'd say we should disable #2 for the specific case of casting
a composite type to something else.  The intentional uses I've seen were
all scalar types; and before 8.4 there was no built-in functionality
that such a call could match.  If we slice off some other part of the
functionality, we risk breaking apps that've worked for many years.

            regards, tom lane