Обсуждение: typoed column name, but postgres didn't grump
I've been having trouble with a query. The query is a cross join between two tables. Initially, I mis-typed the query, and one of the columns specified in the query doesn't exist, however the query ran nonetheless. The actual query: select gid from t2, t3 where t2.name = t3.name and t3.scope = 'city' and t3.hierarchy = 'STANDARD' and t2.adiv = t3.adiv limit 1 ; However, there *is* no column 'name' in table 't2'. When I ran the query, it took a *really* long time to run (670 seconds). When I corrected the query to use the right column name (city_name), the query ran in 28ms. The question, then, is why didn't the postgres grump about the non-existent column name? The version is 8.4.5 on x86_64, openSUSE 11.3 PostgreSQL 8.4.5 on x86_64-unknown-linux-gnu, compiled by GCC gcc (SUSE Linux) 4.5.0 20100604 [gcc-4_5-branch revision 160292], 64-bit -- Jon
Jon Nelson <jnelson+pgsql@jamponi.net> writes: > Initially, I mis-typed the query, and one of the columns specified in > the query doesn't exist, however the query ran nonetheless. > The actual query: > select gid from t2, t3 where t2.name = t3.name and t3.scope = 'city' > and t3.hierarchy = 'STANDARD' and t2.adiv = t3.adiv limit 1 ; > However, there *is* no column 'name' in table 't2'. This is the old automatic-cast-from-record-to-text-string issue, ie it treats this like "(t2.*)::name". We've been over this a few times before, but it's not clear that we can make this throw an error without introducing unpleasant asymmetry into the casting behavior, as in you couldn't get the cast when you did want it. BTW this seems pretty far off-topic for pgsql-performance. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote: > BTW this seems pretty far off-topic for pgsql-performance. It is once you understand what's happening. It was probably the 11+ minutes for the mistyped query run, versus the 28 ms without the typo, that led them to this list. I remembered this as an issued that has come up before, but couldn't come up with good search criteria for finding the old thread before you posted. If you happen to have a reference or search criteria for a previous thread, could you post it? Otherwise, a brief explanation of why this is considered a feature worth keeping would be good. I know it has been explained before, but it just looks wrong, on the face of it. Playing around with it a little, it seems like a rather annoying foot-gun which could confuse people and burn a lot of development time: test=# create domain make text; CREATE DOMAIN test=# create domain model text; CREATE DOMAIN test=# create table vehicle (id int primary key, make make); NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "vehicle_pkey" for table "vehicle" CREATE TABLE test=# insert into vehicle values (1, 'Toyota'),(2,'Ford'),(3,'Rambler'); INSERT 0 3 test=# select v.make, v.model from vehicle v; make | model ---------+------------- Toyota | (1,Toyota) Ford | (2,Ford) Rambler | (3,Rambler) (3 rows) If someone incorrectly thinks they've added a column, and the purported column name happens to match any character-based type or domain name, they can get a query which behaves in a rather unexpected way. In this simple query it's pretty easy to spot, but it could surface in a much more complex query. If a mistyped query runs for 11 days instead of 11 minutes, they may have a hard time spotting the problem. A typo like this could be particularly hazardous in a DELETE or UPDATE statement. -Kevin
[ please continue any further discussion in pgsql-bugs only ] "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > Tom Lane <tgl@sss.pgh.pa.us> wrote: >> BTW this seems pretty far off-topic for pgsql-performance. > It is once you understand what's happening. It was probably the 11+ > minutes for the mistyped query run, versus the 28 ms without the > typo, that led them to this list. > I remembered this as an issued that has come up before, but couldn't > come up with good search criteria for finding the old thread before > you posted. If you happen to have a reference or search criteria > for a previous thread, could you post it? Otherwise, a brief > explanation of why this is considered a feature worth keeping would > be good. I know it has been explained before, but it just looks > wrong, on the face of it. What's going on here is an unpleasant interaction of several different features: 1. The notations a.b and b(a) are equivalent: either one can mean the column b of a table a, or an invocation of a function b() that takes a's composite type as parameter. This is an ancient PostQUEL-ism, but we've preserved it because it is helpful for things like emulating computed columns via functions. 2. The notation t(x) will be taken to mean x::t if there's no function t() taking x's type, but there is a cast from x's type to t. This is just as ancient as #1. It doesn't really add any functionality, but I believe we would break a whole lot of users' code if we took it away. Because of #1, this also means that x.t could mean x::t. 3. As of 8.4 or so, there are built-in casts available from pretty much any type (including composites) to all the built-in string types, viz text, varchar, bpchar, name. Upshot is that t.name is a cast to type "name" if there's no column or user-defined function that can match the call. We've seen bug reports on this with respect to both the "name" and "text" cases, though I'm too lazy to trawl the archives for them just now. So, if you want to throw an error for this, you have to choose which of these other things you want to break. I think if I had to pick a proposal, I'd say we should disable #2 for the specific case of casting a composite type to something else. The intentional uses I've seen were all scalar types; and before 8.4 there was no built-in functionality that such a call could match. If we slice off some other part of the functionality, we risk breaking apps that've worked for many years. regards, tom lane