Обсуждение: string cast/compare broken?
Mac OS X: postgres% psql --version psql (PostgreSQL) 7.2.1 contains support for: multibyte LEDEV=# create table test1 (foo varchar(5)); CREATE LEDEV=# create table test2 (foo char(5)); CREATE LEDEV=# insert into test2 (foo) values ('S'); INSERT 3724249 1 LEDEV=# insert into test1 (foo) values ('S'); INSERT 3724250 1 LEDEV=# select a.foo, b.foo from test1 a, test2 b where a.foo = b.foo::text; foo | foo -----+----- (0 rows) LEDEV=# select a.foo = 'S', b.foo = 'S' from test1 a, test2 b; ?column? | ?column? ----------+---------- t | t (1 row) LEDEV=# select a.foo, b.foo from test1 a, test2 b where CAST(a.foo as CHAR) = b.foo; foo | foo -----+------- S | S (1 row)
Scott Royston <scroyston@mac.com> writes: > [ various examples of comparing char and varchar ] I see no bug here. For the CHAR datatype, trailing spaces are defined to be insignificant. For VARCHAR and TEXT, trailing spaces are significant. If you want to compare a CHAR value to a VARCHAR or TEXT value, your best bet is a locution likertrim(charval) = varcharval regards, tom lane
On Fri, 2002-07-12 at 08:50, Tom Lane wrote: > Scott Royston <scroyston@mac.com> writes: > > [ various examples of comparing char and varchar ] > > I see no bug here. For the CHAR datatype, trailing spaces are defined > to be insignificant. For VARCHAR and TEXT, trailing spaces are > significant. If you want to compare a CHAR value to a VARCHAR or TEXT > value, your best bet is a locution like > rtrim(charval) = varcharval I guess the strangest part was that both a.foo = 'S' and b.foo = 'S' but not a.foo=b.foo; (a.foo is varchar(5) , b.foo is char(5) ) I guess that tha 'S' that b.foo gets compared to is converted to 'S ' before comparison but when comparing varchar(5) and char(5) they are both compared by converting them to varchar which keeps the trailing spaces from char(5). If the conversion where varchar(5) --> char(5) then they would compare equal. I vaguely remember something in the standard about cases when comparing char() types should discard extra spaces. ------------- Hannu
> I guess the strangest part was that both a.foo = 'S' and b.foo = 'S' but > not a.foo=b.foo; (a.foo is varchar(5) , b.foo is char(5) ) > > I guess that tha 'S' that b.foo gets compared to is converted to 'S ' > before comparison but when comparing varchar(5) and char(5) they are > both compared by converting them to varchar which keeps the trailing > spaces from char(5). Yes, I think this is inconvenient/unintuitive. If it is doable according to standards, this should imho be fixed. > If the conversion where varchar(5) --> char(5) then > they would compare equal. I am not sure, since, if the varchar stored 'S ' then the comparison to a char 'S' should probably still fail, since those spaces in the varchar are significant. Informix compares them equal, so I guess argumentation can be made in that direction too (that currently evades my understanding of intuitive reasoning :-). Andreas
"Zeugswetter Andreas SB SD" <ZeugswetterA@spardat.at> writes: >> If the conversion where varchar(5) --> char(5) then >> they would compare equal. > I am not sure, since, if the varchar stored 'S ' then the comparison > to a char 'S' should probably still fail, There is no comparison of varchar to char: regression=# select 'z'::char = 'z'::varchar; ERROR: Unable to identify an operator '=' for types 'character' and 'character varying' You will have to retype thisquery using an explicit cast regression=# I consider this a feature, not a bug, since it's quite unclear which semantics ought to be used. The cases Scott originally posted all involved various forms of coercion to force both sides to be the same type; I'm not sure that he quite understood why he had to do that, but perhaps it's now becoming clear. I wonder whether it would be a good idea to stop considering char as binary-compatible to varchar and text. Instead we could set things up so that there is a coercion function involved, namely rtrim(). But that would probably make us diverge even further from the spec. Has anyone studied how other DBMSs handle CHAR vs VARCHAR? Judging from the number of questions we get on this point, I have to wonder if we are not out of step with the way other systems do it. regards, tom lane
> Has anyone studied how other DBMSs handle CHAR vs VARCHAR? Judging > from the number of questions we get on this point, I have to wonder > if we are not out of step with the way other systems do it. Well, I already gave the Informix example, that compares them as equal. (they obviously coerce varchar to char) In nearly all cases I have seen so far the different handling of trailing blanks is not wanted. In most of these varchar is simply used instead of char to save disk space. In Informix ESQL/C there is a host variable type CSTRINGTYPE that automatically rtrims columns of char type upon select. Imho the advantages of an automatic coercion would outweigh the few corner cases where the behavior would not be intuitive to everybody. Andreas
On Fri, 12 Jul 2002, Tom Lane wrote: > "Zeugswetter Andreas SB SD" <ZeugswetterA@spardat.at> writes: > >> If the conversion where varchar(5) --> char(5) then > >> they would compare equal. > > > I am not sure, since, if the varchar stored 'S ' then the comparison > > to a char 'S' should probably still fail, > > There is no comparison of varchar to char: > > regression=# select 'z'::char = 'z'::varchar; > ERROR: Unable to identify an operator '=' for types 'character' and 'character varying' > You will have to retype this query using an explicit cast > regression=# > > I consider this a feature, not a bug, since it's quite unclear which > semantics ought to be used. > > The cases Scott originally posted all involved various forms of > coercion to force both sides to be the same type; I'm not sure > that he quite understood why he had to do that, but perhaps it's now > becoming clear. > > I wonder whether it would be a good idea to stop considering char > as binary-compatible to varchar and text. Instead we could set > things up so that there is a coercion function involved, namely > rtrim(). But that would probably make us diverge even further > from the spec. > > Has anyone studied how other DBMSs handle CHAR vs VARCHAR? Judging > from the number of questions we get on this point, I have to wonder > if we are not out of step with the way other systems do it. I don't think it's just a CHAR vs VARCHAR issue. AFAICT the spec defines all of this in terms of the collations used and there are (imho arcane) rules about converting between them for comparisons and operations. Technically I think varcharcol=charcol *is* illegal if we are saying that char has a collation with PAD SPACE and varchar has a collation with NO PAD, because they're different collations and character value expressions from column reference are implicit and that doesn't allow comparison between two different collations. Of course I could also be misreading it.
There is no comparison of varchar to char in Oracle too. Scott provided cast cases are some unique features in psql, each database MAY handle those casting differently. In good design/application, char should be replaced by varchar type unless you know the exact bytes. It would be not bad idea to get rid of char gradually in the future to avoid such inconsistency between databases, that's just my view. johnl > -----Original Message----- > From: pgsql-hackers-owner@postgresql.org > [mailto:pgsql-hackers-owner@postgresql.org]On Behalf Of Zeugswetter > Andreas SB SD > Sent: Friday, July 12, 2002 8:49 AM > To: Tom Lane > Cc: Hannu Krosing; Scott Royston; pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] string cast/compare broken? > > > > > Has anyone studied how other DBMSs handle CHAR vs VARCHAR? Judging > > from the number of questions we get on this point, I have to wonder > > if we are not out of step with the way other systems do it. > > Well, I already gave the Informix example, that compares them as equal. > (they obviously coerce varchar to char) > > In nearly all cases I have seen so far the different handling of trailing > blanks is not wanted. In most of these varchar is simply used > instead of char to > save disk space. > > In Informix ESQL/C there is a host variable type CSTRINGTYPE that > automatically > rtrims columns of char type upon select. > > Imho the advantages of an automatic coercion would outweigh the > few corner cases > where the behavior would not be intuitive to everybody. > > Andreas > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Don't 'kill -9' the postmaster
On Fri, Jul 12, 2002 at 03:48:59PM +0200, Zeugswetter Andreas SB SD wrote: > Imho the advantages of an automatic coercion would outweigh the few > corner cases where the behavior would not be intuitive to > everybody. How then would one get the correct behaviour from char()? A -- ---- Andrew Sullivan 87 Mowat Avenue Liberty RMS Toronto, Ontario Canada <andrew@libertyrms.info> M6K 3E3 +1 416 646 3304 x110