Обсуждение: BUG #14920: TEXT binding not works correctly with BPCHAR

Поиск
Список
Период
Сортировка

BUG #14920: TEXT binding not works correctly with BPCHAR

От
jorsol@gmail.com
Дата:
The following bug has been logged on the website:

Bug reference:      14920
Logged by:          Jorge Solorzano
Email address:      jorsol@gmail.com
PostgreSQL version: 10.1
Operating system:   Ubuntu 16.04
Description:

TEXT type is the preferred data type for the category String, but I'm having
a hard time using it with bpchar:

The reproducible use case is here:
1. Create a table with a bpchar(3).
2. Insert a tuple with trailing space.
3. If I select the values using varchar, it works fine with and without
trailing space.
4. If I select the values using text, it only works without trailing
space.

This behavior is odd and is not documented (at least I not found it). This
can be a limiting factor to use the text data type for strings instead of
varchar.

------------------------------------------

DROP TABLE IF EXISTS texttable;
CREATE TABLE texttable (
 ch bpchar(3), te text, vc varchar(3)
);

PREPARE fooplaninsert (text, text, text) AS
    INSERT INTO texttable (ch, te, vc) VALUES ($1, $2, $3);
EXECUTE fooplaninsert('c  ', 'c  ', 'c  ');

-- USING VARCHAR
PREPARE fooplanselectvc (varchar, varchar, varchar) AS
    SELECT ch, te, vc FROM texttable WHERE ch=$1 AND te=$2 AND vc=$3;
-- THIS WORKS WITH AND WITHOUT TRAILING SPACE
EXECUTE fooplanselectvc('c  ', 'c  ', 'c  ');
EXECUTE fooplanselectvc('c', 'c  ', 'c  ');

-- USING TEXT
PREPARE fooplanselecttx (text, text, text) AS
    SELECT ch, te, vc FROM texttable WHERE ch=$1 AND te=$2 AND vc=$3;
-- THIS NOT WORKS WITH TRAILING SPACE
EXECUTE fooplanselecttx('c  ', 'c  ', 'c  ');
-- THIS WORKS WITHOUT TRAILING SPACE
EXECUTE fooplanselecttx('c', 'c  ', 'c  ');


Re: BUG #14920: TEXT binding not works correctly with BPCHAR

От
Tom Lane
Дата:
jorsol@gmail.com writes:
> TEXT type is the preferred data type for the category String, but I'm having
> a hard time using it with bpchar:

I believe what you're showing here can be reduced to these cases:

regression=# select 'c'::char(3) = 'c'::text;?column? 
----------t
(1 row)

regression=# select 'c'::char(3) = 'c  '::text;?column? 
----------f
(1 row)

That is, the = operator is resolved as text = text, for which
trailing spaces in the strings are significant.  But when we
promote the bpchar value to text, we strip its trailing spaces,
which are deemed not significant.  So we have 'c' = 'c' and
'c' != 'c  '.

regression=# select 'c'::char(3) = 'c'::varchar;?column? 
----------t
(1 row)

regression=# select 'c'::char(3) = 'c  '::varchar;?column? 
----------t
(1 row)

Here, the = operator is resolved as bpchar = bpchar, in which
trailing spaces aren't significant period.

I forget at the moment exactly why these choices of how to resolve
the ambiguous comparison operator get made, but most likely it has
to do with text being a preferred type while varchar hasn't even
got any operators of its own.  Reading the type conversions chapter
of the manual would help you identify why that happens.

These behaviors are of long standing and we're very unlikely to
change them.  If you don't like them, don't use bpchar; it's a
legacy datatype of little real value anyway.
        regards, tom lane


Re: BUG #14920: TEXT binding not works correctly with BPCHAR

От
Jorge Solórzano
Дата:
Thanks Tom,

That is a clear explanation,

Reading the type conversions chapter
of the manual would help you identify why that happens.

​Sadly the manual ​is not clear enough, one can read something like this:

All type conversion rules are designed with several principles in mind: * Implicit conversions should never have surprising or unpredictable outcomes


I think this falls down in surprising and unpredictable outcome​.

Also in:
Trailing spaces are removed when converting a character value to one of the other string types.


​I have searched the implicit conversions table​ and found that a cast from char to text and varchar both have rtrim1 as prosrc, so is still unclear to me this behavior.

select p.prosrc, * from pg_cast c join pg_proc p on c.castfunc = p.oid
where c.castsource = 'char'::regtype
and c.casttarget in ('text'::regtype, 'varchar'::regtype)


I understand that the bpchar is a legacy data type and should not be used, but the reason I ask is because I'm working in refactoring a driver for postgres wich use VARCHAR for sending strings in the protocol and I would like to change it to TEXT since is the prefered type, so my options are keep using VARCHAR or implicitly rtrim all strings send from the driver (is this a sane choice?)

thank you for your time,

cheers,


Jorge Solórzano

On Tue, Nov 21, 2017 at 6:49 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
jorsol@gmail.com writes:
> TEXT type is the preferred data type for the category String, but I'm having
> a hard time using it with bpchar:

I believe what you're showing here can be reduced to these cases:

regression=# select 'c'::char(3) = 'c'::text;
 ?column?
----------
 t
(1 row)

regression=# select 'c'::char(3) = 'c  '::text;
 ?column?
----------
 f
(1 row)

That is, the = operator is resolved as text = text, for which
trailing spaces in the strings are significant.  But when we
promote the bpchar value to text, we strip its trailing spaces,
which are deemed not significant.  So we have 'c' = 'c' and
'c' != 'c  '.

regression=# select 'c'::char(3) = 'c'::varchar;
 ?column?
----------
 t
(1 row)

regression=# select 'c'::char(3) = 'c  '::varchar;
 ?column?
----------
 t
(1 row)

Here, the = operator is resolved as bpchar = bpchar, in which
trailing spaces aren't significant period.

I forget at the moment exactly why these choices of how to resolve
the ambiguous comparison operator get made, but most likely it has
to do with text being a preferred type while varchar hasn't even
got any operators of its own.  Reading the type conversions chapter
of the manual would help you identify why that happens.

These behaviors are of long standing and we're very unlikely to
change them.  If you don't like them, don't use bpchar; it's a
legacy datatype of little real value anyway.

                        regards, tom lane