Re: text and bytea

Поиск
Список
Период
Сортировка
От hernan gonzalez
Тема Re: text and bytea
Дата
Msg-id 48692c2d0802250852w7401bdc4i6f420d4eecf3e933@mail.gmail.com
обсуждение исходный текст
Ответ на Re: text and bytea  ("hernan gonzalez" <hgonzalez@gmail.com>)
Ответы Re: text and bytea
Список pgsql-general
Another example  (Psotgresql 8.3.0, UTF-8  server/client encoding)

test=# create table chartest ( c text);
test=# insert into chartest (c) values ('¡Hasta mañana!');
test=# create view vchartest as
 select encode(convert_to(c,'LATIN9'),'escape') as c1 from chartest;

test=# select c,octet_length(c) from chartest ;
       c        | octet_length
----------------+--------------
 ¡Hasta mañana! |           16

test=# select c1,octet_length(c1) from vchartest ;
      c1      | octet_length
--------------+--------------
 Hasta maana! |           14

(the field is seen as as text by postgresql, with the default
encoding.. UTF8; it is actually not)

test=# select * from vchartest where c1 like '%a%';
      c1
--------------
 Hasta maana!
(1 row)

test=# select * from vchartest where c1 ilike '%a%';
ERROR:  invalid byte sequence for encoding "UTF8": 0xa1
HINT:  This error can also happen if the byte sequence does not match
the encoding expected by the server, which is controlled by
"client_encoding".

That "ilike" breaks and "like" is rather random, it seems that the
later has some optimization does not check the validty of the utf8
stream. But thats not the point. The point is that IMO postgresql
should always handle text in the backend encoding, there should no
exists funcions that are designed to produce/consume texts in other
encodings. Perhaps the "encode" function is ill defined, and should be
rethinked. Two alternatives:
1. For special binary-to-ascii encodings (base64,hex). Keep its
present signature but remove/deprecate the "escape" type. It returns a
text in the backend encoding.
2  For arbitrary binary encodings. Change its signature so that it
returns bytea.
Of course, all this applies symmetrically to decode().

Appart, and in accordance with this, I think to_ascii() should accept
only one text argument.

В списке pgsql-general по дате отправления:

Предыдущее
От: Gregory Stark
Дата:
Сообщение: Re: process pool
Следующее
От: Enrico Sirola
Дата:
Сообщение: Re: APEX / HTML DB for PostgreSQL