Re: proposal: unescape_text function

Поиск
Список
Период
Сортировка
От Pavel Stehule
Тема Re: proposal: unescape_text function
Дата
Msg-id CAFj8pRBw-L4gsFLYJMMsLghio=qe9eh+QXWVj037VaCH+h6=eQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: proposal: unescape_text function  (Peter Eisentraut <peter.eisentraut@enterprisedb.com>)
Ответы Re: proposal: unescape_text function
Список pgsql-hackers


pá 27. 11. 2020 v 15:37 odesílatel Peter Eisentraut <peter.eisentraut@enterprisedb.com> napsal:
On 2020-10-07 11:00, Pavel Stehule wrote:
>         Since the idea originated from unescaping unicode string
>         literals i.e.
>                 select unescape('Odpov\u011Bdn\u00E1 osoba');
>
>         Shouldn't the built-in function support the above syntax as well?
>
>
>     good idea. The prefixes u (4 digits) and U (8 digits) are supported

I don't really get the point of this function.  There is AFAICT no
function to produce this escaped format, and it's not a recognized
interchange format.  So under what circumstances would one need to use this?

Some corporate data can be in CSV format with escaped unicode characters. Without this function it is not possible to decode these files without external application.

Postgres has support for this conversion, but only for string literals.

CREATE OR REPLACE FUNCTION public.unescape(text, text)  RETURNS text LANGUAGE plpgsql AS $function$ DECLARE result text; BEGIN   EXECUTE format('SELECT U&%s UESCAPE %s',                          quote_literal(replace($1, '\u','^')),                         quote_literal($2)) INTO result;   RETURN result; END; $function$

Because unicode is major encoding, I think this conversion should be supported. There is another question about implementation like in this patch implemented unicode_unescape function, or with some new conversion. Using conversion https://www.postgresql.org/docs/current/sql-createconversion.html  is probably better, but I am not sure how intuitive it is, and it is hard to use it (without not nice workarounds) in plpgsql.

I don't expect so Postgres should produce data in unicode escaped format, but can be useful, if Postgres can do some work with data in special format of major encoding.

postgres=# select convert_from(E'Odpov\u011Bdn\u00E1 osoba', 'UTF8');
┌─────────────────┐
│  convert_from   │
╞═════════════════╡
│ Odpovědná osoba │
└─────────────────┘
(1 row)

I can do this with bytea, but it is hard to use it with text fields.

I didn't find any way how to do it without ugly steps.

Regards

Pavel

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Justin Pryzby
Дата:
Сообщение: Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)
Следующее
От: Andreas Seltenreich
Дата:
Сообщение: [sqlsmith] Planner error on lateral joins