Thread: psql '\copy to' and unicode escapes

psql '\copy to' and unicode escapes

From
Steven Hirsch
Date:
I fear that I'm missing something very obvious, but I cannot find a syntax that permits me to use an escaped hexadecimal representation in a CSV file and have that representation interpreted as the equivalent unicode character when inserting into the database.  Both client and server are using UTF8 encoding.

For example, to insert the degree symbol, I've tried:

U&"\00b0"
E'\00b0'
"\u00b0"

In all cases, I simply get the literal string in the table, not the desired unicode character.

If I use them in an 'INSERT' statement, it works properly.  The problem is almost certainly between the chair and the keyboard, but what am I misunderstanding?

Re: psql '\copy to' and unicode escapes

From
"David G. Johnston"
Date:
On Mon, Feb 26, 2018 at 9:53 AM, Steven Hirsch <snhirsch@gmail.com> wrote:
I fear that I'm missing something very obvious, but I cannot find a syntax that permits me to use an escaped hexadecimal representation in a CSV file and have that representation interpreted as the equivalent unicode character when inserting into the database.

There isn't one. COPY treats input as literals and performs basically no processing on them. The system writing the CSV file would have to write the actual UTF-8 encoded character, not the text of its code point, directly into the document (i.e., a capable viewer would display whatever 00b0 is on-screen, or a placeholder if it is a non-printable character).
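If the program that produces the CSV cannot be changed, one workaround is to rewrite the file before loading it. The following is only a minimal sketch, not something proposed in this thread. It assumes the escapes appear in the form \u00b0 (backslash, u, four hex digits), and the helper names decode_escapes and convert_csv are hypothetical. It handles only code points in the Basic Multilingual Plane; surrogate pairs are not combined.

```python
import csv
import io
import re

# Matches a backslash, the letter u, and exactly four hex digits.
# Assumption: this is the escape form used in the input file.
_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")


def decode_escapes(text: str) -> str:
    """Replace each four-hex-digit escape with the character it names."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


def convert_csv(src, dst) -> None:
    """Copy CSV rows from src to dst, decoding escapes in every field."""
    writer = csv.writer(dst)
    for row in csv.reader(src):
        writer.writerow([decode_escapes(field) for field in row])


if __name__ == "__main__":
    # In-memory demonstration; real use would open files with encoding="utf-8"
    # and newline="" as the csv module recommends.
    sample = io.StringIO('id,reading\n1,"21\\u00b0C"\n')
    out = io.StringIO()
    convert_csv(sample, out)
    print(out.getvalue())
```

The converted file then contains the real degree character and can be loaded with \copy in CSV format as usual, provided the client encoding is UTF8.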

INSERT and COPY are two totally different animals:

INSERT INTO tbl (t) VALUES (trim('   jdjd   ')); -- stores jdjd, but put trim('   jdjd   ') in a CSV file and COPY stores the literal text "trim('   jdjd   ')"

David J.