Re: encoding confusion with \copy command

From Martin Waite
Subject Re: encoding confusion with \copy command
Date
Msg-id CAOWKics9d+G9V7AF4UdnJQpbDLxSfwTekAC_rCuFdTykEt+5PQ@mail.gmail.com
In reply to encoding confusion with \copy command  (Martin Waite <waite.134@gmail.com>)
Responses Re: encoding confusion with \copy command  (Adrian Klaver <adrian.klaver@aklaver.com>)
List pgsql-general
Hi Adrian,

I apologise - I meant 9.4

regards,
Martin

On 17 September 2014 14:35, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 09/17/2014 03:03 AM, Martin Waite wrote:
Hi,

I have a postgresql 7.4 server and client on Centos 6.4.  The database
server is using UTF-8 encoding.

First, I think we need to establish what version of Postgres you are using. Are you really using 7.4?


I have been exploring the use of the \copy command for importing CSV
data generated by SQL Server 2008.  SQL Server 2008 export tool does not
escape quotes that are in the content of fields, and so it is useful to
be able to specify obscure characters in the quote option in the \copy
command to work around this issue.

When I run the following commands in psql, I am surprised that QUOTE is
limited to characters in the range 0x01 - 0x7f, and that UTF8 is
mentioned in the error message if characters outside the range are chosen:

    \encoding WIN1252
    \copy yuml from '/tmp/yuml.csv' WITH CSV HEADER ENCODING 'WIN1252' QUOTE as E'\xff';
    ERROR:  invalid byte sequence for encoding "UTF8": 0xff
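
The error message can be reproduced outside the database. The following is a minimal Python sketch, not a description of PostgreSQL internals: it only shows that the lone byte 0xff, which is what E'\xff' denotes, is never valid UTF-8, while the same byte is an ordinary character in WIN1252 (called `cp1252` in Python). The helper name `is_valid_utf8` is made up for this illustration.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


quote_byte = b"\xff"

# 0xff can never appear in well-formed UTF-8, so a UTF-8 check rejects it.
print(is_valid_utf8(quote_byte))

# The same byte is a perfectly good character in WIN1252 (U+00FF).
print(quote_byte.decode("cp1252") == "\u00ff")
```

This suggests the QUOTE literal is being checked as UTF-8 text rather than as WIN1252, which is consistent with the error text but is an inference from the message, not something confirmed in this thread.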


If you are actually on Postgres 7.4 the above would not be a viable command.




I thought that if the client (psql) is WIN1252, and the CSV file is
specified as WIN1252, then I could specify any valid WIN1252 character
as the quote character.   Instead, I am limited to the range of
characters that can be encoded as a single byte in UTF-8. Actually, 0x00
is not accepted either, so the range is 0x01 - 0x7F.
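
The observed range can be checked with a short Python sketch. It assumes Python's `cp1252` codec as a stand-in for WIN1252 and simply asks which byte values map to a character that UTF-8 encodes in one byte. The function name `single_byte_in_utf8` is hypothetical.

```python
def single_byte_in_utf8(byte_value: int, source_encoding: str = "cp1252") -> bool:
    """True if the character at byte_value in source_encoding is one byte in UTF-8."""
    try:
        char = bytes([byte_value]).decode(source_encoding)
    except UnicodeDecodeError:
        # A few byte values are unassigned in cp1252.
        return False
    return len(char.encode("utf-8")) == 1


survivors = [b for b in range(256) if single_byte_in_utf8(b)]

# Lowest and highest byte values that stay single-byte in UTF-8.
print(hex(min(survivors)), hex(max(survivors)))

# Every WIN1252 character at 0x80 or above needs more than one byte in UTF-8.
print(len(bytes([0xE9]).decode("cp1252").encode("utf-8")))
```

The sketch reports 0x00 through 0x7f. The extra exclusion of 0x00 noted above is a separate restriction and is not modelled here.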

Is this a bug or expected behaviour?

Is it the case that the server does the actual CSV parsing and that,
because my server is in UTF8, I am therefore limited to single-byte
UTF8 characters?

Actually, depending on the version, you may be limited to ASCII.
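
If the quote character has to stay within 0x01 - 0x7F, one possible workaround is to pick a byte in that range that never occurs in the export. The sketch below is an assumption-laden illustration, not an established recipe: the function name `find_unused_quote_byte` is hypothetical, and it assumes a comma delimiter and skips the newline bytes so they are never chosen.

```python
from typing import Optional


def find_unused_quote_byte(data: bytes) -> Optional[int]:
    """Return the lowest byte in 0x01-0x7F absent from data, or None if all are used."""
    present = set(data)
    # Assumed delimiter (comma) and line terminators must not double as the quote.
    reserved = {0x0A, 0x0D, 0x2C}
    for candidate in range(0x01, 0x80):
        if candidate in reserved:
            continue
        if candidate not in present:
            return candidate
    return None


# Tiny in-memory stand-in for the exported CSV file.
sample = b'id,name\n1,He said "hi"\n'
unused = find_unused_quote_byte(sample)
print(None if unused is None else hex(unused))
```

For a real file, the bytes would be read with `open(path, "rb").read()`, and the resulting value could then be tried as the QUOTE option in escape-string form, for example E'\x01'.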


regards,
Martin


--
Adrian Klaver
adrian.klaver@aklaver.com
