BUG #17501: COPY is failing with "ERROR: invalid byte sequence for encoding "UTF8": 0xe5"

From: PG Bug reporting form
Subject: BUG #17501: COPY is failing with "ERROR: invalid byte sequence for encoding "UTF8": 0xe5"
Msg-id: 17501-128b1dd039362ae6@postgresql.org
Replies: Re: BUG #17501: COPY is failing with "ERROR: invalid byte sequence for encoding "UTF8": 0xe5" (Heikki Linnakangas <hlinnaka@iki.fi>)
List: pgsql-bugs

The following bug has been logged on the website:

Bug reference:      17501
Logged by:          Vitaly Voronov
Email address:      wizard_1024@tut.by
PostgreSQL version: 14.3
Operating system:   CentOS Linux release 7.9.2009 (Core)
Description:

Hello,

We have run into the following bug: the COPY command fails with
"ERROR:  invalid byte sequence for encoding "UTF8": 0xe5" when loading a file.
A file with the same content but fewer lines is imported without any errors.

How to reproduce the bug:
# create a database with encoding SQL_ASCII, collation C, ctype C
createdb --encoding=SQL_ASCII --lc-collate=C --lc-ctype=C --template=template0 test

# connect to the database
psql test

# Create table
CREATE TABLE test_data (
    test_data text
);

# Import without error
truncate table test_data;
COPY test_data (test_data) FROM '/tmp/test_pass.csv' WITH DELIMITER AS ','
CSV QUOTE AS '"';

COPY 207

# Import with error
truncate table test_data;
COPY test_data (test_data) FROM '/tmp/test_fail.csv' WITH DELIMITER AS ','
CSV QUOTE AS '"';

ERROR:  invalid byte sequence for encoding "UTF8": 0xe5
CONTEXT:  COPY test_data, line 627
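
For reference, 0xe5 is not an unusual byte in this data: it is the first byte of the UTF-8 encoding of several characters in the sample text, for example 光 (e5 85 89). A minimal sketch for dumping the raw bytes of the reported line, assuming standard sed, od, tr and cut are available; dump_line is a hypothetical helper name, not part of PostgreSQL or psql:

```shell
# dump_line FILE N: print the bytes of line N of FILE as space-separated hex.
# Hypothetical helper, for illustration only.
dump_line() {
    # Print only line N, hex-dump it, and squeeze od's padding into single spaces.
    sed -n "${2}{p;q;}" "$1" | od -An -tx1 | tr -s ' \n' ' '
    echo
}

# Show the first bytes of the line named in the CONTEXT message,
# but only if the file from the failing COPY is present.
if [ -f /tmp/test_fail.csv ]; then
    dump_line /tmp/test_fail.csv 627 | cut -c1-48
fi
```

If the dump shows each e5 followed by two continuation bytes in the range 80 to bf, the line itself is well-formed UTF-8.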

# Both files contain identical rows; test_fail.csv just contains more of them.
# It seems that a file larger than 65K cannot be imported.
# If the database is created with UTF8 encoding instead of SQL_ASCII, both
# imports succeed.

# How to generate the files:
# Imported without errors
for i in $(seq 1 207); do echo "NURO光です。明日の宅内工事お立合いよろしくお願い致します。2回目の屋外工事につきましては具体的な工事日案内の準備が整い次第、こちらからご連絡いたします。※詳細はこちら【工事について】https://www.test.jp/1234/5678.html&id=12211" >> /tmp/test_pass.csv; done
# Imported with errors
for i in $(seq 1 5722); do echo "NURO光です。明日の宅内工事お立合いよろしくお願い致します。2回目の屋外工事につきましては具体的な工事日案内の準備が整い次第、こちらからご連絡いたします。※詳細はこちら【工事について】https://www.test.jp/1234/5678.html&id=12211" >> /tmp/test_fail.csv; done
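
One way to confirm that the generated files themselves are well-formed is to measure them and pass them through iconv, which exits with a non-zero status on invalid input. This is only a sketch, assuming iconv and wc are installed; check_utf8 is a hypothetical helper name:

```shell
# check_utf8 FILE: report byte count, line count, and UTF-8 validity of FILE.
# Hypothetical helper, for illustration only.
check_utf8() {
    file=$1
    # Arithmetic expansion strips any padding some wc implementations add.
    bytes=$(($(wc -c < "$file")))
    lines=$(($(wc -l < "$file")))
    # Converting UTF-8 to UTF-8 fails if the input is not valid UTF-8.
    if iconv -f UTF-8 -t UTF-8 "$file" > /dev/null 2>&1; then
        verdict="valid UTF-8"
    else
        verdict="INVALID UTF-8"
    fi
    echo "$file: $bytes bytes, $lines lines, $verdict"
}

# Check both generated files, skipping any that do not exist.
for f in /tmp/test_pass.csv /tmp/test_fail.csv; do
    if [ -f "$f" ]; then
        check_utf8 "$f"
    fi
done
```

If both files are reported as valid UTF-8, that would suggest the input data is not at fault, and the byte counts can be compared against the 65K threshold mentioned above.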

# Both files can be imported into PostgreSQL 11 without any problem.

