Re: Import/Export failure due to UTF-8 error in pgAdmin4 but not in pgAdmin3

Поиск
Список
Период
Сортировка
От richard coleman
Тема Re: Import/Export failure due to UTF-8 error in pgAdmin4 but not in pgAdmin3
Дата
Msg-id CAGA3vBuJSQxxjNd4tkCqNyXhenw3jipR96AGoGUdHn57x8g4qA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Import/Export failure due to UTF-8 error in pgAdmin4 but not in pgAdmin3  (Dave Page <dpage@pgadmin.org>)
Ответы Re: Import/Export failure due to UTF-8 error in pgAdmin4 but not in pgAdmin3  (Dave Page <dpage@pgadmin.org>)
Список pgadmin-support
Dave, 

I can't speak to Nania's specific issue, but I believe it's a pgAdmin4 specific problem, at least in so far as SQL_ASCII is concerned.  I say this because I can usually work with the data just fine from the psql prompt, but not through pgAdmin4 (or other postgreSQL GUI's like dBeaver that rely on the JDBC connection).  .Net/Windows ODBC drivers and psql command prompt, no problem (as was pgAdmin3 assuming you don't do too much with it beyond select/update/insert).  pgAdmin4, SELECT, export, etc. BOOM! At least until you cleaned  up the offending bytes.

Just my $0.02.

rik.





On Mon, Jan 7, 2019 at 12:49 PM Dave Page <dpage@pgadmin.org> wrote:
On Mon, Jan 7, 2019 at 9:11 PM richard coleman
<rcoleman.ascentgl@gmail.com> wrote:
>
> Nania,
>
> Welcome to the wonderful world of pgAdmin4.  I have been bitten often by this particular shortcoming in pgAdmin4. :(  My issue seems to stems from the fact that I use ASCII tables as a back end for a .Net windows application and perfectly valid windows (Word/Excel) characters cause pgAdmin4 no end of issues.

pgAdmin has nothing to do with this. It is simply calling PostgreSQL's
psql utility, and telling it to import or export the file. The
database server is then throwing the error seen.

> My solution (with the help of some fine people on the postgres IRC channel) is to run a couple of functions on my tables/fields to locate and clean the offending characters out.  Of course, if you need those characters, then this won't actually help.  Here they are in the advent that they might prove helpful/adaptable to your situation.

The problem with that is that you're trying to fix something that's
basically broken to begin with. From the PostgreSQL docs
(https://www.postgresql.org/docs/current/multibyte.html):

----
The SQL_ASCII setting behaves considerably differently from the other
settings. When the server character set is SQL_ASCII, the server
interprets byte values 0-127 according to the ASCII standard, while
byte values 128-255 are taken as uninterpreted characters. No encoding
conversion will be done when the setting is SQL_ASCII. Thus, this
setting is not so much a declaration that a specific encoding is in
use, as a declaration of ignorance about the encoding. In most cases,
if you are working with any non-ASCII data, it is unwise to use the
SQL_ASCII setting because PostgreSQL will be unable to help you by
converting or validating non-ASCII characters.
----


> Finds what pgAdmin4 considers bad UTF8:
> CREATE OR REPLACE FUNCTION live.is_utf8(
> text)
>     RETURNS boolean
>     LANGUAGE 'sql'
>
>     COST 100
>     VOLATILE
> AS $BODY$
>     select encode(convert_to($1,'SQL_ASCII'),'hex')
>            ~ $r$(?x)
>                 ^(?:(?:[0-7][0-9a-f])
>                    |(?:(?:c[2-9a-f]|d[0-9a-f])
>                       |e0[ab][0-9a-f]
>                       |ed[89][0-9a-f]
>                       |(?:(?:e[1-9abcef])
>                          |f0[9ab][0-9a-f]
>                          |f[1-3][89ab][0-9a-f]
>                          |f48[0-9a-f]
>                        )[89ab][0-9a-f]
>                     )[89ab][0-9a-f]
>                  )*$
>              $r$;
> $BODY$;
>
> ALTER FUNCTION live.is_utf8(text)
>     OWNER TO postgres;
>
>
> Fixes what pgAdmin4 considers to be bad UTF8:
>>
>> CREATE OR REPLACE FUNCTION live.badutf8(
>> text)
>>     RETURNS text
>>     LANGUAGE 'sql'
>>     COST 100
>>     VOLATILE
>> AS $BODY$
>>     select regexp_replace(encode(convert_to($1,'SQL_ASCII'),'hex'),
>>              $r$(?x)
>>                  (?:(?:[0-7][0-9a-f])
>>                    |(?:(?:c[2-9a-f]|d[0-9a-f])
>>                       |e0[ab][0-9a-f]
>>                       |ed[89][0-9a-f]
>>                       |(?:(?:e[1-9abcef])
>>                          |f0[9ab][0-9a-f]
>>                          |f[1-3][89ab][0-9a-f]
>>                          |f48[0-9a-f]
>>                        )[89ab][0-9a-f]
>>                     )[89ab][0-9a-f]
>>                  )*(..)?
>>              $r$, '-\1-', 'g')
>> $BODY$;
>> ALTER FUNCTION live.badutf8(text)
>>     OWNER TO postgres;
>
>
> Fixes bad UTF8
>
>
>
> On Mon, Jan 7, 2019 at 8:40 AM Nanina Tron <nanina.tron@icloud.com> wrote:
>>
>> Hi,
>>
>> I am pretty new to PostgreSQL so I might just miss something basic here.
>>
>> My problem is that, I cannot import or export some of the tables in my db with pgAdmin4, as it raises the “ERROR: unvalid byte-sequenz for coding »UTF8«: 0xdf 0x67“”. The table was originally created with Excel and imported via pgAdmin3. The strange thing is that it can still be imported and exported with pgAdmin3 but not with pgAdmin4. The db was created with encoding UTF-8, the .csv files where created with encoding UTF-8 and also the import/export dialog is set to UTF-8. Queries are also no problem on these tables so it seems to me that this could be a client problem.
>>
>> I am running PostgreSQL 11.1 on a server (I don’t know the OS, maintained with pgAdmin4). Locally I am working on a Windows 7 Professional (Service Pack 1) 64 Bit-System and pgAdmin4  3.6 & pgAdmin3.
>>
>> I did not find any hint of the same problem on my Google or archive search, so I would be very grateful for any idea what I am doing wrong here.
>>
>> Best,
>>
>> Nanina



--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

В списке pgadmin-support по дате отправления:

Предыдущее
От: Dave Page
Дата:
Сообщение: Re: Import/Export failure due to UTF-8 error in pgAdmin4 but not in pgAdmin3
Следующее
От: Dave Page
Дата:
Сообщение: Re: Import/Export failure due to UTF-8 error in pgAdmin4 but not in pgAdmin3