Re: Import/Export failure due to UTF-8 error in pgAdmin4 but not in pgAdmin3

From: Dave Page
Subject: Re: Import/Export failure due to UTF-8 error in pgAdmin4 but not in pgAdmin3
Date:
Msg-id: CA+OCxoy4gZLf+M2ES75i1PLv8eZo_FhxAm2UvUau6-kA57iMEw@mail.gmail.com
In response to: Re: Import/Export failure due to UTF-8 error in pgAdmin4 but not in pgAdmin3  (richard coleman <rcoleman.ascentgl@gmail.com>)
List: pgadmin-support
Hi

On Tue, Jan 8, 2019 at 7:32 PM richard coleman
<rcoleman.ascentgl@gmail.com> wrote:
>
> Dave,
>
> I would imagine Nanina would be in a better position to provide you with
> problematic import/export data in the short term. I don't tend to
> import/export that often these days, preferring to use SQL statements for
> most things short of a full backup/restore (in my case I've found it to be
> much less picky). As mentioned previously, in my experience the characters
> that tend to trip up pgAdmin4 are Windows special characters. I would
> imagine the upper Windows-1252 character set as being particularly
> problematic for pgAdmin4 if it is expecting proper UTF-8 (i.e.
> ŒœŠšŸŽžƒˆ˜–—‘’‚“”„†‡•…‰‹›€™). This would explain why Windows ODBC, .NET, and
> psql have no problems dealing with the data. I would imagine if the
> database was set up with ENCODING = 'WIN1252' then PostgreSQL would do the
> translation into UTF-8 for pgAdmin4, but since it isn't, PostgreSQL can't
> provide pgAdmin4 with any help.

Right - and that's kinda the point. PostgreSQL is a database that is
designed to enforce integrity rules on your data, whether those be
around encoding, or table constraints, strong typing, foreign keys
etc. Those strengths are amongst the reasons most of us chose it in
the first place, rather than one of the NoSQL databases that are
usually much more forgiving in many respects.

> It's up to pgAdmin4 to deal with the otherwise valid data appropriately.

How can it? If there is no encoding specified (because you're using
SQL_ASCII, with values > 127) the behaviour is undefined by
definition. Any attempts to deal with such data will be hit and miss
because there is no possible way for pgAdmin to know how the data is
supposed to be interpreted. You know your data is Win1252 encoded, but
for all pgAdmin knows, it could be Win1253.
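
To illustrate the ambiguity, here is a minimal Python sketch. It is not pgAdmin code, and the helper name decode_candidates is made up for the example. It decodes the same byte above 127 under a few candidate encodings, on the assumption that nothing else tells us which one was intended:

```python
from typing import Dict, Optional, Sequence


def decode_candidates(
    raw: bytes,
    encodings: Sequence[str] = ("utf-8", "cp1252", "cp1253"),
) -> Dict[str, Optional[str]]:
    """Decode the same bytes under each candidate encoding.

    Returns a mapping of encoding name to decoded text, or None where
    the bytes are not valid in that encoding.
    """
    results: Dict[str, Optional[str]] = {}
    for name in encodings:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results


if __name__ == "__main__":
    # 0xE9 is a single byte with a value above 127.
    for name, text in decode_candidates(b"\xe9").items():
        print(name, repr(text))
```

The byte is rejected as UTF-8, reads as é under Win1252 and as ι under Win1253. All three outcomes are equally plausible to a client that has only been told "SQL_ASCII".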

The best option is to use the correct encoding for the database, or if
you have data that really doesn't conform to any encoding standard,
use the bytea datatype.
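
If you do know the data is Win1252, one way to get to a correctly declared encoding is to transcode it before loading it into a UTF8 database. A rough sketch of that step in Python, assuming every byte really is Win1252 (the function name win1252_to_utf8 is hypothetical, and this is not something pgAdmin does for you):

```python
def win1252_to_utf8(raw: bytes) -> bytes:
    """Re-encode bytes known to be Windows-1252 as UTF-8.

    Raises UnicodeDecodeError if a byte has no Windows-1252 mapping,
    which suggests the assumption about the source encoding is wrong.
    """
    return raw.decode("cp1252").encode("utf-8")


if __name__ == "__main__":
    # 0x93 and 0x94 are the Windows "smart quotes" in Windows-1252.
    converted = win1252_to_utf8(b"\x93quoted\x94")
    print(converted.decode("utf-8"))
```

Failing loudly on unmapped bytes is deliberate here: silently replacing them would hide exactly the kind of data that does not conform to the assumed encoding.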

Anyway, I've said my piece. I'll go investigate the workaround in a
moment and report back.

> I hope your workaround pans out; until then I will spend my time at the psql
> prompt, or if the data is needed elsewhere, run the two functions I had
> included previously to identify and remove the offending characters.
>
> Here's the create database script for one of my databases; perhaps it can
> shed some light (it was originally an 8.3 PostgreSQL database {long before my
> time here, currently running under PostgreSQL 10.x} and apparently back then
> it defaulted to creating SQL_ASCII encoded databases on Windows).
>
>> CREATE DATABASE tms_production
>>     WITH
>>     OWNER = local_user
>>     ENCODING = 'SQL_ASCII'
>>     LC_COLLATE = 'English_United States.1252'
>>     LC_CTYPE = 'English_United States.1252'
>>     TABLESPACE = pg_default
>>     CONNECTION LIMIT = -1;
>> ALTER DATABASE tms_production
>>     SET default_transaction_read_only TO off;
>> ALTER DATABASE tms_production
>>     SET client_encoding TO sql_ascii;
>> ALTER DATABASE tms_production
>>     SET standard_conforming_strings TO off;

Thanks!


> On Tue, Jan 8, 2019 at 8:37 AM Dave Page <dpage@pgadmin.org> wrote:
>>
>> Hi Rik
>>
>> On Tue, Jan 8, 2019 at 6:53 PM richard coleman
>> <rcoleman.ascentgl@gmail.com> wrote:
>> >
>> > Dave,
>> >
>> > Thanks for continuing this discussion, but I think you misunderstand the
>> > situation. I am storing valid non-UTF8 data in a SQL_ASCII encoded
>> > PostgreSQL database (please re-read what I had previously written). This
>> > is why psql has NO problem dealing with it. This is also why Windows ODBC
>> > and .NET applications have NO problem dealing with it. In fact the most
>> > common character that pgAdmin4 crashes on is the Windows smart quote. So
>> > to reiterate, I am using valid non-UTF8 characters in a SQL_ASCII
>> > database. This is a supported configuration for PostgreSQL. The issue
>> > seems to be that pgAdmin4 is assuming UTF8 data and
>> > crashing/failing/throwing errors when it encounters invalid UTF8
>> > characters.
>> >
>> > I hope I have made the situation a little bit clearer.
>>
>> Well psql is failing to deal with it *in this case*, as that's what is
>> doing the \copy in the import/export tool.
>>
>> In other cases (i.e. the ones where pgAdmin sees the data, such as
>> results in the query tool), the issue arises because Python and/or
>> Javascript (and by extension pgAdmin) may barf on data encoded in a
>> way they don't recognise. That's why the PostgreSQL docs say to only
>> use ASCII data in SQL_ASCII databases - the behaviour is undefined,
>> and as a result may either not render properly or may crash or error
>> on non-ASCII data.
>>
>> Anyhoo, I expect to have a little time after dinner shortly so I'll
>> try out the workaround I thought of earlier to see if it helps (I
>> doubt it'll be a panacea, but it may help in some cases).
>>
>> By any chance do you have a test case you can share with me that
>> refuses to export from pgAdmin (using the Import/Export tool)? If so,
>> I'd appreciate a copy of it to play with.
>>
>> --
>> Dave Page
>> Blog: http://pgsnake.blogspot.com
>> Twitter: @pgsnake
>>
>> EnterpriseDB UK: http://www.enterprisedb.com
>> The Enterprise PostgreSQL Company



--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

