Re: Import/Export failure due to UTF-8 error in pgAdmin4 but not in pgAdmin3

From: richard coleman
Subject: Re: Import/Export failure due to UTF-8 error in pgAdmin4 but not in pgAdmin3
Date:
Msg-id CAGA3vBvG2_W2GDKPRjuXeHhcvf1Ct-R-Zrf=YTDv3Y+qGDZqoQ@mail.gmail.com
In reply to: Re: Import/Export failure due to UTF-8 error in pgAdmin4 but not in pgAdmin3  (Dave Page <dpage@pgadmin.org>)
Responses: Re: Import/Export failure due to UTF-8 error in pgAdmin4 but not in pgAdmin3  (Dave Page <dpage@pgadmin.org>)
Re: Import/Export failure due to UTF-8 error in pgAdmin4 but not in pgAdmin3  (Doug Easterbrook <doug@artsman.com>)
List: pgadmin-support
Dave, 

Thanks for continuing this discussion, but I think you misunderstand the situation. I am storing valid non-UTF8 data in a SQL_ASCII-encoded PostgreSQL database (please re-read what I previously wrote). This is why psql has NO problem dealing with it, and why Windows ODBC and .Net applications have NO problem dealing with it either. In fact, the character that pgAdmin4 most commonly crashes on is the Windows smart quote. So, to reiterate: I am using valid non-UTF8 characters in a SQL_ASCII database, which is a supported configuration for PostgreSQL. The issue seems to be that pgAdmin4 assumes UTF8 data and crashes, fails, or throws errors when it encounters bytes that are not valid UTF8.
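
To illustrate the failure mode described above, here is a minimal Python sketch. It is not pgAdmin4 code. It assumes the smart quotes were written by a Windows-1252 client, and the helper names `is_valid_utf8` and `decode_with_fallback` are hypothetical. It shows only that the same bytes are rejected by a strict UTF-8 decoder and accepted once the legacy encoding is named.

```python
# Bytes as a SQL_ASCII database might hand them back: the server stores them
# unmodified and performs no validation or conversion.
# 0x93 and 0x94 are the curly double quotes in Windows-1252 (an assumption
# about which client encoding produced the data).
raw = b"\x93quoted\x94"


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes form a well-formed UTF-8 sequence."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def decode_with_fallback(data: bytes, fallback: str = "cp1252") -> str:
    """Try strict UTF-8 first, then a caller-supplied legacy encoding.

    Hypothetical helper for illustration; the fallback encoding must be
    known out of band, because SQL_ASCII records no encoding at all.
    """
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return data.decode(fallback)


# Strict UTF-8 rejects the lone 0x93 byte.
print(is_valid_utf8(raw))
# ascii() keeps the output printable on any console.
print(ascii(decode_with_fallback(raw)))
```

The fallback only works when the original client encoding is known. A check like `is_valid_utf8` could also serve for the periodic scan for offending bytes mentioned further down the thread.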

I hope I have made the situation a little bit clearer.

Thanks again, 

rik. 

On Tue, Jan 8, 2019 at 12:29 AM Dave Page <dpage@pgadmin.org> wrote:
Hi

On Tue, Jan 8, 2019 at 12:47 AM richard coleman
<rcoleman.ascentgl@gmail.com> wrote:
>
> Dave,
>
> Thanks for taking the time to respond, but I don't see anywhere that using SQL_ASCII is recommended against. Here's the documentation listing the supported encoding schemes: https://www.postgresql.org/docs/current/multibyte.html .
>
> The only caveats listed for SQL_ASCII are:
>>
>> In most cases, if you are working with any non-ASCII data, it is unwise to use the SQL_ASCII setting because PostgreSQL will be unable to help you by converting or validating non-ASCII characters.

You highlighted it below: "If the client character set is defined as
SQL_ASCII, encoding conversion is disabled, regardless of the server's
character set. Just as for the server, use of SQL_ASCII is unwise
unless you are working with all-ASCII data"

You're using UTF-8 data, not ASCII, which it says is unwise because
conversion won't take place (and consequently, neither will
validation). I don't see how one could read that and not take it as

You are running into exactly that problem, and it's visible when
working with technologies that are strict about following encoding
rules - in this case, psql when pgAdmin shells out to it.

I did think of one possible quick fix this morning which I'll look
into, but as I noted before, it's a workaround, and the real problem
is storing un-validated UTF-8 data in a SQL_ASCII database.

> Or, a reminder that postgreSQL can't help with any conversions you might want to do.
>
> Then there's this:
>>
>> PostgreSQL will allow superusers to create databases with SQL_ASCII encoding even when LC_CTYPE is not C or POSIX. As noted above, SQL_ASCII does not enforce that the data stored in the database has any particular encoding, and so this choice poses risks of locale-dependent misbehavior. Using this combination of settings is deprecated and may someday be forbidden altogether.
>
>
> A note that you can currently choose incompatible settings, but probably can't in the future.
>
> And finally there's this bit of advice:
>>
>> If the client character set is defined as SQL_ASCII, encoding conversion is disabled, regardless of the server's character set. Just as for the server, use of SQL_ASCII is unwise unless you are working with all-ASCII data[emphasis mine].
>
>
> Which is just a reiteration of the first caveat, that if you are using SQL_ASCII the database won't perform any conversions on your behalf.
>
> That is hardly a recommendation against using that supported encoding scheme. The fact that the psql command prompt, among others, works with it without issue is an indication that the problem lies in pgAdmin4 (and, I would guess, in Python's reliance on UTF8) rather than in the database itself. pgAdmin4 needs to check for, and more gracefully handle, valid PostgreSQL data that happens not to be UTF8 compliant.
>
> Until then, I will have to periodically scan for and clean bad UTF8 data to keep pgAdmin4 (and other JDBC-dependent code) happy. The legacy enterprise .Net applications that depend on the database prohibit converting it to UTF8 (or anything else, for that matter).
>
> Just my $0.02,
>
> rik.
>
>
> On Mon, Jan 7, 2019 at 1:27 PM Dave Page <dpage@pgadmin.org> wrote:
>>
>> Hi
>>
>> On Mon, Jan 7, 2019 at 11:30 PM richard coleman
>> <rcoleman.ascentgl@gmail.com> wrote:
>> >
>> > Dave,
>> >
>> > I can't speak to Nania's specific issue, but I believe it's a pgAdmin4-specific problem, at least insofar as SQL_ASCII is concerned. I say this because I can usually work with the data just fine from the psql prompt, but not through pgAdmin4 (or other PostgreSQL GUIs like DBeaver that rely on the JDBC connection). With .Net/Windows ODBC drivers and the psql command prompt, no problem (the same was true of pgAdmin3, assuming you didn't do much with it beyond select/update/insert). With pgAdmin4, SELECT, export, etc.: BOOM! At least until you clean up the offending bytes.
>> >
>> > Just my $0.02.
>>
>> I'm afraid the fundamental problem is that you're using PostgreSQL in
>> a way that the docs specifically recommend against doing, and you're
>> seeing the reason why.
>>
>> pgAdmin 3 and 4 are completely different. In the import/export utility
>> that Nania reported the issue in, pgAdmin doesn't look at the data *at
>> all*. It simply executes \copy in psql, which does all the work. All
>> pgAdmin does is provide connection info and options to psql, based on
>> the selections made in the import/export dialogue, and executes it.
>>
>> In other areas of pgAdmin, like the query tool, it is possible to see
>> similar issues with the same underlying cause, though we've spent a
>> significant amount of time trying to work around all the possible edge
>> cases.
>>
>> pgAdmin 3 implemented import/export itself, using underlying libraries
>> that were far less strict about encoding rules than Python is. That
>> may have been more convenient for this particular issue, but it's a
>> lot worse in many others.
>>
>> As a general thought (and do bear in mind, we've spent significant
>> time and resources on these issues in the past), I'd far rather spend
>> time on new features and actual bugs, than further time on workarounds
>> for things the PostgreSQL docs specifically advise against doing.
>>
>> --
>> Dave Page
>> Blog: http://pgsnake.blogspot.com
>> Twitter: @pgsnake
>>
>> EnterpriseDB UK: http://www.enterprisedb.com
>> The Enterprise PostgreSQL Company

