Re: Fwd: Re: Re: Postgresql 9.4.4 - ERROR: invalid byte sequence for encoding "UTF8": 0x92

Поиск
Список
Период
Сортировка
1. If postgresql always does this I am not sure how it is working in 9.1.

2. I am able to view the content in pgadmin, why would the server not complain when viewing from pgadmin. Not sure what the client encoding would be when using pgadmin.

3. Thought SQL_ASCII means the text is stored as ASCII format, may be I am way off base on this. Is there a way to check the notes column for invalid UTF8 characters and remove them?

4. Based on the error there is invalid data in database so the database should have allowed this invalid data to come in some how right?

Any suggestions as to how to get this working in 9.4? Right now the database is in 9.1 and working fine so may be there is a way to fix these issues before doing a dump for 9.4?

Thanks,
Prasanth

On 8/8/2015 2:40 PM, Prasanth Reddy wrote:
> 1. Is the content vetting new in 9.4?

no, PostgreSQL has always been strict about content types.

> 2. Is there an option to ask the driver not to vet the content? Some times when the users copy notes from a word document there will be some characters invalid in UTF8. Most of these should only be in
> notes columns.

the driver isn't doing this, its happening in the postgres
database.       you should fix invalid input at the source application
before sending to the database.

> 3. Is there a way to check the content and fix it/remove invalid characters?

define 'invalid character', then fix it.    this SQL_ASCII field that
contains 0x92, what character code is that supposed to represent ?

> 4. Is there an option to force the driver to not update database with invalid characters?

the database (NOT the driver) does that already.  if you try and insert
invalid data, you get an error and have to rollback the transaction.


--
john r pierce, recycling bits in santa cruz

On 08/08/2015 04:40 PM, Prasanth Reddy wrote:



-------- Forwarded Message --------
Subject: Re: Re: Postgresql 9.4.4 - ERROR: invalid byte sequence for encoding "UTF8": 0x92
Date: Sat, 08 Aug 2015 16:39:17 -0500
From: Prasanth Reddy <dbadmin@nqadmin.com>
Reply-To: dbadmin@nqadmin.com
To: Bosco Rama <postgres@boscorama.com>


1. Is the content vetting new in 9.4?
2. Is there an option to ask the driver not to vet the content? Some times when the users copy notes from a word document there will be some characters invalid in UTF8. Most of these should only be in
notes columns.
3. Is there a way to check the content and fix it/remove invalid characters?
4. Is there an option to force the driver to not update database with invalid characters?

Appreciate your help.

Thanks,
Prasanth

------------------Original Message--------------------------

On 08/08/15 10:49, Prasanth Reddy wrote:
>
[snip]
> 12:42:02.505 (1)  <=BE ParameterStatus(client_encoding = UTF8)
[snip]
> 12:42:02.506 (1)  <=BE ParameterStatus(server_encoding = SQL_ASCII)

There's your problem.  The client is expecting UTF8 while the server
is storing un-vetted SQL_ASCII.  You will need to change one of the
components' choice of encoding and/or enforce content conformance in the
data.  Another choice, in cases of known and expected non-compliance, is
to use bytea casting to bypass the automated content vetting.  But your
choice in this case is going to be driven by the ease with which you can
alter your application vs the server-side changes that would be needed.

HTH,
Bosco.




В списке pgsql-jdbc по дате отправления:

Предыдущее
От: John R Pierce
Дата:
Сообщение: Re: Fwd: Re: Re: Postgresql 9.4.4 - ERROR: invalid byte sequence for encoding "UTF8": 0x92
Следующее
От: Mark Rotteveel
Дата:
Сообщение: Re: Re: Fwd: Re: Re: Postgresql 9.4.4 - ERROR: invalid byte sequence for encoding "UTF8": 0x92