Discussion: Backup messages displayed with wrong encoding
Hello,

I am using pgAdmin3 1.14.1 on Windows 2008 R2 (Russian locale, SBCS Win1251) with PostgreSQL 9.1.2.
I have a database with UTF-8 encoding and objects whose names contain non-ASCII (Russian) characters, and I get unreadable object names when performing a backup in pgAdmin3.

I think this happens because pgAdmin assumes that pg_dump always streams its output in the current locale encoding. That is not the case, because the output can be in the database encoding (the default) or in the encoding specified explicitly in frmBackup.

Thanks,
Alexander
On Mon, 2011-12-12 at 09:49 +0400, Alexander LAW wrote:
> Hello,
>
> I am using pgAdmin3 1.14.1 on Windows 2008 R2 (Russian locale, SBCS
> Win1251) with Postgresql 9.1.2.
> Having database with UTF-8 encoding and objects, those names contain
> non-ASCII (Russian) characters, I get non-readable object names when
> performing backup in pgAdmin3.
>
You mean when you restore it? pgAdmin is UTF-8 only, but it accepts other encodings for the dump.

> I think it's happened because pgAdmin assumes that pg_dump always
> streams output using current locale encoding. But it's not the case,
> cause it can be a database encoding (by default) or the encoding
> specified explicitly in frmBackup.
>
pgAdmin doesn't assume anything. It simply launches pg_dump with the options you set in the frmBackup dialog. Right now, I cannot say where the issue is. I would need more info to guess that.

--
Guillaume
http://blog.guillaume.lelarge.info
http://www.dalibo.com
Hi,

To make it clear, I am posting two screenshots.
ss_backup_win1251 shows a valid table name (which is "Test" in Russian), but in ss_backup_utf8 you can see the name in the wrong encoding.

When I said "pgAdmin assumes", I meant that it converts the pg_dump output stream to a string as if it were ANSI-encoded, which is not always the case.
In fact, the opposite is common on Windows with a Russian locale (and non-ASCII object names), because UTF-8 is the default encoding for a database while the locale encoding (SBCS) is Win1251. So when you do a backup with the default encoding, you get an unreadable log.

Best regards,
Alexander

13.12.2011 00:13, Guillaume Lelarge wrote:
> On Mon, 2011-12-12 at 09:49 +0400, Alexander LAW wrote:
>> Hello,
>>
>> I am using pgAdmin3 1.14.1 on Windows 2008 R2 (Russian locale, SBCS
>> Win1251) with Postgresql 9.1.2.
>> Having database with UTF-8 encoding and objects, those names contain
>> non-ASCII (Russian) characters, I get non-readable object names when
>> performing backup in pgAdmin3.
>>
> You mean when you restore it? pgAdmin is UTF-8 only but it accepts to
> use other encodings to do the dump.
>
>> I think it's happened because pgAdmin assumes that pg_dump always
>> streams output using current locale encoding. But it's not the case,
>> cause it can be a database encoding (by default) or the encoding
>> specified explicitly in frmBackup.
>>
> pgAdmin doesn't assume anything. It simply launches pg_dump with the
> options you set in the frmBackup dialog. Righ now, I cannot say wher the
> issue is. I would need more info to guess that.
>
>
On Tue, 2011-12-13 at 08:15 +0400, Alexander LAW wrote:
> Hi,
> To make it clear I am posting two screenshots.
> ss_backup_win1251 shows valid table name (which is "Test" in Russian),
> but in ss_backup_utf8 you can see the name with wrong encoding.
>
> When I said "pgAdmin assumes", I meant that it converts pg_admin output
> stream to string as ANSI-encoded, but it's not always the case.
> In fact, the opposite is common on Windows with Russian locale (and
> non-ASCII object names), cause UTF-8 is a default encoding for a
> database, but locale encoding (SBCS) is Win1251, and when you do backup
> with a default encoding, you get an unreadable log.
>
pgAdmin simply displays what pg_dump gives it. If the output is in the right encoding, you'll see your table names displayed correctly. I'm not sure it would be a good idea to grab every line and convert it to whatever encoding pgAdmin would like, if that is possible at all.

--
Guillaume
http://blog.guillaume.lelarge.info
http://www.dalibo.com
Hello,
I don't think that the Win1251 encoding is more "right" than UTF-8. IMHO, a
program should either detect what encoding the text is in or let me choose
the encoding used to read it. If you consider this behavior a problem
(under the conditions described), you could solve it by providing a
combobox in the Messages tab that lets me choose the encoding of the
log. But I would choose there the same encoding that I already chose in
the File Options tab, or the encoding of the database. So pgAdmin already
knows which encoding to use when reading the pg_dump output stream.
As for the need for and the possibility of conversion, I believe that
every line of the log is converted anyway. Please look at
sysProcess::ReadStream, where strings are read from the input and
appended to txtMessages:
str.Append(wxString::Format(wxT("%s"), wxString(buffer,
wxConvLibc).c_str()));
As I understand it, wxConvLibc here specifies that the input strings are
always in the OS locale encoding, but IMO this should depend on the backup
encoding.
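
To make the mismatch concrete, here is a minimal standalone sketch. It is not pgAdmin or wxWidgets code: the names LogEncoding, decodeWin1251, decodeUtf8 and decodeLogLine are hypothetical, and the Win1251 decoder is deliberately incomplete (it maps only ASCII and the Russian letters). It only illustrates the idea of choosing the decoder from the backup encoding rather than always assuming the OS locale encoding.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// All names below are hypothetical; none of this is pgAdmin or wxWidgets code.
enum class LogEncoding { Utf8, Win1251 };

// Decode Windows-1251 bytes into Unicode code points.
// Simplification: only ASCII, the letters at 0xC0-0xFF (U+0410-U+044F) and
// the two "yo" letters are mapped; every other byte becomes U+FFFD.
std::vector<char32_t> decodeWin1251(const std::string &bytes) {
    std::vector<char32_t> out;
    for (unsigned char b : bytes) {
        if (b < 0x80)       out.push_back(b);
        else if (b >= 0xC0) out.push_back(0x0410 + (b - 0xC0));
        else if (b == 0xA8) out.push_back(0x0401);
        else if (b == 0xB8) out.push_back(0x0451);
        else                out.push_back(0xFFFD);
    }
    return out;
}

// Decode UTF-8 bytes into Unicode code points.
// Simplification: malformed input yields U+FFFD, and overlong forms or
// surrogates are not rejected.
std::vector<char32_t> decodeUtf8(const std::string &bytes) {
    std::vector<char32_t> out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::size_t extra = 0;  // number of continuation bytes expected
        char32_t cp = lead;     // code point being assembled
        bool ok = true;
        if (lead < 0x80)               { extra = 0; }
        else if ((lead >> 5) == 0x06)  { extra = 1; cp = lead & 0x1F; }
        else if ((lead >> 4) == 0x0E)  { extra = 2; cp = lead & 0x0F; }
        else if ((lead >> 3) == 0x1E)  { extra = 3; cp = lead & 0x07; }
        else                           { ok = false; }
        if (ok && bytes.size() - i <= extra) ok = false;  // truncated sequence
        for (std::size_t k = 1; ok && k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80) ok = false;
            else cp = (cp << 6) | (c & 0x3F);
        }
        if (ok) { out.push_back(cp); i += extra + 1; }
        else    { out.push_back(0xFFFD); ++i; }
    }
    return out;
}

// Pick the decoder from the encoding the backup was made with,
// instead of always assuming the OS locale encoding.
std::vector<char32_t> decodeLogLine(const std::string &bytes, LogEncoding enc) {
    return enc == LogEncoding::Utf8 ? decodeUtf8(bytes) : decodeWin1251(bytes);
}
```

With this sketch, the UTF-8 bytes of the four-letter table name (D0 A2 D0 B5 D1 81 D1 82) decode to four code points when LogEncoding::Utf8 is selected, but to eight unrelated code points when they are read as Win1251, which is the same kind of mismatch as the unreadable name in ss_backup_utf8.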
Best regards
14.12.2011 00:20, Guillaume Lelarge wrote:
> On Tue, 2011-12-13 at 08:15 +0400, Alexander LAW wrote:
>> Hi,
>> To make it clear I am posting two screenshots.
>> ss_backup_win1251 shows valid table name (which is "Test" in Russian),
>> but in ss_backup_utf8 you can see the name with wrong encoding.
>>
>> When I said "pgAdmin assumes", I meant that it converts pg_admin output
>> stream to string as ANSI-encoded, but it's not always the case.
>> In fact, the opposite is common on Windows with Russian locale (and
>> non-ASCII object names), cause UTF-8 is a default encoding for a
>> database, but locale encoding (SBCS) is Win1251, and when you do backup
>> with a default encoding, you get an unreadable log.
>>
> pgAdmin simply displays what pg_dump gives him. If it's in the right
> encoding, you'll see your tables' name correct. I'm not sure it would be
> a good idea to grab every line and to convert them in whatever encoding
> pgAdmin would like. If it's possible at all.
>
>
Hi,
I would like to clarify the issue with the encoding.
Now that we have a localized postgres (with Russian messages), I can see
that the encoding of the pg_dump output does not change depending on the
database setting or the pg_dump option.
It seems that pg_dump changes only the encoding of database object
names (see the screenshot).
So it looks like a pg_dump bug, not a feature.
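
As an illustration of why such mixed output is hard to display, here is a small standalone sketch. The function name isStructurallyValidUtf8 is hypothetical, and the sample message bytes used with it are made up, not actual pg_dump output. It checks whether a byte string is structurally valid UTF-8; a line that combines a Win1251 message with a UTF-8 object name fails the check as a whole, so no single decoder reads both parts correctly.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of pgAdmin or pg_dump.
// Returns true if every byte sequence has a valid UTF-8 lead byte followed
// by the right number of continuation bytes. Simplification: overlong
// forms and surrogate code points are not rejected.
bool isStructurallyValidUtf8(const std::string &bytes) {
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::size_t extra = 0;  // number of continuation bytes expected
        if (lead < 0x80)               extra = 0;
        else if ((lead >> 5) == 0x06)  extra = 1;
        else if ((lead >> 4) == 0x0E)  extra = 2;
        else if ((lead >> 3) == 0x1E)  extra = 3;
        else return false;                           // invalid lead byte
        if (bytes.size() - i <= extra) return false; // truncated sequence
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80) return false;    // not a continuation byte
        }
        i += extra + 1;
    }
    return true;
}
```

For example, the UTF-8 bytes of the table name alone pass the check, while the same name prefixed with a Russian word encoded in Win1251 does not.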
Thanks for your feedback.
Best regards,
Alexander
14.12.2011 09:12, Alexander LAW wrote:
> Hello,
> I don't think that Win1251 encoding is more right than UTF-8. IMHO, a
> program should understand what encoding is text in or let me choose
> the encoding to read the text. If you would consider this behavior as
> a problem (for the conditions described) you could solve it by
> providing a combobox in the Messages tab, that lets me choose the
> encoding of the log. But then I will choose there the same encoding as
> I did before in the File Options tab or the encoding of the database.
> So pgAdmin knows which encoding to use when reading the pg_admin
> output stream.
> And about the need and possibility of conversion, I believe that the
> every line of the log is converted anyway. Please look at
> sysProcess:ReadStream, there you have strings read from input and
> appended to txtMessages.
>
> str.Append(wxString::Format(wxT("%s"), wxString(buffer,
> wxConvLibc).c_str()));
>
> As I understand, wxConvLibc here specifies that the input strings
> always are in OS locale encoding, but IMO this should depend on the
> backup encoding.
>
> Best regards
>
> 14.12.2011 00:20, Guillaume Lelarge wrote:
>> On Tue, 2011-12-13 at 08:15 +0400, Alexander LAW wrote:
>>> Hi,
>>> To make it clear I am posting two screenshots.
>>> ss_backup_win1251 shows valid table name (which is "Test" in Russian),
>>> but in ss_backup_utf8 you can see the name with wrong encoding.
>>>
>>> When I said "pgAdmin assumes", I meant that it converts pg_admin output
>>> stream to string as ANSI-encoded, but it's not always the case.
>>> In fact, the opposite is common on Windows with Russian locale (and
>>> non-ASCII object names), cause UTF-8 is a default encoding for a
>>> database, but locale encoding (SBCS) is Win1251, and when you do backup
>>> with a default encoding, you get an unreadable log.
>>>
>> pgAdmin simply displays what pg_dump gives him. If it's in the right
>> encoding, you'll see your tables' name correct. I'm not sure it would be
>> a good idea to grab every line and to convert them in whatever encoding
>> pgAdmin would like. If it's possible at all.
>>
>>
>