Thread: Re: [pgsql-hackers-win32] UNICODE/UTF-8 on win32

Re: [pgsql-hackers-win32] UNICODE/UTF-8 on win32

From
"Magnus Hagander"
Date:
>I do understand the problem, but don't understand the decision you
>guys made. The fact that UPPER/LOWER and some other functions do not
>work in win32 is surely a problem for some languages, but not a
>problem for others. For example, Japanese (and probably Chinese and
>Korean) does not have a concept of upper/lower case. So the fact that
>UPPER/LOWER does not work with UTF-8/win32 is not a problem for
>Japanese (and for some other languages). Just using the C locale with
>UTF-8 is enough in this case.

The main issue is not with upper/lower, it's with ORDER BY (and doesn't
that affect indexes as well?). This affects Japanese as well, no?

I didn't consider the C locale. Do you know for a fact that it works
there on win32 as well, or is that an assumption? (I don't know either
way)


>In summary, I think you guys are going to overkill the multibyte
>support functionality on UTF-8/win32 because of the fact that some
>languages do not work.

I was under the impression that *no* languages worked. If some do work,
then we definitely should not kill it.

It would be good to have some way of detecting if it worked or not at
the time of creation of the database. But I have no idea on how to do
that in a reasonable way.


//Magnus

Re: [pgsql-hackers-win32] UNICODE/UTF-8 on win32

From
Tom Lane
Date:
"Magnus Hagander" <mha@sollentuna.net> writes:
> I didn't consider the C locale. Do you know for a fact that it works
> there on win32 as well, or is that an assumption?

It should work.  The only use of strcoll() in the backend is in
varstr_cmp which uses strncmp() instead for C locale.  Lack of
working upper/lower is hardly a fatal objection, considering that
we never had that for UTF8 before 8.0 anyway.  But you do have to
have working varstr_cmp.
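
To make the dispatch described above concrete, here is a minimal sketch in Python. It is only a model of the logic as stated in this message, not the backend's C code: the name varstr_cmp_model and the lc_collate_is_c flag are hypothetical, and Python's bytes comparison stands in for the strncmp() path.

```python
import locale


def varstr_cmp_model(a: str, b: str, lc_collate_is_c: bool) -> int:
    """Hypothetical model of the comparison dispatch; returns <0, 0 or >0.

    Under the C locale the strings are compared byte by byte (the
    strncmp() path in the message above). Otherwise the comparison is
    delegated to the C library's strcoll(), which is the part reported
    not to work with UTF-8 on win32.
    """
    if lc_collate_is_c:
        ab = a.encode("utf-8")
        bb = b.encode("utf-8")
        # bytes compare lexicographically by unsigned byte value;
        # a shorter string that is a prefix of the other sorts first.
        return (ab > bb) - (ab < bb)
    # Locale-aware path: relies on the platform's strcoll().
    return locale.strcoll(a, b)
```

With lc_collate_is_c set, the platform's collation support is never consulted, which is why the C locale is expected to be safe here.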

> It would be good to have some way of detecting if it worked or not at
> the time of creation of the database. But I have no idea on how to do
> that in a reasonable way.

At this point I'd say that any combination of UTF8 encoding with a non
C/POSIX locale probably isn't going to work on Windows.  Tatsuo, do you
know of other cases that will work?
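
As a rough illustration of the rule of thumb above, a creation-time check could look like the following Python sketch. Everything here is an assumption made for illustration: the function name, its parameters, and the treatment of UNICODE as another name for UTF-8 (as the subject line suggests) are hypothetical, and the locale names are only example strings. It encodes the rule as stated in this thread, nothing more.

```python
def combination_expected_to_work(encoding: str, lc_collate: str,
                                 on_windows: bool) -> bool:
    """Hypothetical check: False for UTF-8 with a non-C/POSIX locale on Windows."""
    # Normalize spellings such as "UTF-8", "utf8", "UTF_8".
    enc = encoding.upper().replace("-", "").replace("_", "")
    is_utf8 = enc in ("UTF8", "UNICODE")
    is_c_locale = lc_collate in ("C", "POSIX")
    return not (on_windows and is_utf8 and not is_c_locale)


# Example calls (locale names are illustrative only).
print(combination_expected_to_work("UNICODE", "C", True))
print(combination_expected_to_work("UNICODE", "Swedish_Sweden.1252", True))
```

Such a check only rejects the combination said not to work; it does not prove that an accepted combination actually collates correctly.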

            regards, tom lane

Re: [pgsql-hackers-win32] UNICODE/UTF-8 on win32

From
Tatsuo Ishii
Date:
> "Magnus Hagander" <mha@sollentuna.net> writes:
> > I didn't consider the C locale. Do you know for a fact that it works
> > there on win32 as well, or is that an assumption?
>
> It should work.  The only use of strcoll() in the backend is in
> varstr_cmp which uses strncmp() instead for C locale.  Lack of
> working upper/lower is hardly a fatal objection, considering that
> we never had that for UTF8 before 8.0 anyway.  But you do have to
> have working varstr_cmp.
>
> > It would be good to have some way of detecting if it worked or not at
> > the time of creation of the database. But I have no idea on how to do
> > that in a reasonable way.
>
> At this point I'd say that any combination of UTF8 encoding with a non
> C/POSIX locale probably isn't going to work on Windows.  Tatsuo, do you
> know of other cases that will work?

No. I think C is the only working locale.
--
Tatsuo Ishii

Re: [pgsql-hackers-win32] UNICODE/UTF-8 on win32

From
Tatsuo Ishii
Date:
> >I do understand the problem, but don't understand the decision you
> >guys made. The fact that UPPER/LOWER and some other functions do not
> >work in win32 is surely a problem for some languages, but not a
> >problem for others. For example, Japanese (and probably Chinese and
> >Korean) does not have a concept of upper/lower case. So the fact that
> >UPPER/LOWER does not work with UTF-8/win32 is not a problem for
> >Japanese (and for some other languages). Just using the C locale with
> >UTF-8 is enough in this case.
>
> The main issue is not with upper/lower, it's with ORDER BY (and doesn't
> that affect indexes as well?). This affects Japanese as well, no?

As long as the C locale is used, indexes should be OK. ORDER BY is not
perfect, but we can live with it. Since Japanese is written with
ideograms, we cannot rely on ordering by character codes to sort
Japanese text anyway. I believe the same can be said of Chinese.
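
To show what ordering by character codes means in practice, here is a small Python sketch. It assumes, as discussed in this thread, that the C locale compares the raw UTF-8 bytes; the function name is hypothetical. Because UTF-8 preserves code point order, sorting by bytes gives the same result as sorting by code point.

```python
def sort_like_c_locale(words):
    """Hypothetical helper: sort strings by their UTF-8 bytes."""
    return sorted(words, key=lambda s: s.encode("utf-8"))


words = ["b", "a", "B", "漢", "あ"]
by_bytes = sort_like_c_locale(words)
# Python compares str values by code point, so this is the
# code-point ordering of the same list.
by_codepoint = sorted(words)
print(by_bytes == by_codepoint)
print(by_bytes)
```

The result is stable and consistent, which is what indexes need, but it places "B" before "a" and orders kanji by code point rather than by reading, which is the imperfection mentioned above.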

> I didn't consider the C locale. Do you know for a fact that it works
> there on win32 as well, or is that an assumption? (I don't know either
> way)

I have not tested 8.0 on win32, but I think it should work with C
locale since I know PowerGres, which is based on 7.4, works.

> >In summary, I think you guys are going to overkill the multibyte
> >support functionality on UTF-8/win32 because of the fact that some
> >languages do not work.
>
> I was under the impression that *no* languages worked. If some do work,
> then we definitely should not kill it.
>
> It would be good to have some way of detecting if it worked or not at
> the time of creation of the database. But I have no idea on how to do
> that in a reasonable way.
--
Tatsuo Ishii