Re: different sort order in windows and linux version
From | Tomi NA |
---|---|
Subject | Re: different sort order in windows and linux version |
Date | |
Msg-id | d487eb8e0606301029k7a217d41p45e269d23ad200f2@mail.gmail.com |
In reply to | Re: different sort order in windows and linux version (Martijn van Oosterhout <kleptog@svana.org>) |
Replies |
Re: different sort order in windows and linux version
Re: different sort order in windows and linux version |
List | pgsql-general |
On 6/30/06, Martijn van Oosterhout <kleptog@svana.org> wrote:
> On Fri, Jun 30, 2006 at 11:56:19AM +0200, Dragan Matic wrote:
> > I have two postgres servers, one on linux (fedora core 5), one on
> > windows, both are version 8.1.4.
> >
> > Both databases are initialized with locale Croatian and win1250 encoding.
> >
> > running pg_controldata on windows returns this
> >
> > LC_COLLATE: Croatian_Croatia.1250
> > LC_CTYPE: Croatian_Croatia.1250
> >
> > the same command on linux returns this
> >
> > LC_COLLATE: hr_HR
> > LC_CTYPE: hr_HR
> >
> > which is the same, I suppose.
>
> Well, apparently not. Postgres makes no attempt to understand
> collations nor try to determine whether they make sense. If you want to
> have the same collation on Windows and Linux, I think you're going to
> have trouble.

Croatian_Croatia and hr_HR are, in fact, the same in that there is no other collation for the Croatian language. What's more, Dragan ran the test using characters which are encoded exactly the same in cp1250, utf8, iso8859-2, hell, probably even us-ascii. The fact remains that different OSes collate differently, even for the same locale.

In C++, people use things like GTK, wxWidgets and GCL so that they can think about "C++ code" instead of the platform they're coding on. In Java, people use things like File.separator instead of "\" or "/" so that they can think about "Java code". There are dozens of examples like these, and most of the exceptions stem from the influence of whoever held the monopoly at the time.

When you code in the RDBMS environment, you want to code in terms of pgsql or Oracle or MySQL or whatever: you don't want to program for Oracle on Solaris vs. Oracle on Linux vs. Oracle on Plan9 or...well, you get the idea.
Not being able to depend on the engine to consistently collate strings as simple as the ones Dragan listed is closer to a serious bug (non-deterministic behaviour in otherwise deterministic functions) than to an RFE, and it is certainly nowhere near the "it's not our problem" it is regularly made out to be. The OSes simply and obviously don't do a good enough job of it.

> In the past there have existed patches to allow postgres to use ICU for
> locale support. It's supposedly not quite as fast, but you will be able
> to get consistent results across platforms.

Personally, I'd be perfectly happy with pgsql if I could choose to make text operations up to 2-3x slower and, in exchange, stop fussing over how they are going to work on a given platform in each pgsql version. Furthermore, compiling the server myself is not an option for live usage: on my current project, I'm not even the one installing the database servers...sending administrators a binary I configured and compiled (on Windows, in this case!) and no one but me tested...brrrr...I get the shivers just thinking about it.

If I sound harsh, please excuse me, but I feel like I'm the only one who thinks these encoding problems (collation, upper/lowercase, multiple languages in a single database) are serious...nobody seems to share the sentiment. Ah well...

t.n.a.
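
For readers who want to see the effect on their own machines, here is a minimal Python sketch of the point made above: sorting through the OS C library depends on what that OS ships for a locale, while a plain code-point sort does not. The sample words and the helper names `sort_by_code_point` and `sort_with_os_locale` are made up for illustration (Dragan's actual test strings are not reproduced in this message), and whether `hr_HR` or `Croatian_Croatia.1250` is available at all depends on the system.

```python
import locale


def sort_by_code_point(words):
    """Sort by Unicode code point; gives the same result on every OS."""
    return sorted(words)


def sort_with_os_locale(words, locale_name):
    """Sort using the OS C library's collation rules for locale_name.

    Returns None when the OS does not provide that locale.
    (Hypothetical helper, written only for this illustration.)
    """
    previous = locale.setlocale(locale.LC_COLLATE)  # query current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        return None
    try:
        return sorted(words, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, previous)  # restore


if __name__ == "__main__":
    words = ["b", "A", "a", "B"]  # made-up sample data
    print("code point:", sort_by_code_point(words))
    # The two names below are the ones pg_controldata reported in the thread.
    for name in ("hr_HR", "Croatian_Croatia.1250"):
        print(name, "->", sort_with_os_locale(words, name))
```

Running the same script on Linux and on Windows and comparing the locale-based lines is one way to check whether the two platforms agree for a given set of strings; the code-point line should not change between them.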