Re: locale
| From | Dennis Bjorklund |
|---|---|
| Subject | Re: locale |
| Date | |
| Msg-id | Pine.LNX.4.44.0404081729510.4551-100000@zigo.dhs.org |
| In reply to | Re: locale (Tom Lane <tgl@sss.pgh.pa.us>) |
| Responses | Re: locale |
| List | pgsql-hackers |
On Thu, 8 Apr 2004, Tom Lane wrote:

> No, the ordering *will* be the same as it was before, because strcoll()
> is still functioning the same. You'd get the same answer from a sort
> operation since it depends on the same operators.
>
> It interprets them according to LC_CTYPE, which does not change.

I'm afraid that I don't understand you yet, and would like to have it explained in more detail if possible. I feel a bit stupid for not understanding what you are stating, but I'm sure there are more than me who feel like that :-)

Maybe we can look at an example. Let us take some utf-8 strings correctly ordered in Swedish:

  Åke
  Ära

Now, since these are utf-8 they are encoded as

  c3 85 6b 65 (Åke)
  c3 84 72 61 (Ära)

and that is the order they have in the index. Now, this index is copied into a new database where the encoding is Latin1. Now we want to look up, in the above table, the string that in Latin1 is represented as

  c3 84 72 61

So we look in the index and see that the first row in the index is not the same. But now, when we compare these strings as latin1 strings, it's no longer the case that c3 84 72 61 > c3 85 6b 65. As latin1 strings we compare each character: c3 = c3, and then 84 < 85 (in latin1 84 and 85 are some control characters). So, we will not find this string in the index since we think it should have been before the first entry. We might even insert a new copy of this string in another position in the index.

So, my question is:

a) What have we gained by copying this table into the latin1 database? It looks broken to me. As far as I understand we have to rebuild the index to get something that works at least a little.

b) Maybe one should not just reindex but reencode. In some cases that works and produces a good result, for example from latin1 to utf-8.

c) If we are going to reindex anyway, then why not do that and solve the per database locale also? This is an independent point from a) and b); I still want to understand the first two points even if we don't talk about per database locale.

-- 
/Dennis Björklund
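
The failed lookup described in the message can be reproduced with a minimal Python sketch. It is an illustration only, not PostgreSQL's index code. It assumes that the comparison in the Latin1 database behaves like a plain byte-by-byte comparison, as in the message's example (a real strcoll() under a Latin1 locale may order bytes differently). The names `index` and `lookup` are hypothetical.

```python
import bisect

# Index entries in the order the message gives (Swedish: Å sorts before Ä),
# stored as UTF-8 bytes. Escapes are used so the source file encoding is irrelevant.
ake = "\u00c5ke".encode("utf-8")   # Åke -> c3 85 6b 65
ara = "\u00c4ra".encode("utf-8")   # Ära -> c3 84 72 61
index = [ake, ara]

def lookup(sorted_keys, key):
    """Binary search that ASSUMES sorted_keys is ordered by byte comparison."""
    pos = bisect.bisect_left(sorted_keys, key)
    found = pos < len(sorted_keys) and sorted_keys[pos] == key
    return found, pos

# Compared byte by byte, the second entry sorts before the first (84 < 85),
# so the index is no longer ordered under the comparison used to search it.
print(ara < ake)            # byte-wise comparison of the two entries
print(lookup(index, ara))   # (found?, position the search settles on)

# Point b): re-encoding from latin1 to utf-8 is a lossless byte transformation.
ara_latin1 = "\u00c4ra".encode("latin-1")            # c4 72 61
reencoded = ara_latin1.decode("latin-1").encode("utf-8")
print(reencoded == ara)
```

Under that assumption the search settles on position 0, does not find the key even though it is stored at position 1, and a subsequent insert would place a second copy at position 0.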