Обсуждение: Order by behaviour
Hi,
We have a PostgreSQL 8.0.6 cluster configured with lc_collate=pt_BR.UTF-8 and when we run the following SELECT:
SELECT substr(nomerazao,1,4),
ascii(substr(nomerazao,1,1)),
ascii(substr(nomerazao,2,1))
from spunico.unico order by nomerazao;
SELECT substr(nomerazao,1,4),
ascii(substr(nomerazao,1,1)),
ascii(substr(nomerazao,2,1))
from spunico.unico order by nomerazao;
is returning:
substr | ascii | ascii
--------+-------+-------
| 32 | 0
| 32 | 0
1000 | 49 | 48
1.DI | 49 | 46
1° R | 49 | 176
2M C | 50 | 77
3A.G | 51 | 65
A. A | 65 | 46
AABA | 65 | 65
A.A. | 65 | 46
A.AG | 65 | 46
A.A. | 65 | 46
A.A. | 65 | 46
ABAS | 65 | 66
ABAS | 65 | 66
ABAT | 65 | 66
A.B. | 65 | 46
A.B. | 65 | 46
ABCC | 65 | 66
A.B. | 65 | 46
A.B. | 65 | 46
--------+-------+-------
| 32 | 0
| 32 | 0
1000 | 49 | 48
1.DI | 49 | 46
1° R | 49 | 176
2M C | 50 | 77
3A.G | 51 | 65
A. A | 65 | 46
AABA | 65 | 65
A.A. | 65 | 46
A.AG | 65 | 46
A.A. | 65 | 46
A.A. | 65 | 46
ABAS | 65 | 66
ABAS | 65 | 66
ABAT | 65 | 66
A.B. | 65 | 46
A.B. | 65 | 46
ABCC | 65 | 66
A.B. | 65 | 46
A.B. | 65 | 46
Are not the lines out of order or is it a normal behaviour for a server with lc_collate=pt_BR.UTF-8?
Thank you in advance!
Reimer
On 28/03/2007 22:52, Carlos H. Reimer wrote: > SELECT substr(nomerazao,1,4), > ascii(substr(nomerazao,1,1)), > ascii(substr(nomerazao,2,1)) > from spunico.unico order by nomerazao; You need to add aliases to the returned column by which you'd like to order the result: your query is ordering the rows according to the column "nomerazao" in the original table, rather than by the substr() value returned. Do something like this: SELECT substr(nomerazao,1,4) AS my_col, ascii(substr(nomerazao,1,1)), ascii(substr(nomerazao,2,1)) from spunico.unico order by my_col; HTH Ray. --------------------------------------------------------------- Raymond O'Donnell, Director of Music, Galway Cathedral, Ireland rod@iol.ie ---------------------------------------------------------------
Carlos H. Reimer wrote: > Are not the lines out of order No. > or is it a normal behaviour for a > server with lc_collate=pt_BR.UTF-8? Yes. -- Peter Eisentraut http://developer.postgresql.org/~petere/
On Wed, 28 Mar 2007, Carlos H. Reimer wrote: > Hi, > > We have a PostgreSQL 8.0.6 cluster configured with lc_collate=pt_BR.UTF-8 > and when we run the following SELECT: > SELECT substr(nomerazao,1,4), > ascii(substr(nomerazao,1,1)), > ascii(substr(nomerazao,2,1)) > from spunico.unico order by nomerazao; > > is returning: > > substr | ascii | ascii > --------+-------+------- > | 32 | 0 > | 32 | 0 > 1000 | 49 | 48 > 1.DI | 49 | 46 > 1� R | 49 | 176 > 2M C | 50 | 77 > 3A.G | 51 | 65 > A. A | 65 | 46 > AABA | 65 | 65 > A.A. | 65 | 46 > A.AG | 65 | 46 > A.A. | 65 | 46 > A.A. | 65 | 46 > ABAS | 65 | 66 > ABAS | 65 | 66 > ABAT | 65 | 66 > A.B. | 65 | 46 > A.B. | 65 | 46 > ABCC | 65 | 66 > A.B. | 65 | 46 > A.B. | 65 | 46 > > Are not the lines out of order or is it a normal behaviour for a server with > lc_collate=pt_BR.UTF-8? Many collations ignore spaces and symbols on the first pass, so, for example you might have "A Z" > "AB" despite the fact that a space has a lower value than a B.
Humm, ok, it is clear now. And is there a way to change something in this behaviour, like not ignore spaces and some type of symbols? A configuration file or a patch? Thanks in advance! > -----Mensagem original----- > De: pgsql-general-owner@postgresql.org > [mailto:pgsql-general-owner@postgresql.org]Em nome de Stephan Szabo > Enviada em: quarta-feira, 28 de março de 2007 19:23 > Para: Carlos H. Reimer > Cc: pgsql-general@postgresql.org > Assunto: Re: [GENERAL] Order by behaviour > > > On Wed, 28 Mar 2007, Carlos H. Reimer wrote: > > > Hi, > > > > We have a PostgreSQL 8.0.6 cluster configured with > lc_collate=pt_BR.UTF-8 > > and when we run the following SELECT: > > SELECT substr(nomerazao,1,4), > > ascii(substr(nomerazao,1,1)), > > ascii(substr(nomerazao,2,1)) > > from spunico.unico order by nomerazao; > > > > is returning: > > > > substr | ascii | ascii > > --------+-------+------- > > | 32 | 0 > > | 32 | 0 > > 1000 | 49 | 48 > > 1.DI | 49 | 46 > > 1° R | 49 | 176 > > 2M C | 50 | 77 > > 3A.G | 51 | 65 > > A. A | 65 | 46 > > AABA | 65 | 65 > > A.A. | 65 | 46 > > A.AG | 65 | 46 > > A.A. | 65 | 46 > > A.A. | 65 | 46 > > ABAS | 65 | 66 > > ABAS | 65 | 66 > > ABAT | 65 | 66 > > A.B. | 65 | 46 > > A.B. | 65 | 46 > > ABCC | 65 | 66 > > A.B. | 65 | 46 > > A.B. | 65 | 46 > > > > Are not the lines out of order or is it a normal behaviour for > a server with > > lc_collate=pt_BR.UTF-8? > > Many collations ignore spaces and symbols on the first pass, so, for > example you might have "A Z" > "AB" despite the fact that a space has a > lower value than a B. > > ---------------------------(end of broadcast)--------------------------- > TIP 1: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly > >
Carlos H. Reimer wrote: > Humm, ok, it is clear now. > > And is there a way to change something in this behaviour, like not ignore > spaces and some type of symbols? > > A configuration file or a patch? Pick a different locale? -- Richard Huxton Archonet Ltd
On Thu, 29 Mar 2007, Carlos H. Reimer wrote: > Humm, ok, it is clear now. > > And is there a way to change something in this behaviour, like not ignore > spaces and some type of symbols? Well, right now it's generally determined by your OS's definition of the locale you've chosen. You might be able to pick another locale which has different behavior if your system supports one or gives you a reasonable way to create one, or as the last ditch attempt, locale "C" for bytewise ordering. However, I think you'll need to recreate the database if you change the locale.
Hi, I was trying to find the docs about the collating sequence standards but could not find. Would like to know for example which characters are ignored by the "order by" in some of the collating types. Please, can anyone indicate me where could I find documentation about these standards? Thanks in advance! > -----Mensagem original----- > De: Stephan Szabo [mailto:sszabo@megazone.bigpanda.com] > Enviada em: quarta-feira, 28 de março de 2007 19:23 > Para: Carlos H. Reimer > Cc: pgsql-general@postgresql.org > Assunto: Re: [GENERAL] Order by behaviour > > > On Wed, 28 Mar 2007, Carlos H. Reimer wrote: > > > Hi, > > > > We have a PostgreSQL 8.0.6 cluster configured with > lc_collate=pt_BR.UTF-8 > > and when we run the following SELECT: > > SELECT substr(nomerazao,1,4), > > ascii(substr(nomerazao,1,1)), > > ascii(substr(nomerazao,2,1)) > > from spunico.unico order by nomerazao; > > > > is returning: > > > > substr | ascii | ascii > > --------+-------+------- > > | 32 | 0 > > | 32 | 0 > > 1000 | 49 | 48 > > 1.DI | 49 | 46 > > 1° R | 49 | 176 > > 2M C | 50 | 77 > > 3A.G | 51 | 65 > > A. A | 65 | 46 > > AABA | 65 | 65 > > A.A. | 65 | 46 > > A.AG | 65 | 46 > > A.A. | 65 | 46 > > A.A. | 65 | 46 > > ABAS | 65 | 66 > > ABAS | 65 | 66 > > ABAT | 65 | 66 > > A.B. | 65 | 46 > > A.B. | 65 | 46 > > ABCC | 65 | 66 > > A.B. | 65 | 46 > > A.B. | 65 | 46 > > > > Are not the lines out of order or is it a normal behaviour for > a server with > > lc_collate=pt_BR.UTF-8? > > Many collations ignore spaces and symbols on the first pass, so, for > example you might have "A Z" > "AB" despite the fact that a space has a > lower value than a B. > >
On Thu, Apr 05, 2007 at 06:46:27PM -0300, Carlos H. Reimer wrote: > Hi, > > I was trying to find the docs about the collating sequence standards but > could not find. > > Would like to know for example which characters are ignored by the "order > by" in some of the collating types. There are no standards. The collation used by the postgres is whatever collation is done by your OS. So you need to check there... Loosely defined, its sorts the same way as "sort" does. So you can test by doing things like: LC_COLLATE=foo sort <bar Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.