Обсуждение: Order by and strings
Hi!
New to the list with a question that I cannot find the answer to in the manual or on the internet but I suspect is trivial. If somebody could point me in the correct direction I would be greatful.
This is what I do (condensed, of course):
# create table tmp ( x text ) ;
CREATE TABLE
# insert into tmp(x) values ('a'),('c'),('-b') ;
INSERT 0 3
# select * from tmp order by x ;
x
----
a
-b
c
(3 rows)
I would expect a string that start with a hyphen to be sorted before or after 'a' and 'c' and not between them. I have tried with a few other characters (space, opening parenthesis, etc) but the result is the same.
What I want is the strings sorted by their ascii (or UTF-8) values, without some "smart" heuristic. How do I accomplish this?
I cannot find this described in the manual (it should be there!) or on the net.
/Fredric
PS. 8.2, Fedora Linux
New to the list with a question that I cannot find the answer to in the manual or on the internet but I suspect is trivial. If somebody could point me in the correct direction I would be greatful.
This is what I do (condensed, of course):
# create table tmp ( x text ) ;
CREATE TABLE
# insert into tmp(x) values ('a'),('c'),('-b') ;
INSERT 0 3
# select * from tmp order by x ;
x
----
a
-b
c
(3 rows)
I would expect a string that start with a hyphen to be sorted before or after 'a' and 'c' and not between them. I have tried with a few other characters (space, opening parenthesis, etc) but the result is the same.
What I want is the strings sorted by their ascii (or UTF-8) values, without some "smart" heuristic. How do I accomplish this?
I cannot find this described in the manual (it should be there!) or on the net.
/Fredric
PS. 8.2, Fedora Linux
Вложения
On 2/8/2010 7:09 PM, Fredric Fredricson wrote: Hi!
New to the list with a question that I cannot find the answer to in the manual or on the internet but I suspect is trivial. If somebody could point me in the correct direction I would be greatful.
This is what I do (condensed, of course):
# create table tmp ( x text ) ;
CREATE TABLE
# insert into tmp(x) values ('a'),('c'),('-b') ;
INSERT 0 3
# select * from tmp order by x ;
x
----
a
-b
c
(3 rows)
It has to do with the collation you are using
see
http://www.postgresql.org/docs/8.1/interactive/charset.html
All legitimate Magwerks Corporation quotations are sent in a .PDF file attachment with a unique ID number generated by our proprietary quotation system. Quotations received via any other form of communication will not be honored.
CONFIDENTIALITY NOTICE: This e-mail, including attachments, may contain legally privileged, confidential or other information proprietary to Magwerks Corporation and is intended solely for the use of the individual to whom it addresses. If the reader of this e-mail is not the intended recipient or authorized agent, the reader is hereby notified that any unauthorized viewing, dissemination, distribution or copying of this e-mail is strictly prohibited. If you have received this e-mail in error, please notify the sender by replying to this message and destroy all occurrences of this e-mail immediately.
Thank you.
New to the list with a question that I cannot find the answer to in the manual or on the internet but I suspect is trivial. If somebody could point me in the correct direction I would be greatful.
This is what I do (condensed, of course):
# create table tmp ( x text ) ;
CREATE TABLE
# insert into tmp(x) values ('a'),('c'),('-b') ;
INSERT 0 3
# select * from tmp order by x ;
x
----
a
-b
c
(3 rows)
It has to do with the collation you are using
see
http://www.postgresql.org/docs/8.1/interactive/charset.html
All legitimate Magwerks Corporation quotations are sent in a .PDF file attachment with a unique ID number generated by our proprietary quotation system. Quotations received via any other form of communication will not be honored.
CONFIDENTIALITY NOTICE: This e-mail, including attachments, may contain legally privileged, confidential or other information proprietary to Magwerks Corporation and is intended solely for the use of the individual to whom it addresses. If the reader of this e-mail is not the intended recipient or authorized agent, the reader is hereby notified that any unauthorized viewing, dissemination, distribution or copying of this e-mail is strictly prohibited. If you have received this e-mail in error, please notify the sender by replying to this message and destroy all occurrences of this e-mail immediately.
Thank you.
Justin Graf wrote: On 2/8/2010 7:09 PM, Fredric Fredricson wrote: Hi!
New to the list with a question that I cannot find the answer to in the manual or on the internet but I suspect is trivial. If somebody could point me in the correct direction I would be greatful.
This is what I do (condensed, of course):
# create table tmp ( x text ) ;
CREATE TABLE
# insert into tmp(x) values ('a'),('c'),('-b') ;
INSERT 0 3
# select * from tmp order by x ;
x
----
a
-b
c
(3 rows)
It has to do with the collation you are using
I use locale en_US.UTF-8 but why this should affect how leading characters in strings are ignored is beyond me.
Another, in my mind, counter-intuitive example:
# create table tmp ( x text ) ;
# insert into tmp(x) (values ('a'),('-b'),('c'),('aa'),('---a-b'),('ac'),('1'),('-2'),('+3'),('4'),('-'),('+')) ;
# select * from tmp order by x ;
x
--------
-
+
1
-2
+3
4
a
aa
---a-b
ac
-b
c
(12 rows)
In what universe would you expect this sort order? And how to make it 'sane'?
I found a work-around, "order by ascii(x),x", but this continues to baffle me.
It seems to me that if there are any alphanumeric characters in the string the rest are ignored in the sort.
Where did this rule come from?
I really would appreciate an explanation for this behavior.
/Fredric
PS. I was wrong about the server version, it is 8.3.8.
New to the list with a question that I cannot find the answer to in the manual or on the internet but I suspect is trivial. If somebody could point me in the correct direction I would be greatful.
This is what I do (condensed, of course):
# create table tmp ( x text ) ;
CREATE TABLE
# insert into tmp(x) values ('a'),('c'),('-b') ;
INSERT 0 3
# select * from tmp order by x ;
x
----
a
-b
c
(3 rows)
It has to do with the collation you are using
I use locale en_US.UTF-8 but why this should affect how leading characters in strings are ignored is beyond me.
Another, in my mind, counter-intuitive example:
# create table tmp ( x text ) ;
# insert into tmp(x) (values ('a'),('-b'),('c'),('aa'),('---a-b'),('ac'),('1'),('-2'),('+3'),('4'),('-'),('+')) ;
# select * from tmp order by x ;
x
--------
-
+
1
-2
+3
4
a
aa
---a-b
ac
-b
c
(12 rows)
In what universe would you expect this sort order? And how to make it 'sane'?
I found a work-around, "order by ascii(x),x", but this continues to baffle me.
It seems to me that if there are any alphanumeric characters in the string the rest are ignored in the sort.
Where did this rule come from?
I really would appreciate an explanation for this behavior.
/Fredric
PS. I was wrong about the server version, it is 8.3.8.
Вложения
On Tue, Feb 9, 2010 at 1:42 AM, Fredric Fredricson <Fredric.Fredricson@bonetmail.com> wrote: > Justin Graf wrote: > > On 2/8/2010 7:09 PM, Fredric Fredricson wrote: > > Hi! > New to the list with a question that I cannot find the answer to in the > manual or on the internet but I suspect is trivial. If somebody could point > me in the correct direction I would be greatful. It seems you're seeking ASCII or C locale sorting. > # select * from tmp order by x ; > x > -------- > - > + > 1 > -2 > +3 > 4 > a > aa > ---a-b > ac > -b > c > (12 rows) > > In what universe would you expect this sort order? And how to make it > 'sane'? In a library perhaps? > I found a work-around, "order by ascii(x),x", but this continues to baffle > me. It's quite simple. en_US locale, and others like it sort by ignoring things like white space and noise characters so that only letters and numbers count, and things like ñ sort right near n, not at the end or beginning of the table. > It seems to me that if there are any alphanumeric characters in the string > the rest are ignored in the sort. > Where did this rule come from? > I really would appreciate an explanation for this behavior. It's been around quite some time. I'm afraid I'll have to defer to some other expert on the exact history.
On Tue, Feb 9, 2010 at 1:42 AM, Fredric Fredricson <Fredric.Fredricson@bonetmail.com> wrote: > I use locale en_US.UTF-8 but why this should affect how leading characters > in strings are ignored is beyond me. P.s. this page may shed some light on the subject: http://en.wikipedia.org/wiki/Collation
Scott Marlowe wrote:
Alas, not my problem. I have a work around that works for my little part of the universe. At least for now.
You live, you learn.
/Fredric
OK, thanks. It did shed some light on the subject. Only I wonder what would happen if these sort algorithms where used on things like article numbers in the industry. That would confuse the hell out of the people there.On Tue, Feb 9, 2010 at 1:42 AM, Fredric Fredricson <Fredric.Fredricson@bonetmail.com> wrote:I use locale en_US.UTF-8 but why this should affect how leading characters in strings are ignored is beyond me.P.s. this page may shed some light on the subject: http://en.wikipedia.org/wiki/Collation
Alas, not my problem. I have a work around that works for my little part of the universe. At least for now.
You live, you learn.
/Fredric
Вложения
On Tue, Feb 9, 2010 at 11:21, Fredric Fredricson <Fredric.Fredricson@bonetmail.com> wrote: > Scott Marlowe wrote: > > On Tue, Feb 9, 2010 at 1:42 AM, Fredric Fredricson > <Fredric.Fredricson@bonetmail.com> wrote: > > > I use locale en_US.UTF-8 but why this should affect how leading characters > in strings are ignored is beyond me. > > > P.s. this page may shed some light on the subject: > > http://en.wikipedia.org/wiki/Collation > > OK, thanks. It did shed some light on the subject. Only I wonder what would > happen if these sort algorithms where used on things like article numbers in > the industry. That would confuse the hell out of the people there. In case you are storing something like that, you may be better off using bytea instead of text. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/