Обсуждение: Re: select distinct w/order by

Поиск
Список
Период
Сортировка

Re: select distinct w/order by

От
Bob.Henkel@hartfordlife.com
Дата:
Not that this is the issue, but what kind of tool where you using to get
your results back with this "other" database?  Sometimes these fancy GUI
tools like to be smart on you and order things based on something it feels
is correct giving you the impression that the database choose the order
when infact the GUI tool choose the order.  Just a thought


Bob Henkel          651-738-5085
Mutual Funds I/T Woodbury
Hartford Life
500 Bielenberg Drive
Woodbury, MN 55125


|---------+---------------------------------->
|         |           Richard Huxton         |
|         |           <dev@archonet.com>     |
|         |           Sent by:               |
|         |           pgsql-general-owner@pos|
|         |           tgresql.org            |
|         |                                  |
|         |                                  |
|         |           03/31/2004 02:37 PM    |
|         |                                  |
|---------+---------------------------------->

>------------------------------------------------------------------------------------------------------------------------------|
  |
        | 
  |       To:       "John Liu" <johnl@emrx.com>, <pgsql-general@postgresql.org>
        | 
  |       cc:
        | 
  |       Subject:  Re: [GENERAL] select distinct w/order by
        | 

>------------------------------------------------------------------------------------------------------------------------------|




On Wednesday 31 March 2004 18:50, John Liu wrote:
> I know this is an old topic, but it's not easy to find a way around it,
so
> when we migrate SQL from other database to PostgreSQL, it causes a huge
> headache. Here's an extremely simple example -
>
> The original simple SQL -
> select distinct atcode from TMP order by torder;

Can you explain what this means? If I have

atcode | torder
AAA    | 20
BBB    |  5
CCC    | 10
BBB    | 45
CCC    | 27

What order should we get?
You could argue for:
1. BBB,CCC,AAA since that is the order of the min(torder)
2. AAA,CCC,BBB since that is the order of the max(torder)
3. AAA,BBB,CCC if you take the first(torder) you find reading down the page
4. AAA,CCC,BBB if you take the first(torder) but read up the page

Which one should PG pick, and how should it know?

Which one did the other database pick, and why was it right to do so?

--
  Richard Huxton
  Archonet Ltd

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
    (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)







*************************************************************************
PRIVILEGED AND CONFIDENTIAL: This communication, including attachments, is for the exclusive use of addressee and may
containproprietary, confidential and/or privileged information.  If you are not the intended recipient, any use,
copying,disclosure, dissemination or distribution is strictly prohibited.  If you are not the intended recipient,
pleasenotify the sender immediately by return e-mail, delete this communication and destroy all copies. 
*************************************************************************


Re: select distinct w/order by

От
Richard Huxton
Дата:
On Wednesday 31 March 2004 22:13, John Liu wrote:
> Tom provided the same logic arguments. I also like the way 'simple is
> better' as long as it sticks with SQL requirement. But in practice, you
> have to face such issue even it's 'catch 22' which depends on the
> application design -

Tom's a smart fella, of course he agreed with me ;-)
I'd argue PG does deal with the issue, by refusing to do handle an unsafe
situation. The "catch 22" only happens if a database supports vague queries.

> For the your case -
> code                  codeid
>
> AAA                       20
> BBB                        5
> CCC                       10
> BBB                       45
> CCC                       27
>
> When issue "select distinct code from test1 order by codeid;"
> One of the database returns using their internal rule (at least it's
> constant itself) -
> code
>
> BBB
> CCC
> AAA

Are you sure it's consistent? If you didn't know which results you were going
to get before testing it, how do you know it's right. Maybe when you tested
it, perhaps you got lucky. And, if the "theoretical" arguments don't convince
you, here's something fairly practical. If the behaviour isn't defined, and
it just happens to work this way, what guarantee will you have that another
database, or event the next version of your current one will give you the
same order?

> It provides one of the arguable result sets.

PG could provide one, but which one, and why should that be the right choice?

> But think about another situation, the result is for sure -
> code                  code2                 codeid
>
> a1                    a                          1
> a2                    a                          2
> b1                    d                          3
> b2                    d                          4
> c1                    c                          5
> c2                    c                          6
>
> select distinct code2 from test2 order by codeid;
> code2
>
> a
> d
> c
>
> It's handy.

You can't have a feature that only works for some cases. In your example,
there are no overlapping codeid ranges on any given code2. This means you can
use either of the min/max sorts I mentioned. PG needs to know what "order by
codeid" means.

> I hope everything is black or white, but it's not. The user has the choice
> at least. But when I use PostgreSQL, I need find an alternative solution to
> handle such issue.

Everything is black and white. You should be able to run the same queries on
the same data in any two databases and get the same results. If you define
your query correctly, that should be the case. (Note it's not your fault the
query is poorly defined, at first glance it looks like a sensible thing to
do. The fact is though, that it's not and the other database shouldn't let
you do it).

Below is a real example from one of my projects - both orders are valid, but
they give different results. You will need to choose one.

promise=> SELECT gal_cnttype FROM lkp_gallery GROUP BY gal_cnttype ORDER BY
min(gal_code);
 gal_cnttype
-------------
 CLILOGO
 MMS
 OPPLOGO
 PICMSG
 MONO
 POLY
 JAVA
 BUNDLE
 AISCRIPT
(9 rows)

promise=> SELECT gal_cnttype FROM lkp_gallery GROUP BY gal_cnttype ORDER BY
max(gal_code);
 gal_cnttype
-------------
 BUNDLE
 JAVA
 MONO
 POLY
 AISCRIPT
 CLILOGO
 MMS
 OPPLOGO
 PICMSG
(9 rows)


--
  Richard Huxton
  Archonet Ltd

Re: select distinct w/order by

От
"John Liu"
Дата:
Don't know why this is not posted ...

-----Original Message-----
From: John Liu [mailto:johnl@emrx.com]
Sent: Wednesday, March 31, 2004 11:50 AM
To: 'pgsql-general@postgresql.org'
Subject: select distinct w/order by

I know this is an old topic, but it's not easy to find a way around it, so
when we migrate SQL from other database to PostgreSQL, it causes a huge
headache. Here's an extremely simple example -

The original simple SQL -
select distinct atcode from TMP order by torder;

(it'll error out in PostgreSQL, although SQL92 extension may allow it;
there's time you just can't do "select distinct atcode,torder from TMP order
by torder"!!)

My desire result -
 HGB
 HCT
 WBC
 RBC
 MCV
 MCH
 MCHC
 RDW
 RDWSD
 PLT
 DIFF | TYPE
 SEGS
 LYMPHS
 MONOS
 EOS
 BASOS

I tried to rewrite the above simple query in PostgreSQL as - select distinct
atcode from (select atcode,torder from TMP order by torder) t;

But the return results are not what I want -  BASOS  DIFF | TYPE  EOS  HCT
HGB  LYMPHS  MCH  MCHC  MCV  MONOS  PLT  RBC  RDW  RDWSD  SEGS  WBC

Can anybody provide a real/general solution to the above practical problem?
(Tom?) This causes postgreSQL users too much time and headache.

Thanks.
johnl