Re: Getting a sample data set.

Поиск
Список
Период
Сортировка
От Alban Hertroys
Тема Re: Getting a sample data set.
Дата
Msg-id 5AD6B06F-AEBF-43FF-AEA4-36C8BEC3DB42@solfertje.student.utwente.nl
обсуждение исходный текст
Ответ на Getting a sample data set.  ("James B. Byrne" <byrnejb@harte-lyne.ca>)
Ответы Re: Getting a sample data set.
Список pgsql-general
On 18 Jan 2011, at 19:02, James B. Byrne wrote:

> Given a table "shipments" having a column called "mode" I want to
> extract one entire shipment row (all columns) for each distinct
> value of mode.  Assuming that there are 1700 rows and that there are
> just five distinct values in use for mode then I want to return five
> rows with all their columns and each one having a different value
> for mode.

> But now I am curious how this is done in plain SQL. I have have not
> found any useful guide as to how to approach this problem in the
> reference materials I have to hand.  I cannot believe that I am the
> first person to require this sort of thing of SQL so if anyone can
> point me to a reference that explicitly sets out how to accomplish
> this I would greatly appreciate it.


Postgres has it's own extension for that, namely DISTINCT ON, used as follows:

SELECT DISTINCT ON (mode) mode FROM shipments ORDER BY mode;

For consistent results in the other column some more ordering would be required, or Postgres would just be returning
thefirst row per mode that it encounters. Then again, that's often what people want in this case anyway. 

Standard SQL alternatives tend to get complex, using self-joins to weed out all the records you don't want (the exact
termfor such joins escapes me right now, that would help with Googling if you're looking for examples). 
Basically you do something like:
SELECT s1.mode
  FROM shipments AS s1
 WHERE NOT EXISTS (
    SELECT NULL
      FROM shipments AS s2
     WHERE s1.mode = s2.mode
       AND s1.somecolumn < s2.somecolumn
    )

Basically you exclude all the records with the same mode that have a larger value for somecolumn than the lowest you
encountered.Only the records with the lowest value for somecolumn remain for each mode. If you turn the sign around for
thatlast condition you'd get the highest value instead. 

Important here is that somecolumn doesn't contain any values for the same mode where its values would be considered
equal,or you end up with multiple matches for that mode. 

It gets extra interesting if you don't have any columns that are distinct per mode. In such cases you can join your
table(s)against generate_series() or use a windowing function with ranking. This has gotten much easier with our new
CTE's(see the WITH keyword). 

Alban Hertroys

--
Screwing up is an excellent way to attach something to the ceiling.


!DSPAM:737,4d35dad711701679817192!



В списке pgsql-general по дате отправления:

Предыдущее
От: "James B. Byrne"
Дата:
Сообщение: Getting a sample data set.
Следующее
От: "George Weaver"
Дата:
Сообщение: Re: Case Insensitive Foreign Key Constraint