Re: Distinct on a non-sort column

Поиск
Список
Период
Сортировка
От Cstdenis
Тема Re: Distinct on a non-sort column
Дата
Msg-id 4EB5AF32.1080404@on-track.ca
обсуждение исходный текст
Ответ на Re: Distinct on a non-sort column  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: Distinct on a non-sort column  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-general
On 11/5/2011 12:49 PM, Tom Lane wrote:
Cstdenis <lists@on-track.ca> writes:
I am trying to write a query that selects recent submissions (sorted by 
submission_date) but only selects the most recent one for each user_id.
example query: /select distinct on (user_id) * from stories order by 
date_submitted desc limit 10;/
However postgres will not allow me to filter out duplicate rows with 
distinct unless I sort on that column, which would product useless 
results for me.
Do the DISTINCT ON in a sub-query, with an ORDER BY appropriate for that
task, and then re-sort the rows the way you want them presented in the
outer query.
SELECT ... FROM	(SELECT DISTINCT ON ... ORDER BY ...) ssORDER BY ...;
		regards, tom lane

If I understand that you are proposing as
select * from
            (select distinct on (user_id) * from stories as s order by user_id) as foo
order by date_submitted desc limit 10;

I think it has the problem of which of the stories by that user is selected is random rather than the most recent being guaranteed (because the distinct is done before the sort). Or am I misunderstanding this?


The suggestions by others of using max(date_submitted) may be a good workaround for this, but I also need to do the same thing sorted by a calculated score value which I do not think will be sufficiently unique for Tair's suggestion.

В списке pgsql-general по дате отправления:

Предыдущее
От: Antonio Goméz Soto
Дата:
Сообщение: Re: What is *wrong* with this query???
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Distinct on a non-sort column