Обсуждение: Selecting distinct records

Поиск
Список
Период
Сортировка

Selecting distinct records

От
"Dave [Hawk-Systems]"
Дата:
Have a table that is populated by external radius reporting which contains
    name, ipaddress, sessionid, refid, and others

None of the fields are unique, but the combination of those 4 listed should
reasonably be unique. There is a certain type of interaction with external
hardware that results in multple entries into the table, resulting in multiple
duplicate entries.

What we need to to is build a select that omits any duplicate records from the
output, but I can't get a distinct to work given the way we need the data
sorted.

For example,
Name    IPaddress    sessionid    refid
User1    201.201    1234        5678    first user
User2    201.201    1235        5679    same ip as first user
User3    201.202    3234        5670    same sessid as first user
User4    201.203    3236        5678    same refid as first user
User1    201.202    4234        5678    first user new entry
User1    201.202    4234        5678    DUPLICATE
User1    201.202    4234        5678    DUPLICATE
User2    201.203    1234        5671    same ip as user 4
User3    201.204    2234        5672    unique

if we do distinct on sessionid then put a where clause for the user=User1 we end
up missing some of user1's entries where another users identical sessionid
appears first.

What I am looking to do is
- grab every record for $user
- remove any records that have identical ipaddress+sessionid+refid
    ie: turn
        user1        201.102    1234        5678
        user1        201.102    1234        5678
        user1        201.102    1234        5678
    into
        user1        201.102    1234        5678
- then sort the results by date_time or something else

Just can't get it to do all those things at the same time.  Any thoughts, or am
I not making sense?

thanks

Dave



Re: Selecting distinct records

От
"Dave [Hawk-Systems]"
Дата:
>What I am looking to do is
>- grab every record for $user
>- remove any records that have identical ipaddress+sessionid+refid
>- then sort the results by date_time or something else

this last requirement is where the problem is...

is you do a sum() or order by in the select statement you get and error,  for
example;

select distinct on (sessionid) sum(sessiontime) from logs where
name='joeblowuser' and datetime > 1036040400;

ERROR:  Attribute logs.sessionid must be GROUPed or used in an aggregate
function

Same if you have to order by datetime or something...

Dave



Re: Selecting distinct records

От
Tom Lane
Дата:
"Dave [Hawk-Systems]" <dave@hawk-systems.com> writes:
>> What I am looking to do is
>> - grab every record for $user
>> - remove any records that have identical ipaddress+sessionid+refid
>> - then sort the results by date_time or something else

> this last requirement is where the problem is...

I think you need two levels of select.  To do the DISTINCT ON you must
sort by the columns to be distinct'd.  You can re-sort the result in
an outer select.  For example:

select * from
  (select distinct on (ten) * from tenk1 order by ten) ss
order by hundred;

In practice the inner select's order by will need additional sort
columns, with which you control which row gets chosen out of any set of
duplicates.

            regards, tom lane

Re: Selecting distinct records

От
"Dave [Hawk-Systems]"
Дата:
>"Dave [Hawk-Systems]" <dave@hawk-systems.com> writes:
>>> What I am looking to do is
>>> - grab every record for $user
>>> - remove any records that have identical ipaddress+sessionid+refid
>>> - then sort the results by date_time or something else
>
>> this last requirement is where the problem is...
>
>I think you need two levels of select.  To do the DISTINCT ON you must
>sort by the columns to be distinct'd.  You can re-sort the result in
>an outer select.  For example:
>
>select * from
>  (select distinct on (ten) * from tenk1 order by ten) ss
>order by hundred;

is this vailable in 7.0.3, I get an error at the second select

Dave