Обсуждение: Selecting distinct records
Have a table that is populated by external radius reporting which contains
name, ipaddress, sessionid, refid, and others
None of the fields are unique, but the combination of those 4 listed should
reasonably be unique. There is a certain type of interaction with external
hardware that results in multple entries into the table, resulting in multiple
duplicate entries.
What we need to to is build a select that omits any duplicate records from the
output, but I can't get a distinct to work given the way we need the data
sorted.
For example,
Name IPaddress sessionid refid
User1 201.201 1234 5678 first user
User2 201.201 1235 5679 same ip as first user
User3 201.202 3234 5670 same sessid as first user
User4 201.203 3236 5678 same refid as first user
User1 201.202 4234 5678 first user new entry
User1 201.202 4234 5678 DUPLICATE
User1 201.202 4234 5678 DUPLICATE
User2 201.203 1234 5671 same ip as user 4
User3 201.204 2234 5672 unique
if we do distinct on sessionid then put a where clause for the user=User1 we end
up missing some of user1's entries where another users identical sessionid
appears first.
What I am looking to do is
- grab every record for $user
- remove any records that have identical ipaddress+sessionid+refid
ie: turn
user1 201.102 1234 5678
user1 201.102 1234 5678
user1 201.102 1234 5678
into
user1 201.102 1234 5678
- then sort the results by date_time or something else
Just can't get it to do all those things at the same time. Any thoughts, or am
I not making sense?
thanks
Dave
>What I am looking to do is >- grab every record for $user >- remove any records that have identical ipaddress+sessionid+refid >- then sort the results by date_time or something else this last requirement is where the problem is... is you do a sum() or order by in the select statement you get and error, for example; select distinct on (sessionid) sum(sessiontime) from logs where name='joeblowuser' and datetime > 1036040400; ERROR: Attribute logs.sessionid must be GROUPed or used in an aggregate function Same if you have to order by datetime or something... Dave
"Dave [Hawk-Systems]" <dave@hawk-systems.com> writes:
>> What I am looking to do is
>> - grab every record for $user
>> - remove any records that have identical ipaddress+sessionid+refid
>> - then sort the results by date_time or something else
> this last requirement is where the problem is...
I think you need two levels of select. To do the DISTINCT ON you must
sort by the columns to be distinct'd. You can re-sort the result in
an outer select. For example:
select * from
(select distinct on (ten) * from tenk1 order by ten) ss
order by hundred;
In practice the inner select's order by will need additional sort
columns, with which you control which row gets chosen out of any set of
duplicates.
regards, tom lane
>"Dave [Hawk-Systems]" <dave@hawk-systems.com> writes: >>> What I am looking to do is >>> - grab every record for $user >>> - remove any records that have identical ipaddress+sessionid+refid >>> - then sort the results by date_time or something else > >> this last requirement is where the problem is... > >I think you need two levels of select. To do the DISTINCT ON you must >sort by the columns to be distinct'd. You can re-sort the result in >an outer select. For example: > >select * from > (select distinct on (ten) * from tenk1 order by ten) ss >order by hundred; is this vailable in 7.0.3, I get an error at the second select Dave