Обсуждение: Efficiently obtaining (any) one record per group.

Поиск
Список
Период
Сортировка

Efficiently obtaining (any) one record per group.

От
Allan Kamau
Дата:
Hi all,

I have a large table that contains redundancies as per one field.

I am looking for a way to identify (or extract) a non redundant set of
rows ( _any_ one record per group) from this table and for each record
of this "distinct" set of rows, I would like to capture it's other
fields.
Below is a simplified example. In this example I would like to base
groups on the value of field "reading".

CREATE TABLE foo
(id INTEGER
,reading INTEGER
,entry_date TIMESTAMP
,source TEXT
,primary key(id)
);

INSERT INTO foo(1,55,'2010-04-01 06:31:13','A');
INSERT INTO foo(2,55,'2010-04-01 06:31:20','X');
INSERT INTO foo(3,45,'2010-04-01 06:38:02','P');
INSERT INTO foo(6,55,'2010-04-01 06:21:44','B');
INSERT INTO foo(4,34,'2010-04-01 06:51:24','K');
INSERT INTO foo(8,61,'2010-04-01 06:22:03','A');
INSERT INTO foo(9,34,'2010-04-01 06:48:07','C');


Desired output (any record selected based on "reading" field).

id,reading,entry_date,source
1,55,'2010-04-01 06:31:13','A'
3,45,'2010-04-01 06:38:02','P'
9,34,'2010-04-01 06:48:07','C'
8,61,'2010-04-01 06:22:03','A'



I am worried that using min() and group by to generate a relation
which is then used in a join may be slow.


SELECT
a.*
FROM
foo a
JOIN
(
SELECT
min(a.id)AS id_min
FROM
foo a
GROUP BY
a.reading
)b
ON
a.id=b.id_min
;

How is the performance of rank() (window function) in general?


Allan.

Re: Efficiently obtaining (any) one record per group.

От
Scott Marlowe
Дата:
On Sat, Jul 24, 2010 at 12:56 AM, Allan Kamau <kamauallan@gmail.com> wrote:
> Hi all,
>
> I have a large table that contains redundancies as per one field.
>
> I am looking for a way to identify (or extract) a non redundant set of
> rows ( _any_ one record per group) from this table and for each record
> of this "distinct" set of rows, I would like to capture it's other
> fields.

Would "Select distinct on ... " work for you?

http://www.postgresql.org/docs/8.4/static/sql-select.html

search the page for "distinct on".

Re: Efficiently obtaining (any) one record per group.

От
Allan Kamau
Дата:
On Sat, Jul 24, 2010 at 10:38 AM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> On Sat, Jul 24, 2010 at 12:56 AM, Allan Kamau <kamauallan@gmail.com> wrote:
>> Hi all,
>>
>> I have a large table that contains redundancies as per one field.
>>
>> I am looking for a way to identify (or extract) a non redundant set of
>> rows ( _any_ one record per group) from this table and for each record
>> of this "distinct" set of rows, I would like to capture it's other
>> fields.
>
> Would "Select distinct on ... " work for you?
>
> http://www.postgresql.org/docs/8.4/static/sql-select.html
>
> search the page for "distinct on".
>

Thank you Scott, you have made my day, this is exactly what I was looking for.

Allan.