Обсуждение: count (DISTINCT field) OVER ()
I would like to implement the equivalent of "count (DISTINCT field) OVER
()":
SELECT
id, name, similarity(name, 'Tooneyvara') as delta,
count (id) OVER() AS cnt
FROM vtown
WHERE
similarity(name, 'Tooneyvara') > 0.1
ORDER BY delta DESC
produces result:
1787 Toomyvara 0.5 4
1787 Toomevara 0.4 4
1700 Ardcroney 0.105 4
1788 Townsfield 0.1 4
What I would like is a "3" in the cnt column (ignoring id duplicates).
How can I achieve this?
The best I can come up with is:
SELECT id, name,delta, count (*) OVER()
FROM (
SELECT DISTINCT ON (id)
id, name, similarity(name, 'Tooneyvara') as delta
FROM vtown
WHERE
similarity(name, 'Tooneyvara') > 0.1
ORDER BY id, delta DESC
) AS x
ORDER by delta DESC
--
Best Regards,
Tarlika Elisabeth Schmitz
On Nov 9, 2011, at 18:52, Tarlika Elisabeth Schmitz <postgresql6@numerixtechnology.de> wrote: > I would like to implement the equivalent of "count (DISTINCT field) OVER > ()": > > > SELECT > id, name, similarity(name, 'Tooneyvara') as delta, > count (id) OVER() AS cnt > FROM vtown > WHERE > similarity(name, 'Tooneyvara') > 0.1 > ORDER BY delta DESC > > produces result: > 1787 Toomyvara 0.5 4 > 1787 Toomevara 0.4 4 > 1700 Ardcroney 0.105 4 > 1788 Townsfield 0.1 4 > > What I would like is a "3" in the cnt column (ignoring id duplicates). > > > How can I achieve this? > > > The best I can come up with is: > > SELECT id, name,delta, count (*) OVER() > FROM ( > SELECT DISTINCT ON (id) > id, name, similarity(name, 'Tooneyvara') as delta > FROM vtown > WHERE > similarity(name, 'Tooneyvara') > 0.1 > ORDER BY id, delta DESC > ) AS x > ORDER by delta DESC > > -- > > Best Regards, > Tarlika Elisabeth Schmitz > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general 1. Write your main query in a WITH (CTE) 2. Query #1 with appropriate GROUP BY clause (CTE) 3. In the main statement JOIN 1 and 2 David J.
Tarlika Elisabeth Schmitz, 10.11.2011 00:52:
> I would like to implement the equivalent of "count (DISTINCT field) OVER ()":
>
>
> SELECT
> id, name, similarity(name, 'Tooneyvara') as delta,
> count (id) OVER() AS cnt
> FROM vtown
> WHERE
> similarity(name, 'Tooneyvara')> 0.1
> ORDER BY delta DESC
>
> produces result:
> 1787 Toomyvara 0.5 4
> 1787 Toomevara 0.4 4
> 1700 Ardcroney 0.105 4
> 1788 Townsfield 0.1 4
>
> What I would like is a "3" in the cnt column (ignoring id duplicates).
>
>
This should do it:
SELECT id,
name,
delta,
sum(case when rn = 1 then rn else null end) over() as distinct_id_count
FROM (
SELECT
id, name, similarity(name, 'Tooneyvara') as delta,
row_number() OVER(partition by id) AS rn
FROM vtown
WHERE
similarity(name, 'Tooneyvara') > 0.1
) t
ORDER BY delta DESC
On Thu, 10 Nov 2011 10:02:36 +0100
Thomas Kellerer <spam_eater@gmx.net> wrote:
>Tarlika Elisabeth Schmitz, 10.11.2011 00:52:
>> I would like to implement the equivalent of "count (DISTINCT id)
>> OVER ()":
>>
>>[...]
>>
>> produces result:
>> id, name, delta, cnt
>> 1787 Toomyvara 0.5 4
>> 1787 Toomevara 0.4 4
>> 1700 Ardcroney 0.105 4
>> 1788 Townsfield 0.1 4
>>
>
>This should do it:
>
>SELECT id,
> name,
> delta,
> sum(case when rn = 1 then rn else null end) over() as
> distinct_id_count
>FROM (
> SELECT
> id, name, similarity(name, 'Tooneyvara') as delta,
> row_number() OVER(partition by id) AS rn
> FROM vtown
> WHERE
> similarity(name, 'Tooneyvara') > 0.1
>) t
>ORDER BY delta DESC
>
I like you suggestion, Thomas. It is not that dissimilar from but
cleaner than my original SELECT ... FROM (SELECT DISTINCT ON(id)
attempt.
It's also very slightly faster.
Here's another, slightly shorter, variation of your suggestion:
SELECT id, name, delta,
max(rank) OVER() as cnt
FROM (
SELECT
id, name, similarity(name, 'Tooneyvara') as delta,
dense_rank() OVER(ORDER BY id) AS rank
FROM vtown
WHERE
similarity(name, 'Tooneyvara') > 0.1
) t
ORDER BY delta DESC
Tarlika Elisabeth Schmitz, 10.11.2011 11:24: >> SELECT id, >> name, >> delta, >> sum(case when rn = 1 then rn else null end) over() as >> distinct_id_count >>FROM ( >> SELECT >> id, name, similarity(name, 'Tooneyvara') as delta, >> row_number() OVER(partition by id) AS rn >> FROM vtown >> WHERE >> similarity(name, 'Tooneyvara')> 0.1 >> ) t >> ORDER BY delta DESC >> > > > I like you suggestion, Thomas. It is not that dissimilar from but > cleaner than my original SELECT ... FROM (SELECT DISTINCT ON(id) > attempt. > It's also very slightly faster. > > Here's another, slightly shorter, variation of your suggestion: > > SELECT id, name, delta, > max(rank) OVER() as cnt > FROM ( > SELECT > id, name, similarity(name, 'Tooneyvara') as delta, > dense_rank() OVER(ORDER BY id) AS rank > FROM vtown > WHERE > similarity(name, 'Tooneyvara')> 0.1 > ) t > ORDER BY delta DESC Nice trick with the dense_rank(), never thought of that. Regards Thomas