Thread: Select duplicated values

Select duplicated values

From

jeffrey

Date:

November 21, 2011, 18:45:33

Let's say that the primary key column is A. I am trying to select all
the rows with duplicated values in columns B, C, and D.

I am not too experienced in SQL syntax, and I've used the following:
select A from table_name where B+C+D in (select B+C+D from table_name
group by B+C+D having count(*)>1  )

I'm looking for a better way, since I am just adding the three columns
together right now.

Jeffrey

Re: Select duplicated values

From

Pete Yunker

Date:

November 21, 2011, 19:19:48

Another option is to perform a self-join on columns B, C, and D (filtering out the 'same' record where a = a) instead of using the sub-select. This may yield better performance depending on the size of the table. Also, I don't believe the concatenation / sub-select will work if any of B, C, or D is NULL, because the combined expression is then NULL and never matches in the IN test.

SELECT x.a
FROM my_table x
INNER JOIN my_table y
   ON ((x.b = y.b) OR (x.b IS NULL AND y.b IS NULL))
  AND ((x.c = y.c) OR (x.c IS NULL AND y.c IS NULL))
  AND ((x.d = y.d) OR (x.d IS NULL AND y.d IS NULL))
  AND x.a <> y.a
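
To try the self-join on sample data, here is a minimal sketch in Python. It assumes an in-memory SQLite database as a stand-in for PostgreSQL, and the rows are made up for illustration. The only change to the query is a DISTINCT, added because a row that has two or more partners would otherwise be listed once per partner.

```python
import sqlite3

# Hypothetical sample data (not from the thread): a is the primary key.
# Rows 1, 2 and 6 share (b, c, d); rows 3 and 4 share (b, c, d) with a NULL
# in column c; row 5 has no duplicate.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (a INTEGER PRIMARY KEY, b INTEGER, c INTEGER, d INTEGER)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [
        (1, 1, 2, 3),
        (2, 1, 2, 3),
        (3, 7, None, 9),
        (4, 7, None, 9),
        (5, 4, 5, 6),
        (6, 1, 2, 3),
    ],
)

# Self-join with explicit NULL handling; DISTINCT collapses the repeated
# matches that rows 1, 2 and 6 produce against each other.
query = """
SELECT DISTINCT x.a
FROM my_table x
INNER JOIN my_table y
   ON ((x.b = y.b) OR (x.b IS NULL AND y.b IS NULL))
  AND ((x.c = y.c) OR (x.c IS NULL AND y.c IS NULL))
  AND ((x.d = y.d) OR (x.d IS NULL AND y.d IS NULL))
  AND x.a <> y.a
ORDER BY x.a
"""
duplicate_ids = [row[0] for row in conn.execute(query)]
print(duplicate_ids)  # [1, 2, 3, 4, 6]
```

Rows 3 and 4 are reported even though column c is NULL in both, which is the case the explicit IS NULL checks are there for.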

Another alternative to handling the NULL values is to COALESCE them to a value that would never exist in columns B, C, or D. I don't know the datatypes you are using, so I'll just use 'junk' for now.

SELECT x.a
FROM my_table x
INNER JOIN my_table y
   ON COALESCE(x.b,'junk') = COALESCE(y.b,'junk')
  AND COALESCE(x.c,'junk') = COALESCE(y.c,'junk')
  AND COALESCE(x.d,'junk') = COALESCE(y.d,'junk')
  AND x.a <> y.a
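
The COALESCE variant depends on the placeholder never occurring as real data. A small sketch of what goes wrong otherwise, again assuming SQLite in place of PostgreSQL, TEXT columns, and invented rows; the helper name duplicates_with_sentinel is hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (a INTEGER PRIMARY KEY, b TEXT, c TEXT, d TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [
        (1, "x", None, "z"),    # true duplicate of row 2 (c is NULL in both)
        (2, "x", None, "z"),
        (3, "p", "junk", "r"),  # real value that happens to equal the placeholder
        (4, "p", None, "r"),    # not a true duplicate of row 3
    ],
)


def duplicates_with_sentinel(sentinel):
    """Return the ids of rows whose (b, c, d) equal another row's after COALESCE."""
    query = """
    SELECT DISTINCT x.a
    FROM my_table x
    INNER JOIN my_table y
       ON COALESCE(x.b, :s) = COALESCE(y.b, :s)
      AND COALESCE(x.c, :s) = COALESCE(y.c, :s)
      AND COALESCE(x.d, :s) = COALESCE(y.d, :s)
      AND x.a <> y.a
    ORDER BY x.a
    """
    return [row[0] for row in conn.execute(query, {"s": sentinel})]


# 'junk' collides with the real value in row 3, so rows 3 and 4 are
# wrongly reported as duplicates of each other.
colliding = duplicates_with_sentinel("junk")

# A placeholder that does not occur in the data gives the intended result.
safe = duplicates_with_sentinel("<<no-such-value>>")

print(colliding)  # [1, 2, 3, 4]
print(safe)       # [1, 2]
```

In PostgreSQL the placeholder also has to be valid for the column's datatype, so 'junk' would only work for text-like columns. PostgreSQL's IS NOT DISTINCT FROM comparison treats two NULLs as equal and avoids the placeholder altogether.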

---

Pete Yunker

Vice President of Data Products

Home Junction, Inc.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: Select duplicated values

From

David Johnston

Date:

November 21, 2011, 22:11:49

Do you really want these two (B, C, D) combinations to evaluate as equal?

(1, 0, 1) and (0, 1, 1): they both sum to 2, but B and C differ between them.

There is no need to do anything other than list each field individually in both the SELECT and the GROUP BY.

SELECT B, C, D
FROM table_name
GROUP BY B, C, D
HAVING count(*) > 1;

Whatever possessed you to consider that you had to add them to get what you need? (Assuming you don't want my examples
to be considered equal.) Besides the fact that it will give you WRONG RESULTS, any decent set of GROUP BY examples will show
you that you can list/use multiple fields in a grouping query.

Now, if you truly want the two samples above to evaluate as equal, then you do need to do some form of consolidation
(like the adding in your example). That, however, would be very unusual.
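
The difference between the two approaches can be checked on the (1, 0, 1) / (0, 1, 1) example. This is a sketch only: it assumes SQLite in place of PostgreSQL, integer columns, and made-up rows, with table_name standing for the real table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_name (a INTEGER PRIMARY KEY, b INTEGER, c INTEGER, d INTEGER)"
)
conn.executemany(
    "INSERT INTO table_name VALUES (?, ?, ?, ?)",
    [
        (1, 1, 0, 1),  # sums to 2
        (2, 0, 1, 1),  # also sums to 2, but is a different combination
        (3, 5, 5, 5),  # genuine duplicate of row 4
        (4, 5, 5, 5),
    ],
)

# Original approach: compares the sum of the three columns.
by_sum = [
    row[0]
    for row in conn.execute(
        """
        SELECT a FROM table_name
        WHERE b + c + d IN (
            SELECT b + c + d FROM table_name
            GROUP BY b + c + d HAVING count(*) > 1
        )
        ORDER BY a
        """
    )
]

# Grouping on the individual columns instead.
by_columns = conn.execute(
    "SELECT b, c, d FROM table_name GROUP BY b, c, d HAVING count(*) > 1"
).fetchall()

print(by_sum)      # [1, 2, 3, 4]  rows 1 and 2 are false positives
print(by_columns)  # [(5, 5, 5)]
```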

David J.

Re: Select duplicated values

From

David Johnston

Date:

November 21, 2011, 22:16:26

So, in a sub-select you would do:

... WHERE ROW(B, C, D) IN (SELECT B, C, D FROM ... GROUP BY B, C, D HAVING count(*) > 1)

Note that the word ROW is optional (but not the parentheses).
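
Spelled out in full, the row-value sub-select that returns the primary keys might look like the sketch below. It includes the HAVING count(*) > 1 filter that restricts the sub-select to duplicated groups; without it every row would match. Assumptions: SQLite stands in for PostgreSQL (so the parenthesized form without the word ROW is used), and the table and rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_name (a INTEGER PRIMARY KEY, b INTEGER, c INTEGER, d INTEGER)"
)
conn.executemany(
    "INSERT INTO table_name VALUES (?, ?, ?, ?)",
    [
        (1, 1, 0, 1),
        (2, 0, 1, 1),
        (3, 5, 5, 5),
        (4, 5, 5, 5),
        (5, 5, 5, 5),
    ],
)

# Row-value IN: a row qualifies when its (b, c, d) combination appears
# in the set of combinations that occur more than once.
query = """
SELECT a
FROM table_name
WHERE (b, c, d) IN (
    SELECT b, c, d
    FROM table_name
    GROUP BY b, c, d
    HAVING count(*) > 1
)
ORDER BY a
"""
duplicate_ids = [row[0] for row in conn.execute(query)]
print(duplicate_ids)  # [3, 4, 5]
```

In PostgreSQL a row with a NULL in B, C, or D is not matched by this IN test, so the NULL handling from Pete's reply would still be needed for such rows.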

David J.
