Re: Removing duplicate entries

Поиск
Список
Период
Сортировка
От Michael Fuhr
Тема Re: Removing duplicate entries
Дата
Msg-id 20060112044830.GA90783@winnie.fuhr.org
обсуждение исходный текст
Ответ на Re: Removing duplicate entries  ("Scott Ford" <Scott.Ford@bullfrogpower.com>)
Список pgsql-novice
On Wed, Jan 11, 2006 at 04:21:45PM -0500, Scott Ford wrote:
> customer
>     customer_id (pk)
>     ...
>
> documentation
>     documentation_id (pk)
>     customer_id (fk)
>     document_type_id (fk)
>     accepted
>
> document_types
>     document_type_id (pk)
>     document_name
>     ...
>
> What I want to do is remove duplicate entries for the same
> customer_id/document_type_id for only a certain document_type_id.

Will a modified version of my earlier query work?  For example,
suppose I start with this data:

SELECT * FROM doc ORDER BY cust_id, doc_type_id, doc_id;
 doc_id | cust_id | doc_type_id
--------+---------+-------------
      1 |       1 |           1
      2 |       1 |           1
      3 |       1 |           2
      4 |       1 |           2
      5 |       1 |           2
      6 |       1 |           3
      7 |       1 |           3
      8 |       2 |           1
      9 |       2 |           2
     10 |       2 |           3
(10 rows)

Then I run this delete to remove duplicates for doc_type_id 2:

DELETE FROM doc
WHERE doc_type_id = 2 AND doc_id NOT IN (
  SELECT min(doc_id)
  FROM doc
  WHERE doc_type_id = 2 -- not necessary but probably more efficient
  GROUP BY cust_id, doc_type_id
);

I end up with this:

SELECT * FROM doc ORDER BY cust_id, doc_type_id, doc_id;
 doc_id | cust_id | doc_type_id
--------+---------+-------------
      1 |       1 |           1
      2 |       1 |           1
      3 |       1 |           2
      6 |       1 |           3
      7 |       1 |           3
      8 |       2 |           1
      9 |       2 |           2
     10 |       2 |           3
(8 rows)

cust_id 1's duplicates for doc_type_id 2 have been removed (doc_id
4 and 5) but cust_id 1's duplicates for doc_type_id 1 and 3 remain.
cust_id 2 had no duplicates; all of its records remain.

Is that what you're looking for?  If not then please post some
sample data and describe exactly which records you want to delete.
It would be helpful if you post the example as CREATE TABLE and
INSERT statements that people can load into their own database.

--
Michael Fuhr

В списке pgsql-novice по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Sequential Scans On Complex Query With UNION - see why this fails
Следующее
От: Brendan Duddridge
Дата:
Сообщение: Re: Comparing databases