Re: De-duplicating rows

Поиск
Список
Период
Сортировка
От Richard Huxton
Тема Re: De-duplicating rows
Дата
Msg-id 4A60265D.1060905@archonet.com
обсуждение исходный текст
Ответ на De-duplicating rows  (Christophe <xof@thebuild.com>)
Ответы Re: De-duplicating rows  (Frank Bax <fbax@sympatico.ca>)
Список pgsql-sql
Christophe wrote:
> The Subject: is somewhat imprecise, but here's what I'm trying to do.  
> For some reason, my brain is locking up over it.
> 
> I'm moving a 7.2 (yes) database to 8.4. >

Big leap. Allow some time for testing your application. I'm not sure if 
7.2 even supported schemas, and there's been loads of tightening up the 
rules for automatic type casting, unicode etc.

> Now, since this database has been production since 7.2 days, cruft has 
> crept in: in particular, there are duplicate email addresses, some with 
> mismatched attributes.  The policy decision by the client is that the 
> correct row is the one with the earliest timestamp.

Something like (untested):

CREATE TEMPORARY TABLE earliest_duplicates AS
SELECT  email AS tgt_email,  min(create_date) AS tgt_date
FROM mytable
GROUP BY email
HAVING count(*) > 1;

DELETE FROM mytable USING earliest duplicates
WHERE email=tgt_email AND create_date > tgt_date;

--   Richard Huxton  Archonet Ltd


В списке pgsql-sql по дате отправления:

Предыдущее
От: Christophe
Дата:
Сообщение: De-duplicating rows
Следующее
От: Frank Bax
Дата:
Сообщение: Re: De-duplicating rows