Thanx Merlin, have a nice one (vacation)!
It turns out I have'nt described the problem accurately=(
Data may actually differ in two of the columns (the varchar
columns).
I still want to remove rows which share the same data in those
two columns and have the date column
set to NULL.
I.e. row 1,2,3 have:
column1 = 'foo', column2 = 'hey' and the date column =
NULL
row 4,5,6 have:
column1 = 'brat', column2 = 'yo' and the date column =
NULL
I want to keep just one of the 1 - 3 rows and one of the 4 - 6
rows..
I will try Merlins and Scotts solutions tomorrow. Anyone know
if I need to modify Merlins and/or Scotts
solutions to solve this new situation?
/best regards, Håkan
>----Ursprungligt meddelande----
>Från: mmoncure@gmail.com
>Datum: 09-09-2007 15:42
>Till: "Håkan Jacobsson"<hakan.jacobsson99@bredband.net>
>Kopia: <pgsql-general@postgresql.org>
>Ärende: Re: [GENERAL] SQL for Deleting all duplicate entries
>
>On 9/9/07, Håkan Jacobsson <hakan.jacobsson99@bredband.net>
wrote:
>> Merlin,
>>
>> Its just about three columns - not any column. Two columns
are
>> varchars and the third is
>> a date. The date column value is NULL for the rows for
which
>> I want to delete the duplicates.
>
>
>getting ready to go on vacation :). The idea is you want to
write a
>query that pulls out the data you want to keep. If you have
a table
>with 6 fields, f1 though f6 and you only want one record with
>identical values of f1, f2, f3, you might do:
>
>begin;
>create temp table scratch as
> select f1, f2, f3, max(f4), max(f5), max(f6) from foo group
by f1, f2, f3;
>
>truncate foo;
>
>insert into foo select * from scratch;
>commit;
>
>You can replace max() with any suitable aggregate you deem
gets you
>the best data out of the record. If you are feeling really
clever,
>you can write a custom aggregate for the record type (it's
easier than
>you think!)
>
>merlin
>