Re: Need magic for identifieing double adresses

From: Gary Chambers
Subject: Re: Need magic for identifieing double adresses
Message-ID: AANLkTi=jQ+sgxu=VHJft__PZrwykDL4LvEKieQAn2wak@mail.gmail.com
In reply to: Need magic for identifieing double adresses (Andreas <maps.on@gmx.net>)
List: pgsql-general

Andreas,

> Relevant fields could be  name, street, zip, city, phone
> Is there a way to do something like this with postgresql ?
> I fear this will need still a lot of manual sorting and searching even when
> potential peers get automatically identified.

One technique I use to increase the odds of detecting duplicates is to
trim each column, remove all internal whitespace, concatenate the
columns into a single string, and calculate a hash of the result (I use
MD5; some other hash function may be better).  Rows that produce the
same hash are candidate duplicates.  It's not perfect (we are dealing
with humans, after all), but it helps.
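
A minimal sketch of that idea in Python, in case it helps to see it
spelled out.  The field names come from your list; the function names,
the lowercasing step, and the "|" separator between fields are my own
assumptions and not part of the technique as described above.

```python
import hashlib
from collections import defaultdict

# Field names taken from the question; adjust to the real table layout.
FIELDS = ("name", "street", "zip", "city", "phone")


def fingerprint(record):
    """Return an MD5 hex digest of the whitespace-stripped fields."""
    parts = []
    for field in FIELDS:
        value = record.get(field) or ""  # treat missing/None as empty
        # split() + join() drops leading, trailing and internal whitespace.
        # Lowercasing is an extra normalisation step (an assumption).
        parts.append("".join(value.split()).lower())
    # The "|" separator (an assumption) keeps field boundaries distinct.
    joined = "|".join(parts)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


def find_duplicates(records):
    """Group records by fingerprint; return groups with 2+ members."""
    groups = defaultdict(list)
    for record in records:
        groups[fingerprint(record)].append(record)
    return [group for group in groups.values() if len(group) > 1]
```

Each group returned by find_duplicates() is only a set of candidates
for manual review.  Rows that differ by a typo or an abbreviation
("St" vs. "Street") will still hash differently.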

-- Gary Chambers

/* Nothing fancy and nothing Microsoft! */
