Re: How to find double entries
| From | Tom Lane |
|---|---|
| Subject | Re: How to find double entries |
| Date | |
| Msg-id | 21481.1208316212@sss.pgh.pa.us |
| In response to | How to find double entries (Andreas <maps.on@gmx.net>) |
| Responses | Re: How to find double entries |
| List | pgsql-sql |
Andreas <maps.on@gmx.net> writes:
> I'd like to identify and then merge records of e.g. 'google', 'gogle',
> 'guugle'
> Then I want to match abbreviations like 'A-Company Ltd.', 'a company
> ltd.', 'A-Company Limited'
> Is there a way to do this?
> It would be OK just to list candidates to be manually checked afterwards.
There are some functions in contrib/fuzzystrmatch that seem like they'd
help you find candidate duplicates. contrib/pg_trgm and text search
might also offer promising tools.
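
To give a rough feel for the two kinds of matching involved, here is a minimal Python sketch, not the PostgreSQL implementation. fuzzystrmatch provides a levenshtein() function and pg_trgm a similarity() function; the code below only imitates the underlying ideas (edit distance and trigram overlap). The function names, the word-splitting rule, and both thresholds are assumptions made for illustration, and the scores are not guaranteed to match what the contrib modules return.

```python
import re
from itertools import combinations


def levenshtein(a, b):
    """Edit distance: minimum number of single-character inserts, deletes, or substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def trigrams(s):
    """Simplified trigram set: lowercase, split on non-alphanumerics, pad each word.

    The padding (two leading blanks, one trailing) is an assumption modeled on
    how trigram matching is commonly described; treat it as approximate.
    """
    grams = set()
    for word in re.findall(r"[a-z0-9]+", s.lower()):
        padded = "  " + word + " "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def trigram_similarity(a, b):
    """Shared trigrams divided by all distinct trigrams (0.0 to 1.0)."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def candidate_pairs(names, max_distance=2, min_similarity=0.3):
    """List pairs that look alike under either measure, for manual review.

    Both thresholds are arbitrary illustrative defaults, not recommendations.
    """
    pairs = []
    for a, b in combinations(names, 2):
        d = levenshtein(a.lower(), b.lower())
        s = trigram_similarity(a, b)
        if d <= max_distance or s >= min_similarity:
            pairs.append((a, b, d, s))
    return pairs


if __name__ == "__main__":
    sample = ["google", "gogle", "guugle", "A-Company Ltd.", "a company ltd."]
    for a, b, d, s in candidate_pairs(sample):
        print(f"{a!r} ~ {b!r}: distance={d}, similarity={s:.2f}")
```

The output is only a list of candidates. Names such as 'A-Company Ltd.' versus 'A-Company Limited' may or may not pass a given threshold, so the cutoffs need tuning against real data, and the final merge decision stays with a person.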
What's really a duplicate sounds like a judgment call here, so you
probably shouldn't even think of automating it completely.
regards, tom lane