Re: Removing duplicates
From | Dan MacNeil |
---|---|
Subject | Re: Removing duplicates |
Date | |
Msg-id | 003a01c1bee4$e99372e0$b11a9318@prometheus |
In response to | Removing duplicates (Matthew Hagerty <matthew@brwholesale.com>) |
List | pgsql-sql |
There is software (sadly not open source) that standardizes your addresses. Once your addresses are standardized, you can check them for duplicates.

There is also shrink-wrapped software to eliminate duplicates. It might be cheaper and quicker to buy this software...

You should probably poke around on www.usps.gov

Some links I came up with:

http://www.casscertification.net/cassfaq.html
http://www.usps.com/ncsc/addressservices/addressqualityservices/addresscorrection.htm
http://search.dmoz.org/cgi-bin/search?search=CASS+postal

----- Original Message -----
From: "Matthew Hagerty" <matthew@brwholesale.com>
To: <pgsql-sql@postgresql.org>
Sent: Tuesday, February 26, 2002 10:10 AM
Subject: [SQL] Removing duplicates

> Greetings,
>
> I have a customer database (name, address1, address2, city, state, zip) and
> I need a query (or two) that will give me a mailing list with the least
> amount of duplicates possible. I know that precise matching is not
> possible, i.e. "P.O. Box 123" will never match "PO Box 123" without some
> data massaging, but if I can isolate even 50% of any duplicates, that would
> help greatly.
>
> Also, any suggestions on which parameters to check the duplicates for? My
> first thoughts were to make sure there were no two addresses the same in
> the same zip code. Any insight (or examples) would be greatly appreciated.
>
> Thank you,
> Matthew
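
For anyone who wants to try the "standardize first, then compare" idea before buying anything, here is a rough sketch in Python. It is not what the CASS-certified products do (they work from USPS data); it only shows the general shape. The abbreviation table and the function names `normalize_address` and `find_duplicates` are made up for illustration, and the `address1` and `zip` keys are assumed to match the columns in the original question. It follows Matthew's suggestion of flagging identical addresses within the same zip code.

```python
import re
from collections import defaultdict

# Hypothetical, very incomplete abbreviation table. Real address
# standardization software relies on USPS reference data instead.
ABBREVIATIONS = {
    "street": "st",
    "avenue": "ave",
    "road": "rd",
    "suite": "ste",
    "apartment": "apt",
    "post office box": "po box",
}


def normalize_address(address):
    """Return a crude canonical form of one address line."""
    text = address.lower()
    text = re.sub(r"[.,#]", "", text)          # "p.o. box" -> "po box"
    text = re.sub(r"\s+", " ", text).strip()   # collapse runs of whitespace
    for long_form, short_form in ABBREVIATIONS.items():
        text = re.sub(r"\b" + re.escape(long_form) + r"\b", short_form, text)
    return text


def find_duplicates(rows):
    """Group rows whose normalized address1 and 5-digit zip are equal.

    Returns only the groups that contain more than one row.
    """
    groups = defaultdict(list)
    for row in rows:
        key = (normalize_address(row["address1"]), row["zip"][:5])
        groups[key].append(row)
    return [group for group in groups.values() if len(group) > 1]


if __name__ == "__main__":
    customers = [
        {"name": "A", "address1": "P.O. Box 123", "zip": "48103"},
        {"name": "B", "address1": "PO Box 123", "zip": "48103-1234"},
        {"name": "C", "address1": "PO Box 123", "zip": "10001"},
    ]
    for group in find_duplicates(customers):
        print([row["name"] for row in group])
```

With the sample rows, customers A and B land in the same group because "P.O. Box 123" and "PO Box 123" normalize to the same string and share a 5-digit zip; C has the same box number but a different zip and is left alone. This will miss plenty of real duplicates (misspellings, swapped address lines), so treat it as a first pass rather than a replacement for proper standardization.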