Re: Fuzzy matching?
От | Joe Conway |
---|---|
Тема | Re: Fuzzy matching? |
Дата | |
Msg-id | 02ee01c119e3$3a128f50$48d210ac@jecw2k1 обсуждение исходный текст |
Ответ на | Fuzzy matching? ("Josh Berkus" <josh@agliodbs.com>) |
Ответы |
Re: Fuzzy matching?
|
Список | pgsql-sql |
> And the fuzzy_match should return True if the two phrases are no more > than that number of characters different. Thus, we should get: > > fuzzy_match('Thornton','Tornton',1) = TRUE > fuzzy_match('Thornton','Torntin',1) = FALSE > fuzzy_match('Thornton','Torntin',2) = TRUE > > Unfortunately, I cannot think of a way to make this happen in a function > without cycling through all the possible permutations of characters for > both words or doing some character-by-character comparison with > elaborate logic for placement. Either of these approaches would be very > slow, and completely unsuitable for column comparisons on large tables. > > Can anyone suggest some shortcuts here? Perhaps using pl/perl or > something similar? Sounds like you want something along the lines of soundex or metaphone? I don't see either function in PostgreSQL, but take a look at the PHP manual to see examples: http://www.php.net/manual/en/function.soundex.php , http://www.php.net/manual/en/function.metaphone.php I looked at the soundex function in the PHP source, and it looks like it would be fairly easy to port to a Postgres C function. The algorithm itself comes from Donald Knuth in "The Art Of Computer Programming, vol. 3: Sorting And Searching", Addison-Wesley (1973), pp. 391-392. HTH, -- Joe
В списке pgsql-sql по дате отправления: