Re: Approximate string matching?

From Josh Berkus
Subject Re: Approximate string matching?
Msg-id web-834815@davinci.ethosmedia.com
In reply to Re: Approximate string matching?  ("Joshua b. Jore" <josh@greentechnologist.org>)
Responses What is object-relational?  ("Joshua b. Jore" <josh@greentechnologist.org>)
List pgsql-novice

Joshua,

This is *not* a novice question.  I'm not sure where else you'd post
it, though.

> Ok, the basic question: does anyone have any approximate string
> matching algorithms coded such that PostgreSQL can use them
> efficiently? I would like to handle inserts/deletes. I already have
> a Perl and a LotusScript (that's for Domino) implementation, but I
> haven't ever been able to get the Perl module to install right with
> PostgreSQL.

Metaphone, Soundex, and Levenshtein functions were built for
PostgreSQL by Joe Conway.  Find them in the /contrib directory (the
fuzzystrmatch module).

> Translations:
> Wu-Manber k-differences: it's an algorithm that measures how many
> edits are required to turn one string into another. k is the number
> of edits. This is also known as the Levenshtein distance. I'm getting
> this from the Perl Algorithm book.

Levenshtein is available in /contrib.  It works well for the database
I use it on, though that one has only 7,000 records, so you'll have to
test it yourself on really large tables.
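
For reference, here is a minimal Python sketch of what the Levenshtein
distance computes.  It is not the /contrib code, just the textbook
dynamic-programming version; the function name `levenshtein` is my own
choice for the illustration.

```python
def levenshtein(source: str, target: str) -> int:
    """Count the single-character edits needed to turn source into target."""
    # previous[j] = distance between the source prefix handled so far
    # and the first j characters of target.
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        current = [i]  # turning i source characters into an empty string
        for j, t_char in enumerate(target, start=1):
            cost = 0 if s_char == t_char else 1
            current.append(min(
                previous[j] + 1,         # delete s_char
                current[j - 1] + 1,      # insert t_char
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

The cost is proportional to the product of the two string lengths, which
is part of why comparing every row against every other row gets slow on
large tables.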

If you're deduplicating, I wrote a sophisticated name-alike function
using Levenshtein and Metaphone in PL/pgSQL and posted it to Roberto
Mello's function library (accessible from TechDocs).
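
To give a rough idea of how a phonetic key and an edit distance can be
combined for deduplication, here is a small Python sketch.  It is not
the PL/pgSQL function mentioned above: it swaps in Soundex for
Metaphone because Soundex is short enough to show in full, and the
names `names_alike`, `likely_duplicates`, and the `max_edits` threshold
are all assumptions made for the example.

```python
from itertools import combinations

# Standard Soundex digit groups; letters not listed (vowels, Y, H, W)
# carry no code.
_SOUNDEX = {ch: digit
            for digit, group in enumerate(
                ("BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"), start=1)
            for ch in group}


def soundex(name: str) -> str:
    """Return the four-character Soundex key, or '' if name has no letters."""
    letters = [ch for ch in name.upper() if ch.isalpha()]
    if not letters:
        return ""
    key = letters[0]
    last = _SOUNDEX.get(letters[0])
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate same-coded consonants
        code = _SOUNDEX.get(ch)
        if code is not None and code != last:
            key += str(code)
        last = code  # an uncoded letter (vowel) resets the run
    return (key + "000")[:4]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, kept compact so the example is self-contained."""
    row = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        new = [i]
        for j, cb in enumerate(b, 1):
            new.append(min(row[j] + 1, new[j - 1] + 1, row[j - 1] + (ca != cb)))
        row = new
    return row[-1]


def names_alike(a: str, b: str, max_edits: int = 2) -> bool:
    """Hypothetical rule: same phonetic key AND at most max_edits apart."""
    return (soundex(a) == soundex(b)
            and edit_distance(a.lower(), b.lower()) <= max_edits)


def likely_duplicates(names, max_edits: int = 2):
    """Return every pair of names that passes names_alike."""
    return [(a, b) for a, b in combinations(names, 2)
            if names_alike(a, b, max_edits)]


print(likely_duplicates(["Berkus", "Burkus", "Conway"]))
```

Requiring both tests to pass keeps down false matches from either one
alone.  Note that Soundex always keeps the first letter, so this sketch
will miss pairs that differ in their initial.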

-Josh Berkus
