Fuzzy string matching of product names

From: Peter Geoghegan
Subject: Fuzzy string matching of product names
Msg-id: l2ndb471ace1004051210t291d8786h691c241c163dec75@mail.gmail.com
Replies: Re: Fuzzy string matching of product names  (Bill Moran <wmoran@potentialtech.com>)
         Re: Fuzzy string matching of product names  (Brian Modra <brian@zwartberg.com>)
List: pgsql-general

Hello,

At the moment, users of my application, which runs on 8.4.3, may
search for products in a way that is implemented roughly like this:

SELECT * FROM products WHERE description ILIKE '%%usr_string%%';

This works reasonably well. However, I thought it would be a nice
touch to give my users leeway to spell product names incorrectly when
searching, or to not have to remember if a product is entered as "coca
cola", "CocaCola" or "Coca-cola". At the moment, they don't have to
worry about case sensitivity because I use ILIKE - I'd like to
preserve that. I'd also like to not have it weigh against them heavily
when they don't search for a specific product, but just a common
substring. For example, if they search for "coca-cola", there may be a
number of different coca-cola products: "CocaCola 330ml can",
"Coca-Cola 2 litre bottle", but no actual plain "cocacola". That ought
to not matter too much - all cocacola products should be returned.
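
One way to picture the "coca cola" / "CocaCola" / "Coca-cola" requirement is to
compare both sides on a normalized form with case, spaces and punctuation
removed. The following is only a sketch under that assumption, not something
taken from the thread; the literal 'Coca-cola' stands in for the user's search
string, and the character class is ASCII-only, so accented letters would be
stripped as well.

```sql
-- Hypothetical sketch: substring match on a normalized form of both strings.
-- lower() preserves the case-insensitivity that ILIKE gives today;
-- regexp_replace(..., 'g') drops every character that is not a-z or 0-9.
SELECT *
FROM products
WHERE regexp_replace(lower(description), '[^a-z0-9]', '', 'g')
      LIKE '%' || regexp_replace(lower('Coca-cola'), '[^a-z0-9]', '', 'g') || '%';
```

Because the search string is reduced to "cocacola" and matched as a substring,
"CocaCola 330ml can" and "Coca-Cola 2 litre bottle" would both qualify. This
handles spacing and punctuation variants only; it does nothing for misspellings.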

This isn't important enough for me to be willing to add a big
dependency to my application. I'd really prefer to limit myself to the
contrib modules. pg_trgm and fuzzystrmatch look very promising, but
it's not obvious how I can use either to achieve what I want.
Postgres's built-in regex support may have a role to play too.
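
For the misspelling case, pg_trgm's similarity() function and % operator look
like the natural starting point. The sketch below assumes the pg_trgm contrib
module has already been installed into the database; the misspelled literal
'coka cola' is just an illustrative search string.

```sql
-- Hypothetical sketch: rank products by trigram similarity to the search string.
-- The % operator is true when similarity() is above the current threshold,
-- which can be inspected with show_limit() and changed with set_limit().
SELECT description,
       similarity(description, 'coka cola') AS score
FROM products
WHERE description % 'coka cola'
ORDER BY score DESC;
```

Since similarity() compares the whole description with the search string, long
descriptions dilute the score, so the threshold would probably need tuning
(for example by calling set_limit() with a lower value) before short searches
match longer product names.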

I can live with it not being indexable, because typically there are
only tens of thousands of products in a production system.

Could someone suggest an approach that is reasonably simple and
reasonably generic?

Thanks,
Peter Geoghegan
