BUG #18654: From fuzzystrmatch, levenshtein function with costs parameters produce incorrect results
From | PG Bug reporting form |
---|---|
Subject | BUG #18654: From fuzzystrmatch, levenshtein function with costs parameters produce incorrect results |
Date | |
Msg-id | 18654-c09f568d3ba6dfcd@postgresql.org |
Responses | Re: BUG #18654: From fuzzystrmatch, levenshtein function with costs parameters produce incorrect results |
List | pgsql-bugs |
The following bug has been logged on the website:

Bug reference: 18654
Logged by: bjdev
Email address: bjdev.gthb@laposte.net
PostgreSQL version: 15.4
Operating system: Ubuntu 22.04.5 LTS
Description:

Hi,

The fuzzystrmatch extension provides an implementation of the levenshtein function. One variant takes cost parameters:

    levenshtein(text source, text target, int ins_cost, int del_cost, int sub_cost) returns int

However, when this function is called with costs other than 1 (the default), the result is incorrect.

    SELECT levenshtein('horses','shorse',1,1,1) => 2 (correct)

    SELECT levenshtein('horses','shorse',100,10,1) => 101 (INCORRECT)

The correct result is 6: every letter has to be substituted, and no other combination of operations gives a lower score. This is easy to verify manually, but it can also be checked with the Python implementation:

    from Levenshtein import distance
    distance("horses","shorse",weights=(100,10,1)) # => 6

    SELECT levenshtein('horses','shorse',1,10,100) => 12 (INCORRECT)

The correct result is 11: insert "s" at the start (+1) and remove the last "s" (+10). Again, this is easy to verify manually, and it can be checked with the Python implementation:

    from Levenshtein import distance
    distance("horses","shorse",weights=(1,10,100)) # => 11

    SELECT levenshtein('horses','shorse',1,10,1) => 2 (INCORRECT)

The correct result is 6, which can be checked with the Python implementation:

    from Levenshtein import distance
    distance("horses","shorse",weights=(1,10,1)) # => 6

As a result, the cost parameters of the levenshtein function cannot be used, which is a shame.

Regards
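The expected values quoted in this report can also be cross-checked without the third-party Levenshtein package. The following is a minimal sketch of the textbook dynamic-programming recurrence with separate insertion, deletion, and substitution costs. The function name `weighted_levenshtein` is hypothetical, and this is not the fuzzystrmatch implementation; it only assumes the same argument order as the SQL signature (ins_cost, del_cost, sub_cost).

```python
def weighted_levenshtein(source, target, ins_cost=1, del_cost=1, sub_cost=1):
    """Cost of turning `source` into `target` (hypothetical helper, textbook DP)."""
    # prev[j] = cost of building target[:j] from an empty source: j insertions.
    prev = [j * ins_cost for j in range(len(target) + 1)]
    for i, s_ch in enumerate(source, start=1):
        # curr[0] = cost of turning source[:i] into an empty target: i deletions.
        curr = [i * del_cost]
        for j, t_ch in enumerate(target, start=1):
            curr.append(min(
                prev[j] + del_cost,                             # delete s_ch
                curr[j - 1] + ins_cost,                         # insert t_ch
                prev[j - 1] + (0 if s_ch == t_ch else sub_cost) # match or substitute
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(weighted_levenshtein("horses", "shorse", 1, 1, 1))     # 2
    print(weighted_levenshtein("horses", "shorse", 100, 10, 1))  # 6
    print(weighted_levenshtein("horses", "shorse", 1, 10, 100))  # 11
    print(weighted_levenshtein("horses", "shorse", 1, 10, 1))    # 6
```

Because both strings have the same length, every insertion must be paired with a deletion, so the cheapest edit script is either six substitutions or one insertion plus one deletion, whichever costs less under the given weights.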