websearch_to_tsquery fails to transform compound words from a thesaurus dictionary

From: Jean Gabriel
Subject: websearch_to_tsquery fails to transform compound words from a thesaurus dictionary
Msg-id: d9874680-8292-0728-dca0-f9312afd3221@hasbani.ca
List: pgsql-bugs

Hello,

Affected versions: PG 11 to 14.3 (all).
Affected OS: Windows 10 and x86_64-pc-linux-gnu (via dbfiddle)

Issue:

A thesaurus dictionary can replace a compound word (a multi-word phrase) with another term. The example given in the documentation is "supernovae stars : *sn". With websearch_to_tsquery, this substitution does not occur and the original words are kept, **OR**, if the thesaurus also contains a single-word entry matching one of the words, only that single-word substitution is applied.

Why it is a problem:

Since other text search functions (to_tsvector, plainto_tsquery) apply the transformation, a document containing the compound word cannot be found with websearch_to_tsquery.


Expected result:

websearch_to_tsquery should transform compound words from the thesaurus, as the other text search functions do.

Good to know:

1) The expected behavior occurs with single-word entries from the thesaurus.
2) The incorrect behavior occurs regardless of pre- or post-stemming.
3) If the compound word is double-quoted, websearch_to_tsquery returns the expected output in v14 but an incorrect one in earlier versions.
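
Going by point 3, a possible interim workaround can be sketched as follows. This is only a sketch: it assumes the 'test' configuration defined in the steps to reproduce, and the double-quoted form should only be expected to help on v14.

```sql
-- Assumes the 'test' text search configuration from the steps to reproduce.

-- Option 1: double-quote the compound word inside the web-style query.
-- Per point 3, this is transformed as expected in v14 only.
select to_tsvector('test','abc def') @@ websearch_to_tsquery('test','"abc def"');

-- Option 2: fall back to plainto_tsquery, which applies the thesaurus
-- to the whole phrase but does not recognize tsquery operators.
select to_tsvector('test','abc def') @@ plainto_tsquery('test','abc def');
```

Neither option replaces a fix: the first changes the query to a phrase search, and the second gives up the websearch syntax.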


Steps to reproduce:
Create a test_theasaurus.ths file (in the tsearch_data directory under the installation's SHAREDIR) with the following lines:

supernovae stars : *sn
supernovae : *sn
abc def: xy



CREATE TEXT SEARCH DICTIONARY test_thesaurus (
    TEMPLATE = thesaurus,
    DictFile = test_theasaurus,
    Dictionary = pg_catalog.english_stem
);

CREATE TEXT SEARCH CONFIGURATION public.test ( COPY = pg_catalog.english );

ALTER TEXT SEARCH CONFIGURATION public.test
        ALTER MAPPING FOR hword, hword_part, word, asciihword, hword_asciipart, asciiword
        WITH public.test_thesaurus, english_stem;


select to_tsvector('test','abc def') @@ websearch_to_tsquery('test','abc def'); --FALSE - wrong result
select to_tsvector('test','supernovae stars') @@ websearch_to_tsquery('test','supernovae stars'); --FALSE - wrong result



select websearch_to_tsquery('test','abc def'); --'abc def' --> no transformation occurred
select websearch_to_tsquery('test','supernovae stars'); --'sn' & 'star' --> 1st word is listed by itself in the thesaurus and was transformed

select websearch_to_tsquery('test','"abc def"'); -- 'xy' --> in V14, double quoted compound words are transformed as expected

select to_tsvector('test','abc def'), plainto_tsquery('test','abc def'); --'xy', expected behavior in other functions
select to_tsvector('test','supernovae stars'), plainto_tsquery('test','supernovae stars'); --'sn', expected behavior in other functions
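
To rule out a problem with the mapping itself, ts_debug can show which dictionary handles each token. This is a diagnostic sketch only, again assuming the 'test' configuration above. Note that ts_debug exercises the document-parsing side, not websearch_to_tsquery.

```sql
-- Shows, per token, the dictionaries consulted and the one that
-- recognized the token, plus the lexemes it produced.
select alias, token, dictionaries, dictionary, lexemes
from ts_debug('test', 'supernovae stars');

-- Server version, for comparing behavior across releases.
select version();
```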


Let me know if there is anything else I can provide!

Thank you for taking the time to look at this issue; it is much appreciated.

JG
