Re: Patch: Add tsmatch JSONPath operator for granular Full Text Search

Поиск
Список
Период
Сортировка
От Florents Tselai
Тема Re: Patch: Add tsmatch JSONPath operator for granular Full Text Search
Дата
Msg-id CA+v5N41Rr-18cjSsc3a6YvGTmnaXmfNiQM3e_gvdw76Yj7y6zA@mail.gmail.com
обсуждение исходный текст
Ответ на Patch: Add tsmatch JSONPath operator for granular Full Text Search  (Florents Tselai <florents.tselai@gmail.com>)
Список pgsql-hackers



On Mon, Jan 26, 2026 at 7:22 PM Florents Tselai <florents.tselai@gmail.com> wrote:
Hi,

in real-life I work a lot with json & fts search, here's a feature I've always wished I had,
but never tackle it. Until yesterday that is.

SELECT jsonb_path_query(doc, '$.comments[*] ? (@.user == "Alice" && @.body tsmatch "performance")');

This patch introduces a tsmatch boolean operator to the JSONPath engine.
By integrating FTS natively into path expressions,
this operator allows for high-precision filtering of nested JSONB structures—
solving issues with structural ambiguity and query complexity.

Currently, users must choose between two suboptimal paths for FTS-ing nested JSON:
- Imprecise Global Indexing
jsonb_to_tsvector aggregates text into a flat vector.
This ignores JSON boundaries, leading to false positives when the same key (e.g., "body")
appears in different contexts (e.g., a "Product Description" vs. a "Customer Review").

- Complex SQL Workarounds
Achieving 100% precision requires unnesting the document via jsonb_array_elements and LATERAL joins.
This leads to verbose SQL and high memory overhead from generating intermediate heap tuples.

One of the most significant advantages of tsmatch is its ability to participate in multi-condition predicates
within the same JSON object - something jsonb_to_tsvector cannot do.

SELECT jsonb_path_query(doc, '$.comments[*] ? (@.user == "Alice" && @.body tsmatch "performance")');

In a flat vector, the association between "Alice" and "performance" is lost.
tsmatch preserves this link by evaluating the FTS predicate in-place during path traversal.

While the SQL/JSON standard (ISO/IEC 9075-2) does not explicitly define an FTS operator,
tsmatch is architecturally modeled after the standard-defined like_regex.

The implementation follows the like_regex precedent:
it is a non-indexable predicate that relies on GIN path-matching for pruning and heap re-checks for precision.
Caching is scoped to the JsonPathExecContext,
ensuring 'compile-once' efficiency per execution without violating the stability requirements of prepared statements.

This initial implementation uses plainto_tsquery.
However, the grammar is designed to support a "mode" flag (similar to like_regex flags)
in future iterations to toggle between to_tsquery, websearch_to_tsquery, and phraseto_tsquery.

Here's a v2, that implements the tsqparser clause 

So this should now work too 

select jsonb_path_query_array('["fast car", "slow car", "fast and furious"]', '$[*] ? (@ tsmatch "fast car" tsqparser "w") 
Вложения

В списке pgsql-hackers по дате отправления: