Обсуждение: querying both text and non-text properties
Hi, I’ve got an application where I’d like to search a collection of objects based on various properties, some text and othersnon-text (bools, enums, ints, etc). I’ve used full text search before, following the PG docs to set up a index on ats_vector. And of course I’ve used normal indexes before for accelerating basic queries that use non-text columns. Any recommendations on combining the two? For example, if I search for objects where the description text matches [some ts_query]and the color = red, I can imagine putting the color property into the text index somehow - maybe with tokens like:’color_red’, ‘color_blue’, but for something like an integer … ? Rob
Rob Nikander wrote: > I’ve got an application where I’d like to search a collection of objects > based on various properties, some text and others non-text (bools, enums, > ints, etc). I’ve used full text search before, following the PG docs to > set up a index on a ts_vector. And of course I’ve used normal indexes > before for accelerating basic queries that use non-text columns. > > Any recommendations on combining the two? For example, if I search for > objects where the description text matches [some ts_query] and the > color = red, I can imagine putting the color property into the text > index somehow - maybe with tokens like: ’color_red’, ‘color_blue’, > but for something like an integer … ? You have two options: A combined index: CREATE EXTENSION btree_gin; CREATE INDEX ON fulltext USING gin (to_tsvector('english', doc), color); That is the perfect match for a query with WHERE color = 'red' AND to_tsvector('german', doc) @@ to_tsquery('english', 'word'); But you can also create two indexes: CREATE INDEX ON fulltext USING gin (to_tsvector('english', doc)); CREATE INDEX ON fulltext (color); Then you don't need the extension, and PostgreSQL can still use them for the search, either only one of them if the condition is selective enough, or a "BitmapAnd" of both. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
> On Dec 4, 2018, at 4:59 PM, Laurenz Albe <laurenz.albe@cybertec.at> wrote: > > You have two options: > > A combined index: > > CREATE EXTENSION btree_gin; > CREATE INDEX ON fulltext USING gin (to_tsvector('english', doc), color); > > That is the perfect match for a query with > > WHERE color = 'red' AND to_tsvector('german', doc) @@ to_tsquery('english', 'word'); > > But you can also create two indexes: > > CREATE INDEX ON fulltext USING gin (to_tsvector('english', doc)); > CREATE INDEX ON fulltext (color); > > Then you don't need the extension, and PostgreSQL can still use them for the search, > either only one of them if the condition is selective enough, or a "BitmapAnd" of both. Thanks! I will try both these methods and compare the performance. Rob
I would suggest doing testing out btree_gin with a non-insignificant amount of data before going ahead with it.
I did a test case last month, and the size of the generated index was _much_ bigger than the base table.
The case involved a compound key if 1 int column and 1 timestamp range column.
On Wed, 5 Dec 2018 at 16:02, Rob Nikander <rob.nikander@gmail.com> wrote:
> On Dec 4, 2018, at 4:59 PM, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
>
> You have two options:
>
> A combined index:
>
> CREATE EXTENSION btree_gin;
> CREATE INDEX ON fulltext USING gin (to_tsvector('english', doc), color);
>
> That is the perfect match for a query with
>
> WHERE color = 'red' AND to_tsvector('german', doc) @@ to_tsquery('english', 'word');
>
> But you can also create two indexes:
>
> CREATE INDEX ON fulltext USING gin (to_tsvector('english', doc));
> CREATE INDEX ON fulltext (color);
>
> Then you don't need the extension, and PostgreSQL can still use them for the search,
> either only one of them if the condition is selective enough, or a "BitmapAnd" of both.
Thanks! I will try both these methods and compare the performance.
Rob