Обсуждение: Hstore index for full text search
Hi all,
I need help in full text search optimization for hstore type. I added my query to explain.depesz, you can check the query and also i added explain analyze result in this link: https://explain.depesz.com/s/QT1e
table_ord.ops column type is hstore. I couldn't find the effective index that would reduce the run time of the query.
When I tried to add an gin index to the hstore column, I got the following error:
create index on table_ord USING gin (ops);
ERROR: index row size 2728 exceeds maximum 2712 for index " table_ord_ ops_idx"
How can I fix this query ?
Best Regards,
Burhan Akbulut
DBA - Cooksoft
table_ord.ops column type is hstore. I couldn't find the effective index that would reduce the run time of the query.
When I tried to add an gin index to the hstore column, I got the following error:
create index on table_ord USING gin (ops);
ERROR: index row size 2728 exceeds maximum 2712 for index " table_ord_ ops_idx"
How can I fix this query ?
Best Regards,
Burhan Akbulut
DBA - Cooksoft
Hash Cond: (o.courier_id = cc.id)
Filter: (((o.tracker_code)::text ~~* '%1654323%'::text) OR ((table_cus.name)::text ~~* '%1654323%'::text) OR ((au.username)::text ~~ '%1654323%'::text) OR ((o.source)::text ~~* '%1654323%'::text) OR ((o.ops -> 'shop'::text) ~~* '%1654323%'::text) OR ((o.ops -> 'camp_code'::text) ~~* '%1654323%'::text) OR ((city.name)::text ~~* '%1654323%'::text) OR ((co.name)::text ~~* '%1654323%'::text) OR ((o.tr_code)::text ~~* '%1654323%'::text) OR ((o.ops ? 'shipping_company'::text) AND ((o.ops -> 'shipping_company'::text) ~~* '%1654323%'::text)) OR ((cc.name)::text ~~* '%1654323%'::text))
Filter: (((o.tracker_code)::text ~~* '%1654323%'::text) OR ((table_cus.name)::text ~~* '%1654323%'::text) OR ((au.username)::text ~~ '%1654323%'::text) OR ((o.source)::text ~~* '%1654323%'::text) OR ((o.ops -> 'shop'::text) ~~* '%1654323%'::text) OR ((o.ops -> 'camp_code'::text) ~~* '%1654323%'::text) OR ((city.name)::text ~~* '%1654323%'::text) OR ((co.name)::text ~~* '%1654323%'::text) OR ((o.tr_code)::text ~~* '%1654323%'::text) OR ((o.ops ? 'shipping_company'::text) AND ((o.ops -> 'shipping_company'::text) ~~* '%1654323%'::text)) OR ((cc.name)::text ~~* '%1654323%'::text))
All those OR conditions on different tables and fields seems like it will be unlikely that the planner will do anything with the index you are trying to create (for this query).
On the error, I came across discussions on dba.stackexchange.com referencing a limit of about 1/3 of the page size (8192) for every key because of it being a btree underneath. It could be one or more of your keys in ops (like shop, camp_code, and shipping_company) is much longer than those examples shown in the query.
On the error, I came across discussions on dba.stackexchange.com referencing a limit of about 1/3 of the page size (8192) for every key because of it being a btree underneath. It could be one or more of your keys in ops (like shop, camp_code, and shipping_company) is much longer than those examples shown in the query.
Michael Lewis <mlewis@entrata.com> writes: > Hash Cond: (o.courier_id = cc.id) > Filter: (((o.tracker_code)::text ~~* '%1654323%'::text) OR > ((table_cus.name)::text > ~~* '%1654323%'::text) OR ((au.username)::text ~~ '%1654323%'::text) OR > ((o.source)::text ~~* '%1654323%'::text) OR ((o.ops -> 'shop'::text) ~~* > '%1654323%'::text) OR ((o.ops -> 'camp_code'::text) ~~* '%1654323%'::text) > OR ((city.name)::text ~~* '%1654323%'::text) OR ((co.name)::text ~~* > '%1654323%'::text) OR ((o.tr_code)::text ~~* '%1654323%'::text) OR ((o.ops > ? 'shipping_company'::text) AND ((o.ops -> 'shipping_company'::text) ~~* > '%1654323%'::text)) OR ((cc.name)::text ~~* '%1654323%'::text)) > All those OR conditions on different tables and fields seems like it will > be unlikely that the planner will do anything with the index you are trying > to create (for this query). A GIN index on an hstore column only provides the ability to search for exact matches to hstore key strings. There are a few bells and whistles, like the ability to AND or OR such conditions. But basically it's just an exact-match engine, and it doesn't index the hstore's data values at all (which is why the implementors weren't too concerned about having a length limit on the index entries). There is 0 chance of this index type being useful for what the OP wants to do. Given these examples, I'd think about setting up a collection of pg_trgm indexes on the specific hstore keys you care about, ie something like CREATE INDEX ON mytable USING GIST ((ops -> 'camp_code') gist_trgm_ops); CREATE INDEX ON mytable USING GIST ((ops -> 'shipping_company') gist_trgm_ops); ... which'd allow indexing queries like ... WHERE (ops -> 'camp_code') LIKE '%1654323%' OR (ops -> 'shipping_company') LIKE '%1654323%' I'm not sure how far this will get you, though; if there's a whole lot of different keys of interest, maintaining a separate index for each one is probably too much overhead. Another point is that you will only get an indexscan if *every* OR'd clause matches some index. The example query looks sufficiently unstructured that that might be hard to ensure. I kind of wonder whether this data design is actually a good idea. It doesn't seem to match your querying style terribly well. regards, tom lane
On Tue, Aug 11, 2020 at 4:46 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
A GIN index on an hstore column only provides the ability to search for
exact matches to hstore key strings. There are a few bells and whistles,
like the ability to AND or OR such conditions. But basically it's just an
exact-match engine, and it doesn't index the hstore's data values at all
(which is why the implementors weren't too concerned about having a length
limit on the index entries). There is 0 chance of this index type being
useful for what the OP wants to do.
Thanks for sharing. More like json path ops and not the full key and value. Interesting.
Another point is that you will only
get an indexscan if *every* OR'd clause matches some index. The example
query looks sufficiently unstructured that that might be hard to ensure.
Does this still apply when the where clauses are on several tables and not just one?
Michael Lewis <mlewis@entrata.com> writes: > On Tue, Aug 11, 2020 at 4:46 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Another point is that you will only >> get an indexscan if *every* OR'd clause matches some index. The example >> query looks sufficiently unstructured that that might be hard to ensure. > Does this still apply when the where clauses are on several tables and not > just one? Yeah. In that case there's no hope of an indexscan at all, since for all the planner knows, the query might match some table rows that don't meet any of the conditions mentioned for that table's columns. If you can write WHERE (condition-on-t1.a OR condition-on-t1.b OR ...) AND (condition-on-t2.x OR condition-on-t2.y OR ...) then you've got a chance of making the OR'd conditions into index qualifications on t1 or t2 respectively. But if it's WHERE condition-on-t1.a OR condition-on-t1.b OR ... OR condition-on-t2.x OR condition-on-t2.y OR ... then you're in for full-table scans. (This is another thing that was bothering me about the data design, though I failed to think it through clearly enough to state before.) regards, tom lane