Re: Unable to make use of "deep" JSONB index
От | Erik Rijkers |
---|---|
Тема | Re: Unable to make use of "deep" JSONB index |
Дата | |
Msg-id | 25a9ac9c-e4ab-6914-f51f-b999ef4acf09@xs4all.nl обсуждение исходный текст |
Ответ на | Re: Unable to make use of "deep" JSONB index (Shaheed Haque <shaheedhaque@gmail.com>) |
Список | pgsql-bugs |
Op 12-06-2022 om 21:31 schreef Shaheed Haque: > Thanks Erik. Is the point that the index has to be on the JSON field > as a whole (i.e. "snapshot") rather than deep inside it (e.g. > "snapshot.something.further[down]")? > > In my case, the snapshot is several MB in size (perhaps 10MB or even > 20MB), dominated by the snapshot.employee (cardinality 10k, each sized > as a dict 1-2kB as text). My expectation/guess is that an index of > "snapshot" will itself be of a size of similar order. However the > design as-is works very well except for this one case where to speed > it up, in principle, the index need contain no more than one boolean > per employee. So that's what I'd like to achieve, if possible. > > I've seen no hint in the documentation that creating the index on > "snapshot.something.further[down]" should not work, and PG certainly > allows it to be created. Also, Tom has suggested that I should not > look to some magical ability to infer the use of the index from a > differently structured query, and I've taken that on board with the > updated query + index. If you insist on a btree/integer on pay_graph, the below stuff seems to work, no? Perhaps you've missed the difference between x != 0 on the one hand, and x > 0 on the other, Here are both queries to show that difference. CREATE INDEX payrun_btree_paygr_idx ON payrun using btree (((snapshot->'employee'->'999'->>'pay_graph')::integer)); set enable_seqscan = 0; SELECT id, snapshot #>'{employee,999,state,employment}' FROM payrun WHERE (snapshot->'employee'->'999'->>'pay_graph')::integer > 0 ; id | ?column? ----+--------------------------- 2 | [["1970-01-01", null, 3]] (1 row) Time: 1.384 ms explain analyze SELECT id, snapshot #>'{employee,999,state,employment}' FROM payrun WHERE (snapshot->'employee'->'999'->>'pay_graph')::integer > 0 ; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------- Index Scan using payrun_btree_paygr_idx on payrun (cost=0.14..8.98 rows=42 width=36) (actual time=0.018..0.020 rows=1 loops=1) Index Cond: (((((snapshot -> 'employee'::text) -> '999'::text) ->> 'pay_graph'::text))::integer > 0) Planning Time: 0.047 ms Execution Time: 0.048 ms (4 rows) Time: 0.386 ms explain analyze SELECT id, snapshot #>'{employee,999,state,employment}' FROM payrun WHERE (snapshot->'employee'->'999'->>'pay_graph')::integer != 0 ; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------- Seq Scan on payrun (cost=10000000000.00..10000000004.49 rows=126 width=36) (actual time=112.182..112.186 rows=1 loops=1) Filter: (((((snapshot -> 'employee'::text) -> '999'::text) ->> 'pay_graph'::text))::integer <> 0) Rows Removed by Filter: 3 Planning Time: 0.050 ms JIT: Functions: 4 Options: Inlining true, Optimization true, Expressions true, Deforming true Timing: Generation 0.636 ms, Inlining 41.761 ms, Optimization 52.033 ms, Emission 18.228 ms, Total 112.658 ms Execution Time: 153.486 ms (9 rows) Time: 153.835 ms So: 0.386 ms for x > 0 uses payrun_btree_paygr_idx vs: 153.835 ms for x != 0 uses seq scan It's hard to see for me where you encounter problems; including the fast searches in my earier mail, it looks to me like you have many good options. Erik Rijkers
В списке pgsql-bugs по дате отправления:
Предыдущее
От: Kyotaro HoriguchiДата:
Сообщение: Re: BUG #17504: psql --single-transaction -vON_ERROR_STOP=1 still commits after client-side error
Следующее
От: Onur TirtirДата:
Сообщение: DELETE deletes more than one rows when LIMIT is used in the USING clause