Re: Unable to make use of "deep" JSONB index

From: Erik Rijkers
Subject: Re: Unable to make use of "deep" JSONB index
Date:
Msg-id: 25a9ac9c-e4ab-6914-f51f-b999ef4acf09@xs4all.nl
In response to: Re: Unable to make use of "deep" JSONB index  (Shaheed Haque <shaheedhaque@gmail.com>)
List: pgsql-bugs
On 12-06-2022 at 21:31, Shaheed Haque wrote:

> Thanks Erik. Is the point that the index has to be on the JSON field
> as a whole (i.e. "snapshot") rather than deep inside it (e.g.
> "snapshot.something.further[down]")?
> 
> In my case, the snapshot is several MB in size (perhaps 10MB or even
> 20MB), dominated by the snapshot.employee (cardinality 10k, each sized
> as a dict 1-2kB as text). My expectation/guess is that an index of
> "snapshot" will itself be of a size of similar order. However the
> design as-is works very well except for this one case where to speed
> it up, in principle, the index need contain no more than one boolean
> per employee. So that's what I'd like to achieve, if possible.
> 
> I've seen no hint in the documentation that creating the index on
> "snapshot.something.further[down]" should not work, and PG certainly
> allows it to be created. Also, Tom has suggested that I should not
> look to some magical ability to infer the use of the index from a
> differently structured query, and I've taken that on board with the
> updated query + index.

If you insist on a btree index over pay_graph as integer, the below stuff
seems to work, no?

Perhaps you've missed the difference between
   x != 0  on the one hand, and
   x > 0   on the other.

Here are both queries to show that difference.


CREATE INDEX payrun_btree_paygr_idx ON payrun USING btree
  (((snapshot->'employee'->'999'->>'pay_graph')::integer));

set enable_seqscan = 0;

SELECT id, snapshot #>'{employee,999,state,employment}'
FROM payrun
WHERE (snapshot->'employee'->'999'->>'pay_graph')::integer > 0 ;
  id |         ?column?
----+---------------------------
   2 | [["1970-01-01", null, 3]]
(1 row)

Time: 1.384 ms

explain analyze SELECT id, snapshot #>'{employee,999,state,employment}'
FROM payrun WHERE (snapshot->'employee'->'999'->>'pay_graph')::integer > 0 ;
                                                           QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
 Index Scan using payrun_btree_paygr_idx on payrun  (cost=0.14..8.98 rows=42 width=36) (actual time=0.018..0.020 rows=1 loops=1)
   Index Cond: (((((snapshot -> 'employee'::text) -> '999'::text) ->> 'pay_graph'::text))::integer > 0)
 Planning Time: 0.047 ms
 Execution Time: 0.048 ms
(4 rows)

Time: 0.386 ms
explain analyze SELECT id, snapshot #>'{employee,999,state,employment}'
FROM payrun WHERE (snapshot->'employee'->'999'->>'pay_graph')::integer != 0 ;
                                                        QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
 Seq Scan on payrun  (cost=10000000000.00..10000000004.49 rows=126 width=36) (actual time=112.182..112.186 rows=1 loops=1)
   Filter: (((((snapshot -> 'employee'::text) -> '999'::text) ->> 'pay_graph'::text))::integer <> 0)
   Rows Removed by Filter: 3
 Planning Time: 0.050 ms
 JIT:
   Functions: 4
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 0.636 ms, Inlining 41.761 ms, Optimization 52.033 ms, Emission 18.228 ms, Total 112.658 ms
 Execution Time: 153.486 ms
(9 rows)

Time: 153.835 ms

So:
     0.386 ms  for x > 0   uses payrun_btree_paygr_idx
vs:
   153.835 ms  for x != 0  uses seq scan
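
(A btree index has no operator for <>, which is why the second plan falls back to a seq scan. If the inequality form is really wanted, one workaround — a sketch, not tested against this schema — is to split it into two range conditions a btree can serve; the planner can then answer each side with payrun_btree_paygr_idx and combine the results, e.g. via a BitmapOr.)

```sql
-- Sketch: rewrite the inequality as two btree-friendly ranges.
SELECT id, snapshot #>'{employee,999,state,employment}'
FROM payrun
WHERE (snapshot->'employee'->'999'->>'pay_graph')::integer > 0
   OR (snapshot->'employee'->'999'->>'pay_graph')::integer < 0;
```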

It's hard for me to see where you encounter problems; including the fast
searches in my earlier mail, it looks to me like you have many good options.
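
(On the "one boolean per employee" idea quoted above: a partial index is one way to get close to that. A sketch only, using the same expression as before; the index name is made up. Such an index stores only the rows whose pay_graph is nonzero, so a query whose WHERE clause matches the index predicate can use it even though btree has no <> operator.)

```sql
-- Sketch: small partial index covering only the nonzero rows.
CREATE INDEX payrun_paygr_nonzero_idx ON payrun (id)
WHERE (snapshot->'employee'->'999'->>'pay_graph')::integer <> 0;
```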

Erik Rijkers


