Re: Unable to make use of "deep" JSONB index

Поиск
Список
Период
Сортировка
От Shaheed Haque
Тема Re: Unable to make use of "deep" JSONB index
Дата
Msg-id CAHAc2jfokgv5GZ-eLrAJ8G52gLqSRy_VR-s0R90uJxui-YSQsQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Unable to make use of "deep" JSONB index  (Shaheed Haque <shaheedhaque@gmail.com>)
Ответы Re: Unable to make use of "deep" JSONB index  (Erik Rijkers <er@xs4all.nl>)
Re: Unable to make use of "deep" JSONB index  (Jeff Janes <jeff.janes@gmail.com>)
Список pgsql-bugs
OK, I have corrected and simplified the test case (including switching
to a btree index). The WHERE clause and the inex now look like this:

    ...WHERE         ((snapshot -> 'employee' -> '999' ->>
'pay_graph')::integer != 0);
    ...USING btree (((snapshot -> 'employee' -> '$.*'  ->>
'pay_graph')::integer != 0));

But the index is still not being used (test case below). I have
confirmed that the equality operator is listed for "search" (I assume
inequality is the same as equality, but I tried both):

# \dAo btree jsonb*
            List of operators of operator families
 AM   | Operator family |    Operator     | Strategy | Purpose
-------+-----------------+-----------------+----------+---------
btree | jsonb_ops       | <(jsonb,jsonb)  |        1 | search
btree | jsonb_ops       | <=(jsonb,jsonb) |        2 | search
btree | jsonb_ops       | =(jsonb,jsonb)  |        3 | search
btree | jsonb_ops       | >=(jsonb,jsonb) |        4 | search
btree | jsonb_ops       | >(jsonb,jsonb)  |        5 | search
(5 rows)

If this is not a bug, then how should the query or the index be
changed to make this work?


=== begin test case ===

CREATE TABLE payrun (
    id serial primary key,
    snapshot JSONB
);

INSERT INTO payrun(snapshot)
VALUES
  ('{"employee": {"999": {"id": 999, "state": {"employment":
[["1920-01-01", null, 3]]}, "pay_graph": 0, "last_run_of_employment":
false}, "111": {"id": 111, "state": {"employment": [["1920-01-01",
null, 5]]}, "pay_graph": 3, "last_run_of_employment": true}}}'),
  ('{"employee": {"999": {"id": 999, "state": {"employment":
[["1970-01-01", null, 3]]}, "pay_graph": 6, "last_run_of_employment":
true}, "222": {"id": 222, "state": {"employment": [["1920-01-01",
null, 5]]}, "pay_graph": 5, "last_run_of_employment": true}}}'),
  ('{"employee": {"998": {"id": 998, "state": {"employment":
[["1980-01-01", null, 3]]}, "pay_graph": 7, "last_run_of_employment":
false}, "333": {"id": 333, "state": {"employment": [["1920-01-01",
null, 5]]}, "pay_graph": 3, "last_run_of_employment": true}}}')
;

SELECT id,snapshot #>'{employee,999,state,employment}' FROM "payrun"
WHERE ((snapshot -> 'employee' -> '999' ->> 'pay_graph')::integer !=
0);

--
-- Create index designed to match the query.
--
CREATE INDEX idx1 ON payrun USING btree (((snapshot -> 'employee' ->
'$.*' ->> 'pay_graph')::integer != 0));

set enable_seqscan = OFF;

-- 
-- EXPLAIN ANALYZE ...query above...
--
explain analyze SELECT id,snapshot #>'{employee,999,state,employment}'
FROM "payrun" WHERE ((snapshot -> 'employee' -> '999' ->>
'pay_graph')::integer != 0);
                                                     QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------
Seq Scan on payrun  (cost=10000000000.00..10000000001.08 rows=2
width=36) (actual time=70.051..70.052 rows=1 loops=1)
  Filter: (((((snapshot -> 'employee'::text) -> '999'::text) ->>
'pay_graph'::text))::integer <> 0)
  Rows Removed by Filter: 2
Planning Time: 0.147 ms
JIT:
  Functions: 4
  Options: Inlining true, Optimization true, Expressions true, Deforming true
  Timing: Generation 0.354 ms, Inlining 8.305 ms, Optimization 49.237
ms, Emission 12.499 ms, Total 70.395 ms
Execution Time: 70.428 ms
(9 rows)

=== end test case ===

Thanks, Shaheed

On Thu, 2 Jun 2022 at 16:51, Shaheed Haque <shaheedhaque@gmail.com> wrote:
>
> On Thu, 2 Jun 2022 at 15:31, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > Shaheed Haque <shaheedhaque@gmail.com> writes:
> > > -- Create index designed to match the query.
> > > --
> > > create index idx1 on payrun using gin ((snapshot->'$.employee.* ?
> > > (@.pay_graph <> 0 || @.last_run_of_employment == true ||
> > > @.state.employment[last][2] == 0)'));
> >
> > But that doesn't match the query; it's not even the same topmost
> > operator:
> >
> > > explain analyse SELECT id,snapshot #>'{employee,999,state,employment}' FROM
> > > "payrun" WHERE (snapshot @? '$.employee."999" ? (@.pay_graph <> 0 ||
> > > @.last_run_of_employment
> > > == true || @.state.employment[last][2] == 0)');
>
> I assume you are referring to the difference between "snapshot @?" and
> "snapshot ->"? If so, apologies: too much cutting and pasting from too
> many experiments. I did in fact also try the "using btree ((snapshot
> @?" form but it gave the same results.
>
> > In general you seem to have much too high an opinion of what PG's
> > index machinery can cope with.  The general pattern is that it can
> > use a query WHERE clause with an index if the clause is of the form
> > "indexed-column indexable-operator constant".  There's a small number
> > of special cases where it can transform things that don't initially
> > look like that into the right form, but AFAIR we don't have any
> > such special cases for any json-related operators.
>
> LOL. I'm pretty much a noob here, so that's very possible.
>
> > The one saving grace is that "indexed-column" can be an expression
> > appearing in an index, so in some cases you can finesse things
> > that way.  But you won't find any deep knowledge of jsonpath
> > expressions in there.
>
> I was basing my efforts on this statement in the docs
> https://www.postgresql.org/docs/14/datatype-json.html#JSON-INDEXING:
>
>     GIN index extracts statements of following form out of jsonpath:
> accessors_chain = const. Accessors chain may consist of .key, [*], and
> [index] accessors. jsonb_ops additionally supports .* and .**
> accessors.
>
> Did I mis-implement, misunderstand or read too much into this?
>
> > Having said that, @? is reported as an indexable operator in v14:
> >
> > regression=# \dAo gin jsonb*
> >              List of operators of operator families
> >  AM  | Operator family |      Operator      | Strategy | Purpose
> > -----+-----------------+--------------------+----------+---------
> >  gin | jsonb_ops       | @>(jsonb,jsonb)    |        7 | search
> >  gin | jsonb_ops       | @?(jsonb,jsonpath) |       15 | search
> >  gin | jsonb_ops       | @@(jsonb,jsonpath) |       16 | search
> >  gin | jsonb_ops       | ?(jsonb,text)      |        9 | search
> >  gin | jsonb_ops       | ?|(jsonb,text[])   |       10 | search
> >  gin | jsonb_ops       | ?&(jsonb,text[])   |       11 | search
> >  gin | jsonb_path_ops  | @>(jsonb,jsonb)    |        7 | search
> >  gin | jsonb_path_ops  | @?(jsonb,jsonpath) |       15 | search
> >  gin | jsonb_path_ops  | @@(jsonb,jsonpath) |       16 | search
> > (9 rows)
> >
> > so it seems like you ought to get some benefit for this query
> > from just a plain GIN index on "snapshot".
>
> Interesting. I'm pretty sure I started there a few days ago without
> any luck but I'll give it another spin (having learnt quite a bit
> since then).
>
> >
> >                         regards, tom lane



В списке pgsql-bugs по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: Re: BUG #17504: psql --single-transaction -vON_ERROR_STOP=1 still commits after client-side error
Следующее
От: Erik Rijkers
Дата:
Сообщение: Re: Unable to make use of "deep" JSONB index