Re: Unable to make use of "deep" JSONB index

Поиск
Список
Период
Сортировка
От Jeff Janes
Тема Re: Unable to make use of "deep" JSONB index
Дата
Msg-id CAMkU=1wWSX_J+2ThKtvZXPqqSdFLHjsVC19DB0zAa2XhtaULGw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Unable to make use of "deep" JSONB index  (Shaheed Haque <shaheedhaque@gmail.com>)
Ответы Re: Unable to make use of "deep" JSONB index  (Shaheed Haque <shaheedhaque@gmail.com>)
Список pgsql-bugs
On Sun, Jun 12, 2022 at 3:31 PM Shaheed Haque <shaheedhaque@gmail.com> wrote:
(Resend, wrong version was sent before)

Thanks Erik. Is the point that the index has to be on the JSON field
as a whole (i.e. "snapshot") rather than deep inside it (e.g.
"snapshot.something.further[down]")?

In my case, the snapshot is several MB in size (perhaps 10MB or even
20MB), dominated by the snapshot.employee (cardinality 10k, each sized
as a dict 1-2kB as text). My expectation/guess is that an index of
"snapshot" will itself be of a size of similar order. However the
design as-is works very well except for this one case where to speed
it up, in principle, the index need contain no more than one boolean
per employee. So that's what I'd like to achieve, if possible.

It sounds like what you really want here is to extract just the list of the ids which meet one of your three further criteria, and index that list.

You can do that with jsonpath, but you have to apply it with a function, not one of the boolean-returning operators.

This almost works to do that:

create index on payrun using gin (jsonb_path_query_array(snapshot, '$.employee.* ? (@.pay_graph <> 0 || @.last_run_of_employment == true || @.state.employment[last][2] == 0).id'));

Then query it like:

select * from payrun where jsonb_path_query_array(snapshot, '$.employee.* ? (@.pay_graph <> 0 || @.last_run_of_employment == true || @.state.employment[last][2] == 0).id') ? '999';

The problem is that ? only tests for top-level text values, while in your example the value for the 'id' keys are ints, not text.  So they are not findable with the ? operator. If I edit your example data to wrap the ids' values in double quotes, turning them into json strings rather than json ints, then this does work for me.

Maybe there is a way to modify the jsonpath so that it converts the ints to text for you.  But if there is such a way, I don't know what it is.
 
If I were doing this for my own system, I would probably create an immutable user-defined-function which took a jsonb and returned an int[] of the filtered id values.  Then you would have to query it with @> rather than ?.


- I've not done things right, in which case I'd love to know my mistake.
- It is not supposed to work, in which case it would be good to have
that stated, and maybe have PG not allow useless indices to be
created.
- It is a bug.

I guess it is number one mixed with number two.  The index you created is useless for your intended purpose, but is not useless for every conceivable purpose.  It is not realistic to expect PostgreSQL to reject things just because it is not obvious (to a computer) what you are getting up to.

Cheers,

Jeff

В списке pgsql-bugs по дате отправления:

Предыдущее
От: "David G. Johnston"
Дата:
Сообщение: Re: Unable to make use of "deep" JSONB index
Следующее
От: Noah Misch
Дата:
Сообщение: Re: Extension pg_trgm, permissions and pg_dump order