Обсуждение: create index on a jsonb timestamp field?
I'm playing with DNSTAP (dnstap.info) data and loading it into a database for analysis. when I try to create an index on the query_time field of the json structure I get: ler=# select id,data->'message'->>'query_time' from dns_query limit 2; id | ?column? ----+----------------------------- 2 | 2019-05-13T01:35:59.822984Z 3 | 2019-05-13T01:35:59.829801Z (2 rows) ler=# select id,(data->'message'->>'query_time')::timestamptz from dns_query limit 2; id | timestamptz ----+------------------------------- 2 | 2019-05-12 20:35:59.822984-05 3 | 2019-05-12 20:35:59.829801-05 (2 rows) ler=# ler=# create index dns_query_time_idx on dns_query(((data -> 'message' ->> 'query_time')::text::timestamptz)); ERROR: functions in index expression must be marked IMMUTABLE Is there any easy way to do this? Or, what would the experts recommend here? -- Larry Rosenman http://www.lerctr.org/~ler Phone: +1 214-642-9640 E-Mail: ler@lerctr.org US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106
On 05/18/2019 5:53 pm, Larry Rosenman wrote: > I'm playing with DNSTAP (dnstap.info) data and loading it into a > database for analysis. > > when I try to create an index on the query_time field of the json > structure I get: > > ler=# select id,data->'message'->>'query_time' from dns_query limit 2; > id | ?column? > ----+----------------------------- > 2 | 2019-05-13T01:35:59.822984Z > 3 | 2019-05-13T01:35:59.829801Z > (2 rows) > > ler=# select id,(data->'message'->>'query_time')::timestamptz from > dns_query limit 2; > id | timestamptz > ----+------------------------------- > 2 | 2019-05-12 20:35:59.822984-05 > 3 | 2019-05-12 20:35:59.829801-05 > (2 rows) > > ler=# > > ler=# create index dns_query_time_idx on dns_query(((data -> 'message' > ->> 'query_time')::text::timestamptz)); > ERROR: functions in index expression must be marked IMMUTABLE > > Is there any easy way to do this? Or, what would the experts recommend > here? I found a work-around in making a column for query time, and populating that in an UPDATE/INSERT trigger, and then making an index on that. -- Larry Rosenman http://www.lerctr.org/~ler Phone: +1 214-642-9640 E-Mail: ler@lerctr.org US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106
Larry Rosenman <ler@lerctr.org> writes:
> when I try to create an index on the query_time field of the json
> structure I get:
> ler=# create index dns_query_time_idx on dns_query(((data -> 'message'
> ->> 'query_time')::text::timestamptz));
> ERROR: functions in index expression must be marked IMMUTABLE
Yeah, because the timestamptz input function has dependencies on
both the datestyle and timezone GUCs. Given that your input is
ISO-format with explicit time zone, you don't really care about
either of those things, but the mutability check doesn't know that.
> Is there any easy way to do this? Or, what would the experts recommend
> here?
The sanest way to deal with this IMO is to make a column containing
the extracted timestamp, which you could maintain with a trigger,
and then index that. You could alternatively make a custom function
that you (mis?)label as immutable, but your queries would have to
use that same function in order to get matched to the index, so
I dunno about that being a user-friendly approach.
BTW, I'd had the idea that the GENERATED option in PG v13 would allow
setting up this sort of case without bothering with a handwritten trigger,
but it seems not:
regression=# create table foo(data jsonb, ts timestamptz GENERATED ALWAYS AS ((data->>'ts')::timestamptz) stored);
psql: ERROR: generation expression is not immutable
I wonder if that's really necessary to insist on?
regards, tom lane
On 05/18/2019 8:17 pm, Tom Lane wrote: > Larry Rosenman <ler@lerctr.org> writes: >> when I try to create an index on the query_time field of the json >> structure I get: >> ler=# create index dns_query_time_idx on dns_query(((data -> 'message' >> ->> 'query_time')::text::timestamptz)); >> ERROR: functions in index expression must be marked IMMUTABLE > > Yeah, because the timestamptz input function has dependencies on > both the datestyle and timezone GUCs. Given that your input is > ISO-format with explicit time zone, you don't really care about > either of those things, but the mutability check doesn't know that. > >> Is there any easy way to do this? Or, what would the experts >> recommend >> here? > > The sanest way to deal with this IMO is to make a column containing > the extracted timestamp, which you could maintain with a trigger, > and then index that. You could alternatively make a custom function > that you (mis?)label as immutable, but your queries would have to > use that same function in order to get matched to the index, so > I dunno about that being a user-friendly approach. This is what I wound up figuring out between my original post and yours, and it works great. > > BTW, I'd had the idea that the GENERATED option in PG v13 would allow > setting up this sort of case without bothering with a handwritten > trigger, > but it seems not: > > regression=# create table foo(data jsonb, ts timestamptz GENERATED > ALWAYS AS ((data->>'ts')::timestamptz) stored); > psql: ERROR: generation expression is not immutable > > I wonder if that's really necessary to insist on? Good question. Is that something the project is going to look into? > > regards, tom lane -- Larry Rosenman http://www.lerctr.org/~ler Phone: +1 214-642-9640 E-Mail: ler@lerctr.org US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106
Larry Rosenman <ler@lerctr.org> writes:
> On 05/18/2019 8:17 pm, Tom Lane wrote:
>> BTW, I'd had the idea that the GENERATED option in PG v13 would allow
>> setting up this sort of case without bothering with a handwritten
>> trigger, but it seems not:
>>
>> regression=# create table foo(data jsonb, ts timestamptz GENERATED
>> ALWAYS AS ((data->>'ts')::timestamptz) stored);
>> psql: ERROR: generation expression is not immutable
>>
>> I wonder if that's really necessary to insist on?
> Good question. Is that something the project is going to look into?
Well, I was just asking the question, not opining on whether it
was right or wrong. GENERATED is a SQL-spec feature, and it might
be that the semantics the spec calls for wouldn't work without
the restriction. I've not looked...
regards, tom lane