Re: Indexing on JSONB field not working

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: Indexing on JSONB field not working
Дата
Msg-id 21855.1576943356@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: Indexing on JSONB field not working  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Ответы Re: Indexing on JSONB field not working  (Zhihong Zhang <zhihong@gmail.com>)
Список pgsql-bugs
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> On Fri, Dec 20, 2019 at 08:15:39PM -0500, Zhihong Zhang wrote:
>> "Limit  (cost=0.43..313.25 rows=100 width=53) (actual time=0.092..0.236 rows=7 loops=1)"
>> "  ->  Index Scan using assets_floatvalue_idx on assets  (cost=0.43..2714072.57 rows=867607 width=53) (actual
time=0.089..0.230rows=7 loops=1)" 
>> "        Index Cond: (((_doc #> '{floatValue}'::text[]))::double precision < '3'::double precision)"

> Well, this confirms what I suspected before - the optimizer believes the
> seqscan plan is a bit cheaper compared to index scan (107 vs. 313) but
> that mismatches the actual performance. The question is why ...

The planner evidently believes that 867607 rows will match the query
condition, so it expects that the scan will stop (after collecting
100 rows) very quickly.  In reality only 7 rows match, so the scan
has to run to completion.  This is what's bollixing the plan choice.

I suspect that 867607 is just a default estimate, but if ANALYZE has
been run then there should be stats for the index column, so why isn't
it doing better?  When I try a similar case here, I get good estimates:

regression=# select version();
                                                 version
---------------------------------------------------------------------------------------------------------
 PostgreSQL 11.6 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23), 64-bit
(1 row)

regression=# create table public.assets(_doc jsonb);
CREATE TABLE
regression=# CREATE INDEX on assets (((_doc #> '{floatValue}'::text[])::double precision) ASC NULLS LAST);
CREATE INDEX
regression=# insert into assets select ('{"floatValue": ' || x || '}')::jsonb from generate_series(1,10000) x;
INSERT 0 10000
regression=# explain select _doc->>'floatValue' from assets where (_doc #> '{floatValue}'::text[])::double precision <
3.0limit 100; 
                                           QUERY PLAN
------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..7.71 rows=100 width=32)
   ->  Seq Scan on assets  (cost=0.00..293.44 rows=3808 width=32)
         Filter: (((_doc #> '{floatValue}'::text[]))::double precision < '3'::double precision)
(3 rows)

regression=# analyze assets;
ANALYZE
regression=# explain select _doc->>'floatValue' from assets where (_doc #> '{floatValue}'::text[])::double precision <
3.0limit 100; 
                                             QUERY PLAN
----------------------------------------------------------------------------------------------------
 Limit  (cost=0.29..8.33 rows=2 width=32)
   ->  Index Scan using assets_float8_idx on assets  (cost=0.29..8.33 rows=2 width=32)
         Index Cond: (((_doc #> '{floatValue}'::text[]))::double precision < '3'::double precision)
(3 rows)

The "3808" estimate is just a default for '<' with no stats, but
with stats I get a dead-on estimate.

            regards, tom lane



В списке pgsql-bugs по дате отправления:

Предыдущее
От: Jeff Janes
Дата:
Сообщение: Re: Indexing on JSONB field not working
Следующее
От: Bruce Momjian
Дата:
Сообщение: Re: BUG #16151: startup timing problem