Re: EXPLAIN ANALYZE of BRIN bitmap index scan with disjunction

Поиск
Список
Период
Сортировка
От Simon Riggs
Тема Re: EXPLAIN ANALYZE of BRIN bitmap index scan with disjunction
Дата
Msg-id CANP8+jKeswEqRUOdRJAhubL=FFxVRq9zmhwt3gPg2_7EFAP3xw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: EXPLAIN ANALYZE of BRIN bitmap index scan with disjunction  (Justin Pryzby <pryzby@telsasoft.com>)
Ответы Re: EXPLAIN ANALYZE of BRIN bitmap index scan with disjunction
Список pgsql-performance
On Thu, 20 Jun 2019 at 17:30, Justin Pryzby <pryzby@telsasoft.com> wrote:
On Thu, Jun 20, 2019 at 05:18:33PM +0100, Simon Riggs wrote:
> On Thu, 20 Jun 2019 at 17:01, Chris Wilson <chris.wilson@cantabcapital.com>
> wrote:
>
>
> > I deliberately included r in the index, to demonstrate the issue that I’m
> > seeing. I know that there is very little locality in this particular,
> > dummy, arbitrary test case. I can try to produce a test case that has some
> > locality, but I expect it to show exactly the same results, i.e. that the
> > BRIN index performs much worse when we try to query on this column as well.
> >
>
> I'm suggesting that adding the second column to the index is the source of
> your problem, not adding the column to the query.

But it *is* odd that the index returns more rows with a strictly tighter
conditions, right ?

Oh, very. I was seeing this as an optimization issue rather than a bug report.
 
Note, it's not an issue of rowcount estimate being confused by redundant
conditions, but real rowcount, and it returns more rows even when the
conditions are duplicative.  Compare:

postgres=# explain analyze select * from brin_test where id >= 90000 and r in (1);
...
   ->  Bitmap Index Scan on brin_test_id_r_idx  (cost=0.00..12.03 rows=28125 width=0) (actual time=0.136..0.137 rows=37120 loops=1)
         Index Cond: ((id >= 90000) AND (r = 1))

postgres=# explain analyze select * from brin_test where id >= 90000 and r in (1,1);
...
   ->  Bitmap Index Scan on brin_test_id_r_idx  (cost=0.00..12.03 rows=28125 width=0) (actual time=0.263..0.263 rows=74240 loops=1)
         Index Cond: ((id >= 90000) AND (r = ANY ('{1,1}'::integer[])))

postgres=# explain analyze select * from brin_test where id >= 90000 and r in (1,1,1);
...
   ->  Bitmap Index Scan on brin_test_id_r_idx  (cost=0.00..12.03 rows=28125 width=0) (actual time=0.387..0.387 rows=111360 loops=1)
         Index Cond: ((id >= 90000) AND (r = ANY ('{1,1,1}'::integer[])))

Note, the docs say:
https://www.postgresql.org/docs/devel/indexes-multicolumn.html
|A multicolumn BRIN index can be used with query conditions that involve any
|subset of the index's columns. Like GIN and unlike B-tree or GiST, index search
|effectiveness is the same regardless of which index column(s) the query
|conditions use. The only reason to have multiple BRIN indexes instead of one
|multicolumn BRIN index on a single table is to have a different pages_per_range
|storage parameter.

The min/max values of each column are held for each block range.

If it scans using the "r" column it will identify more block ranges to scan than if it used the id column and hence would scan more real rows, so that part is understandable.

The only question is why it chooses to scan on "r" and not "id", which needs some investigation.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Solutions for the Enterprise

В списке pgsql-performance по дате отправления:

Предыдущее
От: Justin Pryzby
Дата:
Сообщение: Re: EXPLAIN ANALYZE of BRIN bitmap index scan with disjunction
Следующее
От: Michael Lewis
Дата:
Сообщение: Re: EXPLAIN ANALYZE of BRIN bitmap index scan with disjunction