Re: BRIN index which is much faster never chosen by planner

Поиск
Список
Период
Сортировка
От Jeremy Finzel
Тема Re: BRIN index which is much faster never chosen by planner
Дата
Msg-id CAMa1XUgnHL9760p9ZqSQNG4u=WSbnedgzx453e_QfpXFYDDY3g@mail.gmail.com
обсуждение исходный текст
Ответ на Re: BRIN index which is much faster never chosen by planner  (Michael Lewis <mlewis@entrata.com>)
Список pgsql-hackers
Dear Michael,

On Thu, Oct 10, 2019 at 5:20 PM Michael Lewis <mlewis@entrata.com> wrote:
Since the optimizer is choosing a seq scan over index scan when it seems like it has good row estimates in both cases, to me that may mean costs of scanning index are expected to be high. Is this workload on SSD? Has the random_page_cost config been decreased from default 4 (compared with cost of 1 unit for sequential scan)?

It's 1.5
 
Your buffer hits aren't great. What is shared_buffers set to? How much ram on this cluster?

shared_buffers is 4GB.  It has 500G of RAM, but server has several clusters on it.
 

With this table being insert only, one assumes correlation is very high on the data in this column as shown in pg_stats, but have your confirmed?

Yes, but the issue isn't with the BRIN index performing badly or being fragmented.  It's that it performs great (7x faster than the seq scan) but postgres doesn't pick using it.  I have seen this same issue also in other attempts I have made to use BRIN.
 
To me, distinct ON is often a bad code smell and probably can be re-written to be much more efficient with GROUP BY, lateral & order by, or some other tool. Same with the window function. It is a powerful tool, but sometimes not the right one.

I don't really agree, but it's beside the point because the issue is not in aggregation.  It's pre-aggregation.  Indeed if I run my query as a simple select (as I tried) it's the exact same planning issue.  (In my experience, distinct on for given example is the fastest.  Same with window functions which prevent inefficient self-joins)
 
Is "source" a function that is called on field1? What is it doing/how is it defined?

I can't see how that matters either, but the "source" function is a mask for a built-in pg function that is trivial.  This whole query is masked so as not to expose our actual prod query, but I hope it's still understandable enough :).

My question is not how to make this query faster in general.  It's that I want to use BRIN indexes very much, but I'm not sure I can trust they will scale with the right query plan like I know BTREE will.

Thanks!
Jeremy

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Jeremy Finzel
Дата:
Сообщение: Re: BRIN index which is much faster never chosen by planner
Следующее
От: Jeremy Finzel
Дата:
Сообщение: Re: BRIN index which is much faster never chosen by planner