Re: Bad query plan with high-cardinality column

Поиск
Список
Период
Сортировка
От Alexander Staubo
Тема Re: Bad query plan with high-cardinality column
Дата
Msg-id F8FDDA2A2FCB47E8A13F0411BA37311E@gmail.com
обсуждение исходный текст
Ответ на Re: Bad query plan with high-cardinality column  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: Bad query plan with high-cardinality column  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-performance
On Friday, February 22, 2013 at 21:33 , Tom Lane wrote:
> The reason is that the LIMIT may stop the query before it's scanned all
> of the index. The planner estimates on the assumption that the desired
> rows are roughly uniformly distributed within the created_at index, and
> on that assumption, it looks like this query will stop fairly soon ...
> but evidently, that's wrong. On the other hand, it knows quite well
> that the other plan will require pulling out 5000-some rows and then
> sorting them before it can return anything, so that's not going to be
> exactly instantaneous either.
>
> In this example, I'll bet that conversation_id and created_at are pretty
> strongly correlated, and that most or all of the rows with that specific
> conversation_id are quite far down the created_at ordering, so that the
> search through the index takes a long time to run. OTOH, with another
> conversation_id the same plan might run almost instantaneously.


That's right. So I created a composite index, and not only does this make the plan correct, but the planner now chooses
amuch more efficient plan than the previous index that indexed only on "conversation_id": 

    Limit  (cost=0.00..30.80 rows=13 width=12) (actual time=0.042..0.058 rows=13 loops=1)
                                                    
      Buffers: shared hit=8
                                                    
      ->  Index Scan using index_comments_on_conversation_id_and_created_at on comments  (cost=0.00..14127.83 rows=5964
width=12)(actual time=0.039..0.054 rows=13 loops=1) 
            Index Cond: (conversation_id = 3975979)
                                                    
            Buffers: shared hit=8
                                                    
    Total runtime: 0.094 ms
                                                    


Is this because it can get the value of "created_at" from the index, or is it because it can know that the index is
pre-sorted,or both? 

Very impressed that Postgres can use a multi-column index for this. I just assumed, wrongly, that it couldn't. I will
haveto go review my other tables now and see if they can benefit from multi-column indexes. 

Thanks!


В списке pgsql-performance по дате отправления:

Предыдущее
От: Kevin Grittner
Дата:
Сообщение: Re: Bad query plan with high-cardinality column
Следующее
От: Alexander Staubo
Дата:
Сообщение: Re: Bad query plan with high-cardinality column