RE: Why is there a Sort after an Index Only Scan?

Поиск
Список
Период
Сортировка
От André Hänsel
Тема RE: Why is there a Sort after an Index Only Scan?
Дата
Msg-id 015101d8601d$37df5190$a79df4b0$@webkr.de
обсуждение исходный текст
Ответ на Re: Why is there a Sort after an Index Only Scan?  (David Rowley <dgrowleyml@gmail.com>)
Ответы Re: Why is there a Sort after an Index Only Scan?  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-performance
> They are sorted by order_id only within sets of the same shipping_date, which is not good enough.

Ah yes, that totally makes sense for the general case.

> so maybe you can just tweak that application to test if the start and end dates are the same and use equality when
theyare. 

I definitely can.

But now I have a followup question, which probably should have been a separate question all along. I have modified the
examplea bit to have a more natural date distribution and I got rid of the weird shipping_date condition and actually
madeit different dates, so the index order is out of the picture. I also added some statistics so Postgres knows about
therelationship between the columns. 

https://dbfiddle.uk/?rdbms=postgres_14&fiddle=54c7774432e896e3c0e89d8084c4b194

After inserting more rows, Postgres still chooses a scan on the primary key instead of using the index.

Limit  (cost=0.43..296.63 rows=50 width=4) (actual time=1052.692..1052.737 rows=50 loops=1)
  ->  Index Scan using orders_test_pkey on orders_test  (cost=0.43..71149.43 rows=12010 width=4) (actual
time=1052.690..1052.728rows=50 loops=1) 
        Filter: ((shipping_date >= '2022-04-30'::date) AND (shipping_date <= '2022-05-01'::date))
        Rows Removed by Filter: 1998734

By setting the CPU costs to 0 (last block in the fiddle) I can force the use of the previous plan and as I already
suspectedit is much better: 

Limit  (cost=101.00..101.00 rows=50 width=4) (actual time=4.835..4.843 rows=50 loops=1)
  ->  Sort  (cost=101.00..101.00 rows=12010 width=4) (actual time=4.833..4.837 rows=50 loops=1)
        Sort Key: order_id
        Sort Method: top-N heapsort  Memory: 27kB
        ->  Index Scan using orders_test_shipping_date_idx on orders_test  (cost=0.00..101.00 rows=12010 width=4)
(actualtime=0.026..3.339 rows=11266 loops=1) 
              Index Cond: ((shipping_date >= '2022-04-30'::date) AND (shipping_date <= '2022-05-01'::date))

Is it overestimating the cost of the sorting?





В списке pgsql-performance по дате отправления:

Предыдущее
От: David Rowley
Дата:
Сообщение: Re: Why is there a Sort after an Index Only Scan?
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Why is there a Sort after an Index Only Scan?