Обсуждение: Why order by column not using index with distinct keyword in selectclause?

Поиск
Список
Период
Сортировка

Why order by column not using index with distinct keyword in selectclause?

От
Arup Rakshit
Дата:
I have define a simple B Tree index on column *country* for users table. I don’t understand why the order by column not using the index scan when using *distinct* keyword in the select clause. Can anyone explain what is happening here?

aruprakshit=# \d users;
                                     Table "public.users"
   Column   |         Type          | Collation | Nullable |              Default
------------+-----------------------+-----------+----------+-----------------------------------
 city       | character varying     |           |          |
 last_name  | character varying(50) |           |          |
 country    | character varying(50) |           |          |
 sequence   | integer               |           |          |
 first_name | character varying(50) |           |          |
 state      | character varying(50) |           |          |
 email      | character varying     |           |          |
 id         | smallint              |           | not null | nextval('users_id_seq'::regclass)
Indexes:
    "users_pk" PRIMARY KEY, btree (id)

aruprakshit=# explain analyze select distinct country from users order by country asc;
                                                     QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
 Sort  (cost=283.19..283.85 rows=263 width=11) (actual time=10.525..10.555 rows=263 loops=1)
   Sort Key: country
   Sort Method: quicksort  Memory: 38kB
   ->  HashAggregate  (cost=269.99..272.62 rows=263 width=11) (actual time=8.469..8.521 rows=263 loops=1)
         Group Key: country
         ->  Seq Scan on users  (cost=0.00..244.99 rows=9999 width=11) (actual time=0.022..3.428 rows=9999 loops=1)
 Planning time: 0.358 ms
 Execution time: 10.634 ms
(8 rows)

aruprakshit=# explain analyze select country from users order by country asc;
                                                         QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
 Index Only Scan using country on users  (cost=0.29..886.27 rows=9999 width=11) (actual time=0.083..7.581 rows=9999 loops=1)
   Heap Fetches: 9999
 Planning time: 0.118 ms
 Execution time: 8.332 ms
(4 rows)

aruprakshit=# explain analyze select * from users order by country asc;
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Index Scan using country on users  (cost=0.29..886.27 rows=9999 width=73) (actual time=0.015..8.432 rows=9999 loops=1)
 Planning time: 0.213 ms
 Execution time: 9.086 ms
(3 rows)

aruprakshit=#


Thanks,

Arup Rakshit



Re: Why order by column not using index with distinct keyword in select clause?

От
Tom Lane
Дата:
Arup Rakshit <ar@zeit.io> writes:
> I have define a simple B Tree index on column *country* for users table. I don’t understand why the order by column
notusing the index scan when using *distinct* keyword in the select clause. Can anyone explain what is happening here? 

Doesn't look complicated to me: the cost of the hashagg+sort plan is
283.85, while the cost of the indexscan plans is 886.27, and adding
a Uniq node on top of them would have been even more.

The critical point here seems to be that the hashagg node is able to
get rid of so many rows that the cost of sorting what remains is low.

Also notable is that the IOS isn't giving you any advantage over a
plain indexscan, which apparently is because none of the table's
pages are marked all-visible.  If the table is static then a VACUUM
would help that.

            regards, tom lane


Re: Why order by column not using index with distinct keyword inselect clause?

От
Geoff Winkless
Дата:
On Tue, 11 Sep 2018 at 13:56, Arup Rakshit <ar@zeit.io> wrote:
I have define a simple B Tree index on column *country* for users table. I don’t understand why the order by column not using the index scan when using *distinct* keyword in the select clause. Can anyone explain what is happening here?

Bear in mind that index skip scans aren't yet implemented, which (unless I've missed the point) seems to be what you're expecting to help here.


Geoff