Why is PostgreSQL 9.1 not using index for simple equality select

Поиск
Список
Период
Сортировка
От Yang Zhang
Тема Why is PostgreSQL 9.1 not using index for simple equality select
Дата
Msg-id CAKxBDU8u8sOWy-hSoM7YCR-AhzbN+cYFpS0oc-s0yJbXg2uG_Q@mail.gmail.com
обсуждение исходный текст
Ответы Re: Why is PostgreSQL 9.1 not using index for simple equality select
Re: Why is PostgreSQL 9.1 not using index for simple equality select
Список pgsql-general
Any hints with this question I had posted to SO?

http://stackoverflow.com/questions/15965785/why-is-postgresql-9-1-not-using-index-for-simple-equality-select

Pasted here as well.  Thanks.

My table `lead` has an index:

    \d lead
    ...
    Indexes:
        "lead_pkey" PRIMARY KEY, btree (id)
        "lead_account__c" btree (account__c)
        ...
        "lead_email" btree (email)
        "lead_id_prefix" btree (id text_pattern_ops)

Why doesn't PG (9.1) use the index for this straightforward equality
selection?  Emails are almost all unique....

    db=> explain select * from lead where email = 'blah';
                             QUERY PLAN
    ------------------------------------------------------------
     Seq Scan on lead  (cost=0.00..319599.38 rows=1 width=5108)
       Filter: (email = 'blah'::text)
    (2 rows)

Other index-hitting queries seem to be OK (though I don't know why
this one doesn't just use the pkey index):

    db=> explain select * from lead where id = '';
                                      QUERY PLAN
    ------------------------------------------------------------------------------
     Index Scan using lead_id_prefix on lead  (cost=0.00..8.57 rows=1
width=5108)
       Index Cond: (id = ''::text)
    (2 rows)

    db=> explain select * from lead where account__c = '';
                                        QUERY PLAN
    ----------------------------------------------------------------------------------
     Index Scan using lead_account__c on lead  (cost=0.00..201.05
rows=49 width=5108)
       Index Cond: (account__c = ''::text)
    (2 rows)

At first I thought it may be due to not enough distinct values of
`email`.  For instance, if the stats claim that `email` is `blah` for
most of the table, then a seq scan is faster.  But that's not the
case:

    db=> select count(*), count(distinct email) from lead;
     count  | count
    --------+--------
     749148 | 733416
    (1 row)

Even if I force seq scans to be off, the planner behaves as if it has
no other choice:

    db=> set enable_seqscan = off;
    SET
    db=> show enable_seqscan;
     enable_seqscan
    ----------------
     off
    (1 row)

    db=> explain select * from lead where email = 'foo@blah.com';
                                QUERY PLAN
    ---------------------------------------------------------------------------
     Seq Scan on lead  (cost=10000000000.00..10000319599.38 rows=1 width=5108)
       Filter: (email = 'foo@blah.com'::text)
    (2 rows)

I searched over a good number of past SO questions but none were about
a simple equality query like this one.


В списке pgsql-general по дате отправления:

Предыдущее
От: John R Pierce
Дата:
Сообщение: Re: How to convert US date format to European date format ?
Следующее
От: John R Pierce
Дата:
Сообщение: Re: Why is PostgreSQL 9.1 not using index for simple equality select