Re: Bad query performance with more conditions?

Поиск
Список
Период
Сортировка
От Alban Hertroys
Тема Re: Bad query performance with more conditions?
Дата
Msg-id 71317720-55C7-4BB4-AA74-90EEB91086FF@solfertje.student.utwente.nl
обсуждение исходный текст
Ответ на Fwd: Bad query performance with more conditions?  ("kaifeng.zhu" <cafeeee@gmail.com>)
Список pgsql-general
On 24 Jun 2010, at 11:13, kaifeng.zhu wrote:

> <resend to mailing list>
>
> On Thu, Jun 24, 2010 at 16:57, Thom Brown <thombrown@gmail.com> wrote:
>> Sounds like the planner took a wrong turn in the 2nd case.  Which
>> version of PostgreSQL are you running?
>
> PostgreSQL version 8.1.21 (With schemas)
>
>
> The explain commands show that:

For a next time, explain analyse would have been more useful.

>
> db1=> explain select * from emails where email_msg_id =
> '4dba381b-f55e-02d3-4b4a-95e2e98178e2' limit 1;
>                                                      QUERY PLAN
> -------------------------------------------------------------------------------------------------------------------
> Limit  (cost=0.00..4.00 rows=1 width=184)
>   ->  Index Scan using idx_emails_email_msg_id on emails
> (cost=0.00..370016.42 rows=92413 width=184)

I think your problem is here. I assume that msg-id's are fairly unique between messages, but the planner expects there
areover 90,000 records matching this ID. Hence the planner thinks this index has a low selectivity. 

>         Index Cond: (email_msg_id =
> '4dba381b-f55e-02d3-4b4a-95e2e98178e2'::bpchar)
> (3 rows)
>
> db1=> explain select * from emails where email_msg_id =
> '4dba381b-f55e-02d3-4b4a-95e2e98178e2' and email_sender =
> 'sender@domain.com' limit 1;
>
>    QUERY PLAN
>
---------------------------------------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=0.00..801.06 rows=1 width=184)
>   ->  Index Scan using idx_emails_email_sender on emails
> (cost=0.00..370089.46 rows=462 width=184)

For this index the planner only expects 462 rows, so it has a higher selectivity than the index on msg_id and therefore
theplanner prefers this index over the other one. 

>         Index Cond: (email_sender = 'sender@domain.com'::text)
>         Filter: (email_msg_id = '4dba381b-f55e-02d3-4b4a-95e2e98178e2'::bpchar)

Apparently there are quite a few records matching that sender, and the database has to check each of them whether they
havethe requested msg_id or not. Apparently the planner decided that a seqscan on the results would be the most
efficienthere - and it would indeed be fairly quick if there are only 462 rows with this sender. 

In summary, I think your statistics are off. Do you vacuum frequently enough? Autovacuum helps here, but there have
beenlarge improvements to that in later versions. 

Another approach would be an index on (email_sender, email_msg_id) - that would particularly help the second query and
itshouldn't hurt queries on just email_sender much. 

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.


!DSPAM:737,4c2325a8286216564294622!



В списке pgsql-general по дате отправления:

Предыдущее
От: Michelle Konzack
Дата:
Сообщение: HA for PostgreSQL (Auth-Server)
Следующее
От: tv@fuzzy.cz
Дата:
Сообщение: Re: Bad query performance with more conditions?