Re: postgres 8.2 seems to prefer Seq Scan

Поиск
Список
Период
Сортировка
От Alex Deucher
Тема Re: postgres 8.2 seems to prefer Seq Scan
Дата
Msg-id a728f9f90704091143t58141a32mb0de1713499da4f8@mail.gmail.com
обсуждение исходный текст
Ответ на Re: postgres 8.2 seems to prefer Seq Scan  (Michael Fuhr <mike@fuhr.org>)
Ответы Re: postgres 8.2 seems to prefer Seq Scan
Список pgsql-performance
On 4/6/07, Michael Fuhr <mike@fuhr.org> wrote:
> On Fri, Apr 06, 2007 at 04:38:33PM -0400, Alex Deucher wrote:
> > One more anomaly between 7.4 and 8.2.  DB dumped from 7.4 and loaded
> > onto 8.2, both have locale set to C.  8.2 seems to prefer Seq Scans
> > for the first query while the ordering in the second query seems to
> > perform worse on 8.2.  I ran analyze.  I've tried with the encoding
> > set to UTF-8 and SQL_ASCII; same numbers and plans.  Any ideas how to
> > improve this?
>
> Are you sure the data sets are identical?  The 7.4 query returned
> 0 rows; the 8.2 query returned 1 row.  If you're running the same
> query against the same data in both versions then at least one of
> them appears to be returning the wrong result.  Exactly which
> versions of 7.4 and 8.2 are you running?
>
> Have you analyzed all tables in both versions?  The row count
> estimate in 7.4 is much closer to reality than in 8.2:
>
> 7.4
> >   ->  Index Scan using pnum_idx on event  (cost=0.00..3.37 rows=19
> > width=172) (actual time=0.063..0.063 rows=0 loops=1)
> >         Index Cond: ((pnum)::text = 'AB5819188'::text)
>
> 8.2
> >   ->  Index Scan using pnum_idx on event  (cost=0.00..3147.63
> > rows=1779 width=171) (actual time=0.030..0.033 rows=1 loops=1)
> >         Index Cond: ((pnum)::text = 'AB5819188'::text)
>
> If analyzing the event table doesn't improve the row count estimate
> then try increasing the statistics target for event.pnum and analyzing
> again.  Example:
>
> ALTER TABLE event ALTER pnum SET STATISTICS 100;
> ANALYZE event;
>
> You can set the statistics target as high as 1000 to get more
> accurate results at the cost of longer ANALYZE times.
>

Setting statistics to 400 seems to be the sweet spot.  Values above
that seem to only marginally improve performance.  However, I have to
disable seqscan in order for the query to be fast.  Why does the query
planner insist on doing a seq scan?  Is there anyway to make it prefer
the index scan?

Thanks,

Alex

postgres 8.2

db=# EXPLAIN ANALYZE select pnum, event_pid, code_name,
code_description, code_mcam, event_date, effective_date, ref_country,
ref_country_legal_code, corresponding_pnum, withdrawal_date,
payment_date, extension_date, fee_payment_year, requester, free_form
from code inner join event on code_pid = code_pid_fk where pnum
='US5819188';
                                                        QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=106.91..210.37 rows=54 width=229) (actual
time=11.245..11.253 rows=1 loops=1)
   Hash Cond: (event.code_pid_fk = code.code_pid)
   ->  Index Scan using pnum_idx on event  (cost=0.00..102.58 rows=54
width=170) (actual time=0.108..0.112 rows=1 loops=1)
         Index Cond: ((pnum)::text = 'US5819188'::text)
   ->  Hash  (cost=70.85..70.85 rows=2885 width=67) (actual
time=11.006..11.006 rows=2885 loops=1)
         ->  Seq Scan on code  (cost=0.00..70.85 rows=2885 width=67)
(actual time=0.025..5.392 rows=2885 loops=1)
 Total runtime: 11.429 ms
(7 rows)

db=# set enable_seqscan=0;
SET
db=# EXPLAIN ANALYZE select pnum, event_pid, code_name,
code_description, code_mcam, event_date, effective_date, ref_country,
ref_country_legal_code, corresponding_pnum, withdrawal_date,
payment_date, extension_date, fee_payment_year, requester, free_form
from code inner join event on code_pid = code_pid_fk where pnum
='US5819188';
                                                        QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..289.72 rows=54 width=229) (actual
time=0.068..0.076 rows=1 loops=1)
   ->  Index Scan using pnum_idx on event  (cost=0.00..102.58 rows=54
width=170) (actual time=0.019..0.020 rows=1 loops=1)
         Index Cond: ((pnum)::text = 'US5819188'::text)
   ->  Index Scan using code_pkey on code  (cost=0.00..3.45 rows=1
width=67) (actual time=0.041..0.043 rows=1 loops=1)
         Index Cond: (code.code_pid = event.code_pid_fk)
 Total runtime: 0.126 ms
(6 rows)

В списке pgsql-performance по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Beginner Question
Следующее
От: "Merlin Moncure"
Дата:
Сообщение: join to view over custom aggregate seems like it should be faster