Обсуждение: Odd explain estimate

Поиск
Список
Период
Сортировка

Odd explain estimate

От
"Jim C. Nasby"
Дата:
Why is pgsql estimating a cost of 100000000 for retire_today in this
query? I analyzed it, and there's nothing very odd about it, other than
it's a temp table.

BTW, I had to set enable_seqscan=false to get this, otherwise it wants
to seqscan ogr_results, which is rather painful since it occupies 350k
pages.

ogr=# explain analyze select distinct stub_id, nodecount, id from (select distinct stub_id, nodecount, o.id, r.stats_id
fromretire_today r, ogr_results o where o.id=r.id) o where exists (select * from ogr_results o2 where
o2.stub_id=o.stub_idand o2.nodecount=o.nodecount and o2.id=o.stats_id); 
                                                                                   QUERY PLAN
                                                        

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=101349702.99..101350940.01 rows=12370 width=24) (actual time=422568.80..422568.82 rows=1 loops=1)
   ->  Sort  (cost=101349702.99..101350012.25 rows=123702 width=24) (actual time=422568.79..422568.80 rows=1 loops=1)
         Sort Key: stub_id, nodecount, id
         ->  Subquery Scan o  (cost=101323777.30..101339240.00 rows=123702 width=24) (actual time=388142.51..422568.59
rows=1loops=1) 
               Filter: (subplan)
               ->  Unique  (cost=101323777.30..101339240.00 rows=123702 width=24) (actual time=12456.49..13570.23
rows=56546loops=1) 
                     ->  Sort  (cost=101323777.30..101326869.84 rows=1237016 width=24) (actual time=12456.47..12758.86
rows=56546loops=1) 
                           Sort Key: o.stub_id, o.nodecount, o.id, r.stats_id
                           ->  Nested Loop  (cost=100000000.00..101198600.98 rows=1237016 width=24) (actual
time=93.57..11747.10rows=56546 loops=1) 
                                 ->  Seq Scan on retire_today r  (cost=100000000.00..100000001.93 rows=93 width=8)
(actualtime=0.03..1.78 rows=93 loops=1) 
                                 ->  Index Scan using ogr_results__id on ogr_results o  (cost=0.00..12721.90 rows=13301
width=16)(actual time=18.03..118.43 rows=608 loops=93) 
                                       Index Cond: (o.id = "outer".id)
               SubPlan
                 ->  Index Scan using results_id_count on ogr_results o2  (cost=0.00..3.03 rows=1 width=24) (actual
time=7.21..7.21rows=0 loops=56546) 
                       Index Cond: ((stub_id = $0) AND (nodecount = $1))
                       Filter: (id = $2)
 Total runtime: 422591.48 msec
(17 rows)

        Table "pg_temp_2.retire_today"
  Column  |         Type          | Modifiers
----------+-----------------------+-----------
 email    | character varying(64) | not null
 id       | integer               | not null
 stats_id | integer               | not null

ogr=# select * from pg_stats where tablename='retire_today';
 schemaname |  tablename   | attname  | null_frac | avg_width | n_distinct |           most_common_vals           |
            most_common_freqs                  |
                                  histogram_bounds
                                    | correlation  

------------+--------------+----------+-----------+-----------+------------+--------------------------------------+-----------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------
 pg_temp_1  | retire_today | email    |         0 |        23 |         -1 |                                      |
                                               |
{75521.3541@compuserve.com,arlehma@01019freenet.de,dallas.morlock@worldnet.att.net,hugo@mcs.net,jmk_yoko@yahoo.com.au,lenchik@severodvinsk.ru,nilrem@o2.pl,rc55@pochta.ru,seph429@earthlink.net,vitte@frontbone.de,zotxxx@xs4all.nl}
|   0.894781 
 pg_temp_1  | retire_today | id       |         0 |         4 |         -1 |                                      |
                                               |
{17193,153860,220570,315863,351077,382582,405976,413303,418589,423335,424575}
                                                                                                           |
0.17536
 pg_temp_1  | retire_today | stats_id |         0 |         4 |  -0.946237 | {142167,391154,402835,422577,423809} |
{0.0215054,0.0215054,0.0215054,0.0215054,0.0215054}|
{136669,373730,415341,421924,423416,423553,423959,424089,424354,424609,424976}
                                                                                                           |
-0.132419
 pg_temp_2  | retire_today | email    |         0 |        23 |         -1 |                                      |
                                               |
{75521.3541@compuserve.com,arlehma@01019freenet.de,dallas.morlock@worldnet.att.net,hugo@mcs.net,jmk_yoko@yahoo.com.au,lenchik@severodvinsk.ru,nilrem@o2.pl,rc55@pochta.ru,seph429@earthlink.net,vitte@frontbone.de,zotxxx@xs4all.nl}
|   0.894781 
 pg_temp_2  | retire_today | id       |         0 |         4 |         -1 |                                      |
                                               |
{17193,153860,220570,315863,351077,382582,405976,413303,418589,423335,424575}
                                                                                                           |
0.17536
 pg_temp_2  | retire_today | stats_id |         0 |         4 |  -0.946237 | {142167,391154,402835,422577,423809} |
{0.0215054,0.0215054,0.0215054,0.0215054,0.0215054}|
{136669,373730,415341,421924,423416,423553,423959,424089,424354,424609,424976}
                                                                                                           |
-0.132419
(6 rows)

ogr=# select * from pg_class where relname='retire_today';
   relname    | relnamespace |  reltype  | relowner | relam | relfilenode | relpages | reltuples | reltoastrelid |
reltoastidxid| relhasindex | relisshared | relkind | relnatts | relchecks | reltriggers | relukeys | relfkeys | relrefs
|relhasoids | relhaspkey | relhasrules | relhassubclass | relacl  

--------------+--------------+-----------+----------+-------+-------------+----------+-----------+---------------+---------------+-------------+-------------+---------+----------+-----------+-------------+----------+----------+---------+------------+------------+-------------+----------------+--------
 retire_today |        16765 | 636609103 |      101 |     0 |   636609102 |        1 |        93 |             0 |
      0 | f           | f           | r       |        3 |         0 |           0 |        0 |        0 |       0 | f
       | f          | f           | f              |  
 retire_today |    411964549 | 636609142 |      110 |     0 |   636609141 |        1 |        93 |             0 |
      0 | f           | f           | r       |        3 |         0 |           0 |        0 |        0 |       0 | f
       | f          | f           | f              |  
 retire_today |    478929703 | 632973603 |      101 |     0 |   632973602 |        0 |         0 |             0 |
      0 | f           | f           | r       |        3 |         0 |           0 |        0 |        0 |       0 | f
       | f          | f           | f              |  
(3 rows)
--
Jim C. Nasby, Database Consultant                  jim@nasby.net
Member: Triangle Fraternity, Sports Car Club of America
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

Re: Odd explain estimate

От
Andrew Sullivan
Дата:
On Thu, Jul 31, 2003 at 02:51:45PM -0500, Jim C. Nasby wrote:
> Why is pgsql estimating a cost of 100000000 for retire_today in this
> query? I analyzed it, and there's nothing very odd about it, other than
> it's a temp table.
>
> BTW, I had to set enable_seqscan=false to get this, otherwise it wants

That's why.  When you do that, it just automatically adds 100000000
to the cost of a seqscan.  It can't really disable it, because there
might be no other way to pull the result.

If you really needed to set enable_seqscan=false (did you really?
Are you sure that's not the cheapest way?), you might want to
investigate expainding the statistics on the indexed column,
increasing the correlation through clustering, and other such tricks.

A

--
----
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110


Re: Odd explain estimate

От
"Jim C. Nasby"
Дата:
On Thu, Jul 31, 2003 at 04:59:21PM -0400, Andrew Sullivan wrote:
> On Thu, Jul 31, 2003 at 02:51:45PM -0500, Jim C. Nasby wrote:
> If you really needed to set enable_seqscan=false (did you really?
> Are you sure that's not the cheapest way?), you might want to
> investigate expainding the statistics on the indexed column,
> increasing the correlation through clustering, and other such tricks.

Well, if I don't do this it wants to seqscan a table that occupies 350k
pages, instead of pulling a couple thousand rows. I started running it
with the seqscan and it's already taken way longer than it does if I
disable seqscan.

I guess I'll try expanding the statistics.
--
Jim C. Nasby, Database Consultant                  jim@nasby.net
Member: Triangle Fraternity, Sports Car Club of America
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

Re: Odd explain estimate

От
Andrew Sullivan
Дата:
On Thu, Jul 31, 2003 at 05:59:59PM -0500, Jim C. Nasby wrote:
>
> Well, if I don't do this it wants to seqscan a table that occupies 350k
> pages, instead of pulling a couple thousand rows. I started running it
> with the seqscan and it's already taken way longer than it does if I
> disable seqscan.

That was indeed the question.

If it uses a seqscan when it ought not to do, then there's something
wrong with the statistics, or you haven't vacuum analysed correctly,
or your table needs vacuum full (is it really 350k pages, or is that
mostly dead space?), &c. -- all the usual bad-seqscan candidates.

enable_seqscan=off is probably not a good strategy for any moderately
complicated query.  If the planner were perfect, of course, you'd
never need it at all.

A

--
----
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110


Re: Odd explain estimate

От
"Jim C. Nasby"
Дата:
On Fri, Aug 01, 2003 at 08:16:12AM -0400, Andrew Sullivan wrote:
> On Thu, Jul 31, 2003 at 05:59:59PM -0500, Jim C. Nasby wrote:
> >
> > Well, if I don't do this it wants to seqscan a table that occupies 350k
> > pages, instead of pulling a couple thousand rows. I started running it
> > with the seqscan and it's already taken way longer than it does if I
> > disable seqscan.
>
> That was indeed the question.
>
> If it uses a seqscan when it ought not to do, then there's something
> wrong with the statistics, or you haven't vacuum analysed correctly,
> or your table needs vacuum full (is it really 350k pages, or is that
> mostly dead space?), &c. -- all the usual bad-seqscan candidates.
>
> enable_seqscan=off is probably not a good strategy for any moderately
> complicated query.  If the planner were perfect, of course, you'd
> never need it at all.

Set statistics on the ID colum to 1000, vacuum analyze, and it's good to
go now. Thanks for your help!
--
Jim C. Nasby, Database Consultant                  jim@nasby.net
Member: Triangle Fraternity, Sports Car Club of America
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"