Re: [HACKERS] Costs: Index vs Non-Index

Поиск
Список
Период
Сортировка
От The Hermit Hacker
Тема Re: [HACKERS] Costs: Index vs Non-Index
Дата
Msg-id Pine.BSF.4.21.0001081457420.18498-100000@thelab.hub.org
обсуждение исходный текст
Ответ на Re: [HACKERS] Costs: Index vs Non-Index  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
Okay, I had remembered to VACUUM, but I always forget to VACUUM ANALYZE :(
results come out much better now:

udmsearch=> explain select next_index_time from url where next_index_time <= 947317073;
NOTICE:  QUERY PLAN:

Seq Scan on url  (cost=3368.58 rows=12623 width=4)

EXPLAIN
udmsearch=> select (next_index_time) from url where next_index_time <= 947317073;                        
next_index_time
---------------
(0 rows)

udmsearch=> create index url_next_index_time on url using btree ( next_index_time);
CREATE
udmsearch=> explain select next_index_time from url where next_index_time <= 947317073;
NOTICE:  QUERY PLAN:

Index Scan using url_next_index_time on url  (cost=1364.10 rows=12623 width=4)

EXPLAIN



On Sat, 8 Jan 2000, Tom Lane wrote:

> The Hermit Hacker <scrappy@hub.org> writes:
> > Hrmmm...if I'm reading this right, its more costly to create an index then
> > to leave it as a sequential scan, but it returns more rows?  Yet, it
> > returns, if I do the query with a count() around the return value, 288
> > rows, not 334 or 1154...
> 
> This doesn't have anything to do with index vs sequential scan, but it
> does have to do with whether you've done a VACUUM ANALYZE lately.
> You haven't ;-)
> 
> > udmsearch=> explain select next_index_time from url where next_index_time <= 947317073;
> > NOTICE:  QUERY PLAN:
> > Seq Scan on url  (cost=43.00 rows=334 width=4)
> 
> IIRC, rows=334 is the default estimate of result rows you will get for
> this query in the absence of any information whatever.  (Default table
> size guess is 1000 rows, and default selectivity guess for <= is 1/3,
> so...)  If you have not vacuumed, it's sheer coincidence that this is
> even within hailing distance of the correct figure of 288.
> 
> > udmsearch=> create index url_next_index_time on url using btree ( next_index_time);
> > CREATE
> > udmsearch=> explain select next_index_time from url where next_index_time <= 947317073;
> > NOTICE:  QUERY PLAN:
> > Index Scan using url_next_index_time on url  (cost=271.68 rows=1154 width=4)
> 
> I believe that a side-effect of CREATE INDEX is to update the
> number-of-pages-and-rows statistics in pg_class for the target table.
> So after you do that, the optimizer has a correct idea of the table's
> size, but still no more info about the selectivity of the WHERE clause.
> (I infer that your table has size 1154*3 rows.)  If you now drop the
> index and repeat EXPLAIN, it'll go back to a seq scan, but it will now
> say 1154 rows --- and the cost estimate will be higher, too.
> 
> If you do VACUUM ANALYZE, then the optimizer will also know the min and
> max values of next_index_time, and will have some shot at making a
> correct estimate of the output row count.  I'd be interested to know
> what it predicts then...
> 
>             regards, tom lane
> 

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org 
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Ed Loehr
Дата:
Сообщение: Re: [HACKERS] Re: ERROR: out of free buffers: time to abort !
Следующее
От: The Hermit Hacker
Дата:
Сообщение: Re: [HACKERS] Another index "buglet"?