Different Choices For Index/Sequential Scan With And Without A Join In 7.2

Поиск
Список
Период
Сортировка
От Mark kirkwood
Тема Different Choices For Index/Sequential Scan With And Without A Join In 7.2
Дата
Msg-id 01082620361000.01707@spikey.slithery.org
обсуждение исходный текст
Ответы Re: Different Choices For Index/Sequential Scan With And Without A Join In 7.2
Список pgsql-sql
Dear List,

I have been doing a little investigation on when the optimizer chooses a 
sequential scan over an index access. I have come accross what interesting 
behaviour in the current 7.2 sources ( 2001-08-17):

The consider two types of query on my "usual" tables :

SELECT       f.d0key,      count(f.val)
FROM fact0 f
WHERE f.d0key BETWEEN  270 AND <integer>
GROUP BY f.d0key;

and

SELECT      d0.f1,      count(f.val)
FROM dim0 d0,    fact0 f
WHERE d0.d0key = f.d0key
AND   d0.f1 BETWEEN  '2000-01-26' AND <'date'>
GROUP BY d0.f1;


Note that  'f1' = '2000-01-26'  corrosponds to 'd0key' = 270 in the table 
'dim0';

I wanted to find the values for <integer> and <date> for which the optimizer 
changed from and index acess to a seq scan of the 'fact0' table.

I used cpu_tuple_cost = 0.4, but everything else was fairly standard.

For the first query the value of <integer> ( i.e : 'd0key' ) was 627
For the second the value of <date> (i.e 'f1' ) was '2000-02-05'  ( 
corrosponds to d0key = 279 ) 

It guess I was expecting the value that made the first query change from 
index to seq scan to be "close" to the value that made the second query use a 
sequential scan....as the fact0 access of the second query is essentially the 
first query. However the results are vastly different - have I missed 
something obvious here ?


The script and explain output are listed below.

regards

Mark

<--script
--------------------------------------------------------------------
SET cpu_tuple_cost=0.4;
SHOW cpu_tuple_cost;

--  show what keys are for what dates...
--
SELECT d0.d0key,      d0.f1
FROM dim0 d0
WHERE d0.d0key IN ('270','279','280','626','627')
;


--  show when index scans change to sequential
--  for the fact0 table alone...
--
EXPLAIN
SELECT       f.d0key,      count(f.val) 
FROM fact0 f
WHERE f.d0key BETWEEN  270 AND 626
GROUP BY f.d0key
;


EXPLAIN
SELECT       f.d0key,      count(f.val)
FROM fact0 f
WHERE f.d0key BETWEEN  270 AND 627
GROUP BY f.d0key
;


--  show when index scans change to sequential
--  for the two table join
--EXPLAIN
SELECT      d0.f1,      count(f.val)
FROM dim0 d0,    fact0 f
WHERE d0.d0key = f.d0key
AND   d0.f1 BETWEEN  '2000-01-26' AND '2000-02-04'
GROUP BY d0.f1
;


EXPLAIN
SELECT      d0.f1,      count(f.val)
FROM dim0 d0,    fact0 f
WHERE d0.d0key = f.d0key
AND   d0.f1 BETWEEN  '2000-01-26' AND '2000-02-05'
GROUP BY d0.f1
;

<--results
--------------------------------------------------------------------
SET VARIABLE
NOTICE:  cpu_tuple_cost is 0.4
SHOW VARIABLEd0key |           f1
-------+------------------------  270 | 2000-01-26 00:00:00+13  279 | 2000-02-04 00:00:00+13  280 | 2000-02-05
00:00:00+13 626 | 2001-01-16 00:00:00+13  627 | 2001-01-17 00:00:00+13
 
(5 rows)      
NOTICE:  QUERY PLAN:

Aggregate  (cost=0.00..1308177.10 rows=33453 width=8) ->  Group  (cost=0.00..1307340.77 rows=334533 width=8)       ->
IndexScan using fact0_pk on fact0 f  (cost=0.00..1306504.44 
 
rows=334533 width=8)

EXPLAIN
NOTICE:  QUERY PLAN:

Aggregate  (cost=1308030.21..1309707.21 rows=33540 width=8) ->  Group  (cost=1308030.21..1308868.71 rows=335400
width=8)      ->  Sort  (cost=1308030.21..1308030.21 rows=335400 width=8)             ->  Seq Scan on fact0 f
(cost=0.00..1272693.00rows=335400 
 
width=8)

EXPLAIN
NOTICE:  QUERY PLAN:

Aggregate  (cost=0.00..1155870.07 rows=268 width=20) ->  Group  (cost=0.00..1155863.36 rows=2684 width=20)       ->
NestedLoop  (cost=0.00..1155856.65 rows=2684 width=20)             ->  Index Scan using dim0_q1 on dim0 d0
(cost=0.00..6.63
 
rows=9 width=12)             ->  Index Scan using fact0_pk on fact0 f  (cost=0.00..117117.99 
rows=30000 width=8)

EXPLAIN
NOTICE:  QUERY PLAN:

Aggregate  (cost=1281572.52..1281587.43 rows=298 width=20) ->  Group  (cost=1281572.52..1281579.97 rows=2982 width=20)
    ->  Sort  (cost=1281572.52..1281572.52 rows=2982 width=20)             ->  Hash Join  (cost=7.06..1281400.41
rows=2982width=20)                   ->  Seq Scan on fact0 f  (cost=0.00..1257693.00 
 
rows=3000000 width=8)                   ->  Hash  (cost=7.04..7.04 rows=10 width=12)                         ->  Index
Scanusing dim0_q1 on dim0 d0  
 
(cost=0.00..7.04 rows=10 width=12)

EXPLAIN




В списке pgsql-sql по дате отправления:

Предыдущее
От: Christopher Sawtell
Дата:
Сообщение: Re: Execute permsissions on fuctions
Следующее
От: Meszaros Attila
Дата:
Сообщение: "AND", "OR" and Materialize :((((