Re: query plan wierdness?

Поиск
Список
Период
Сортировка
От Joel McGraw
Тема Re: query plan wierdness?
Дата
Msg-id 7B3E33EF2A10A84185E3667F6B9A1B781A0683@ECIEXCHANGE.eldocomp.com
обсуждение исходный текст
Ответ на query plan wierdness?  (Joel McGraw <jmcgraw@eldocomp.com>)
Ответы Re: query plan wierdness?  (Rod Taylor <pg@rbt.ca>)
Список pgsql-performance
>
> > However, this query performs a sequence scan on the table, ignoring
the
> > call_idx13 index (the only difference is the addition of the aspid
field
> > in the order by clause):
>
> You do not have an index which matches the ORDER BY, so PostgreSQL
> cannot simply scan the index for the data you want. Thus is needs to
> find all matching rows, order them, etc.
>
> > 23:59:59.999' order by aspid, openeddatetime desc, callstatus desc,
> > calltype desc, callkey desc limit 26;
>
> aspid ASC, openeddatetime DESC, callstatus DESC, calltype DESC
>
> >          call_idx13 btree (aspid, openeddatetime, callstatus,
calltype,
> > callkey),
>
> This index is: aspid ASC, openeddatetime ASC, callstatus ASC, calltype
> ASC, callkey ASC
>

OK, that makes sense; however, this doesn't:

elon2=# explain analyse select * from call where aspid='123C' and
OpenedDateTime between '2000-01-01 00:00:00.0' and '2004-06-24
23:59:59.999' order by aspid asc, openeddatetime asc, callstatus asc,
calltype asc, callkey asc;

QUERY PLAN
------------------------------------------------------------------------
------------------------------------------------------------------------
------------------------------------------------------
 Sort  (cost=342903.52..344071.99 rows=467384 width=295) (actual
time=33159.38..33897.22 rows=461973 loops=1)
   Sort Key: aspid, openeddatetime, callstatus, calltype, callkey
   ->  Seq Scan on call  (cost=0.00..31019.36 rows=467384 width=295)
(actual time=1.80..7373.75 rows=461973 loops=1)
         Filter: ((aspid = '123C'::bpchar) AND (openeddatetime >=
'2000-01-01 00:00:00-07'::timestamp with time zone) AND (openeddatetime
<= '2004-06-24 23:59:59.999-07'::timestamp with time zone))
 Total runtime: 38043.03 msec
(5 rows)


I've modified the "order by" to reflect the call_idx13 index, yet the
query still causes a sequence scan of the table.



> A reverse scan, would of course be DESC, DESC, DESC, DESC, DESC --
> neither of which matches your requested order by, thus cannot help the
> reduce the lines looked at to 26.


To clarify, the query that the programmer wants is:

select * from call where aspid='123C' and OpenedDateTime between
'2000-01-01 00:00:00.0' and '2004-06-24 23:59:59.999' order by aspid,
openeddatetime desc, callstatus desc, calltype desc, callkey desc;

We had started playing with placing limits on the query to address
another, unrelated problem.

However, out of curiosity I did some testing with varying limits to see
at which point the planner decided to do a sequence scan instead of
using the index.



elon2=# explain analyse select * from call where aspid='123C' and
OpenedDateTime between '2000-01-01 00:00:00.0' and '2004-06-24
23:59:59.999' order by aspid, openeddatetime , callstatus , calltype ,
callkey limit 92785;

QUERY PLAN
------------------------------------------------------------------------
------------------------------------------------------------------------
----------------------------------------------------------
 Limit  (cost=0.00..343130.36 rows=92785 width=295) (actual
time=0.17..1835.55 rows=92785 loops=1)
   ->  Index Scan using call_idx13 on call  (cost=0.00..1728444.76
rows=467384 width=295) (actual time=0.17..1699.56 rows=92786 loops=1)
         Index Cond: ((aspid = '123C'::bpchar) AND (openeddatetime >=
'2000-01-01 00:00:00-07'::timestamp with time zone) AND (openeddatetime
<= '2004-06-24 23:59:59.999-07'::timestamp with time zone))
 Total runtime: 1901.43 msec
(4 rows)



elon2=# explain analyse select * from call where aspid='123C' and
OpenedDateTime between '2000-01-01 00:00:00.0' and '2004-06-24
23:59:59.999' order by aspid, openeddatetime , callstatus , calltype ,
callkey limit 92786;

QUERY PLAN
------------------------------------------------------------------------
------------------------------------------------------------------------
----------------------------------------------------------
 Limit  (cost=0.00..343134.06 rows=92786 width=295) (actual
time=0.17..1834.09 rows=92786 loops=1)
   ->  Index Scan using call_idx13 on call  (cost=0.00..1728444.76
rows=467384 width=295) (actual time=0.17..1698.16 rows=92787 loops=1)
         Index Cond: ((aspid = '123C'::bpchar) AND (openeddatetime >=
'2000-01-01 00:00:00-07'::timestamp with time zone) AND (openeddatetime
<= '2004-06-24 23:59:59.999-07'::timestamp with time zone))
 Total runtime: 1899.97 msec
(4 rows)


elon2=# select count(*) from call;
 count
--------
 507392
(1 row)


>
> This leaves your WHERE clause to restrict the dataset and it doesn't
do
> a very good job of it. There are more than 450000 rows matching the
> where clause, which means the sequential scan was probably the right
> choice (unless you have over 10 million entries in the table).
>
>
> Since your WHERE clause contains a single aspid, an improvement to the
> PostgreSQL optimizer may be to ignore that field in the ORDER BY as
> order is no longer important since there is only one possible value.
If
> it did ignore aspid, it would use a plan similar to the first one you
> provided.
>
> You can accomplish the same thing by leaving out aspid ASC OR by
setting
> it to aspid DESC in the ORDER BY. Leaving it out entirely will be
> slightly faster, but DESC will cause PostgreSQL to use index
> "call_idx13".
>
>

Again, that makes sense to me, but if I remove aspid from the query it
still ignores the index....


elon2=# explain analyse select * from call where aspid='123C' and
OpenedDateTime between '2000-01-01 00:00:00.0' and '2004-06-24
23:59:59.999' order by openeddatetime desc, callstatus desc, calltype
desc, callkey desc;

QUERY PLAN
------------------------------------------------------------------------
------------------------------------------------------------------------
------------------------------------------------------
 Sort  (cost=342903.52..344071.99 rows=467384 width=295) (actual
time=17598.31..18304.26 rows=461973 loops=1)
   Sort Key: openeddatetime, callstatus, calltype, callkey
   ->  Seq Scan on call  (cost=0.00..31019.36 rows=467384 width=295)
(actual time=1.78..7337.85 rows=461973 loops=1)
         Filter: ((aspid = '123C'::bpchar) AND (openeddatetime >=
'2000-01-01 00:00:00-07'::timestamp with time zone) AND (openeddatetime
<= '2004-06-24 23:59:59.999-07'::timestamp with time zone))
 Total runtime: 21665.43 msec
(5 rows)


Setting enable_seqscan=off still doesn't cause the desired index to be
selected:

elon2=# explain analyse select * from call where aspid='123C' and
OpenedDateTime between '2000-01-01 00:00:00.0' and '2004-06-24
23:59:59.999' order by aspid desc, openeddatetime desc, callstatus desc,
calltype desc, callkey desc;

QUERY PLAN

------------------------------------------------------------------------
------------------------------------------------------------------------
-------------------------
 Sort  (cost=355314.41..356482.87 rows=467384 width=295) (actual
time=33382.92..34088.10 rows=461973 loops=1)
   Sort Key: aspid, openeddatetime, callstatus, calltype, callkey
   ->  Index Scan using call_aspid on call  (cost=0.00..43430.25
rows=467384 width=295) (actual time=0.24..7915.21 rows=461973 loops=1)
         Index Cond: (aspid = '123C'::bpchar)
         Filter: ((openeddatetime >= '2000-01-01 00:00:00-07'::timestamp
with time zone) AND (openeddatetime <= '2004-06-24
23:59:59.999-07'::timestamp with time zone))
 Total runtime: 39196.39 msec




Thanks for your help (and sorry for the long post),

-Joel

-- CONFIDENTIALITY NOTICE --

This message is intended for the sole use of the individual and entity to whom it is addressed, and may contain
informationthat is privileged, confidential and exempt from disclosure under applicable law. If you are not the
intendedaddressee, nor authorized to receive for the intended addressee, you are hereby notified that you may not use,
copy,disclose or distribute to anyone the message or any information contained in the message. If you have received
thismessage in error, please immediately advise the sender by reply email, and delete the message. Thank you. 

В списке pgsql-performance по дате отправления:

Предыдущее
От: Bill Chandler
Дата:
Сообщение: Re: Terrible performance after deleting/recreating indexes
Следующее
От: Rod Taylor
Дата:
Сообщение: Re: query plan wierdness?