Re: Query tuning help

Поиск
Список
Период
Сортировка
От Russell Smith
Тема Re: Query tuning help
Дата
Msg-id 200505091051.14689.mr-russ@pws.com.au
обсуждение исходный текст
Ответ на Query tuning help  (Dan Harris <fbsd@drivefaster.net>)
Ответы Re: Query tuning help  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Query tuning help  (Dan Harris <fbsd@drivefaster.net>)
Список pgsql-performance
On Mon, 9 May 2005 09:20 am, Dan Harris wrote:
> Sorry to bother everyone with yet another "my query isn't using an
> index" problem but I am over my head on this one..  I am open to ways
> of restructuring this query to perform better.
>
> I have a table, 'ea', with 22 million rows in it.  VACUUM ANALYZE has
> been just run on the table.
>
> This is the result of:
>
> explain analyze
> select distinct
>  em.incidentid,
>  ea.recordtext as retdata,
>  eg.long,
>  eg.lat
> from
>  ea, em, eg
> where
>  em.incidentid = ea.incidentid and
>  em.incidentid = eg.incidentid and
>  em.entrydate >= '2005-1-1 00:00' and
>  em.entrydate <= '2005-5-9 00:00'
>  and ea.incidentid in (
>   select
>    incidentid
>   from
>    ea
>   where
>    recordtext like '%RED%'
>  )
>
>  and ea.incidentid in (
>   select
>    incidentid
>   from
>    ea
>   where
>    recordtext like '%CORVETTE%'
>  )
>  and (  recordtext like '%RED%' or recordtext like '%CORVETTE%'  )
> order by em.entrydate
>
You cannot use an index for %CORVETTE%, or %RED%.  There is no way
for the index to know if a row had that in the middle without scanning the whole
index.  So it's much cheaper to do a sequence scan.

One possible way to make the query faster is to limit based on date, as you will only get about 700 rows.
And then don't use subselects, as they are doing full sequence scans.  I think this query does what you do
above, and I think it will be faster, but I don't know.

select distinct em.incidentid, ea.recordtext as retdata, eg.long,  eg.lat
FROM em JOIN ea ON (em.incidentid = ea.incidentid AND em.entrydate >= '2005-1-1 00:00'
AND em.entrydate <= '2005-5-9 00:00' AND ea.recordtext like '%RED%' AND ea.recordtext like '%CORVETTE%')
JOIN eg ON em.incidentid = eg.incidentid WHERE (recordtext like '%RED%' or recordtext like '%CORVETTE%'  );

>
> ---------------------
> ANALYZE RESULTS
> ---------------------
>
>   Unique  (cost=774693.72..774693.76 rows=1 width=159) (actual time=446787.056..446787.342 rows=72 loops=1)
>     ->  Sort  (cost=774693.72..774693.72 rows=1 width=159) (actual time=446787.053..446787.075 rows=72 loops=1)
>           Sort Key: em.incidentid, public.ea.recordtext, eg.long, eg.lat
>           ->  Nested Loop  (cost=771835.10..774693.71 rows=1 width=159) (actual time=444378.655..446786.746 rows=72
loops=1)
>                 ->  Nested Loop  (cost=771835.10..774688.81 rows=1 width=148) (actual time=444378.532..446768.381
rows=72loops=1) 
>                       ->  Nested Loop IN Join  (cost=771835.10..774678.88 rows=2 width=81) (actual
time=444367.080..446191.864rows=701 loops=1) 
>                             ->  Nested Loop  (cost=771835.10..774572.05 rows=42 width=64) (actual
time=444366.859..445463.232rows=1011 loops=1) 
>                                   ->  HashAggregate  (cost=771835.10..771835.10 rows=1 width=17) (actual
time=444366.702..444368.583rows=473 loops=1) 
>                                         ->  Seq Scan on ea  (cost=0.00..771834.26 rows=335 width=17) (actual
time=259.746..444358.837rows=592 loops=1) 
>                                               Filter: ((recordtext)::text ~~ '%CORVETTE%'::text)
>                                   ->  Index Scan using ea1 on ea  (cost=0.00..2736.43 rows=42 width=47) (actual
time=2.085..2.309rows=2 loops=473) 
>                                         Index Cond: ((ea.incidentid)::text = ("outer".incidentid)::text)
>                                         Filter: (((recordtext)::text ~~ '%RED%'::text) OR ((recordtext)::text ~~
'%CORVETTE%'::text))
>                             ->  Index Scan using ea1 on ea  (cost=0.00..2733.81 rows=42 width=17) (actual
time=0.703..0.703rows=1 loops=1011) 
>                                   Index Cond: (("outer".incidentid)::text = (ea.incidentid)::text)
>                                   Filter: ((recordtext)::text ~~ '%RED%'::text)
>                       ->  Index Scan using em_incidentid_idx on em  (cost=0.00..4.95 rows=1 width=67) (actual
time=0.820..0.821rows=0 loops=701) 
>                             Index Cond: (("outer".incidentid)::text = (em.incidentid)::text)
>                             Filter: ((entrydate >= '2005-01-01 00:00:00'::timestamp without time zone) AND (entrydate
<='2005-05-09 00:00:00'::timestamp without time zone)) 
>                 ->  Index Scan using eg_incidentid_idx on eg  (cost=0.00..4.89 rows=1 width=79) (actual
time=0.245..0.246rows=1 loops=72) 
>                       Index Cond: (("outer".incidentid)::text = (eg.incidentid)::text)
>   Total runtime: 446871.880 ms
> (22 rows)
>
>
> -------------------------
> EXPLANATION
> -------------------------
> The reason for the redundant LIKE clause is that first, I only want
> those "incidentid"s that contain the words 'RED' and 'CORVETTE'.  BUT,
> those two words may exist across multiple records with the same
> incidentid.  Then, I only want to actually work with the rows that
> contain one of the words.  This query will repeat the same logic for
> however many keywords are entered by the user.  I have investigated
> text searching options and have not found them to be congruous with my
> application.
>
> Why is it choosing a sequential scan one part of the query when
> searching for the words, yet using an index scan for another part of
> it?  Is there a better way to structure the query to give it better
> hints?
>
> I'm using 8.0.1 on a 4-way Opteron with beefy RAID-10 and 12GB of RAM.
>
> Thank you for any advice.
>
> -Dan
>
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: the planner will ignore your desire to choose an index scan if your
>       joining column's datatypes do not match
>
>

В списке pgsql-performance по дате отправления:

Предыдущее
От: Josh Berkus
Дата:
Сообщение: Re: Query tuning help
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Query tuning help