Re: Slow query with sub-select

Поиск
Список
Период
Сортировка
От - -
Тема Re: Slow query with sub-select
Дата
Msg-id BAY147-W31DD8B83611D67F05C8FF2F0480@phx.gbl
обсуждение исходный текст
Ответ на Re: Slow query with sub-select  (David Johnston <polobo@yahoo.com>)
Ответы Re: Slow query with sub-select  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Slow query with sub-select  (Michael Nolan <htfoot@gmail.com>)
Список pgsql-general


On Jul 16, 2011, at 6:32, - - <loh.law@hotmail.com> wrote:

The following query seems to take ages despite the EXPLAIN stating that an index is used.
Also, the condition (WHERE t.mid = q.mid) should be a one-to-one mapping, should it not? In this case the mapping is to 3641527 rows.

Table q has no indexes and not referenced by other tables.  
Table t has an index on column mid.

Does anyone know why the query is slow?


SELECT COUNT(*) FROM q
      WHERE NOT EXISTS (SELECT 1
                          FROM t AS t
                         WHERE t.mid = q.mid);


                                              QUERY PLAN                                                
< font class="Apple-style-span" face="Tahoma" size="2">---------------------------------------------------------------------------------------------------------
 Aggregate  (cost=10021304028.93..10021304028.94 rows=1 width=0)
   ->  Hash Anti Join  (cost=10007145480.54..10021028896.24 rows=110053073 width=0)
         Hash Cond: ((q.mid)::text = (t.mid)::text)
         ->  Seq Scan on q (cost=10000000000.00..10007993328.46 rows=220106146 width=38)
         ->  Hash  (cost=7083958.46..7083958. 46 rows=364 1527 width=10)
               ->  Index Scan using t_pkey on t  (cost=0.00..7083958.46 rows=3641527 width=10)
(6 rows)

1. Indexes are not magical; their usage does not guarantee a fast query
2. It is slow because you have no non-join where condition and around 225 MILLION rows that need to be evaluated.
3. Also, you are using a correlated sub-query instead of a LEFT OUTER JOIN
4. You haven't provided table definitions with indexes and so whether q.mid=t.mid is a 1-to-1 optional relationship is unknowable.  Hell, since the names are meaningless we cannot even guess what kind of relationship the tables should have.  The generic "mid" field name has the same problem.

David J.



The weird thing is that before I updated my server the query was about 5 times faster.

I've googled and I think the problem lies with the under-estimation of the query planner about the number of rows in the nested table.
I will be trying the 'set enable_seqscan = false' solution to see if that'll improve.

В списке pgsql-general по дате отправления:

Предыдущее
От: Rob Sargent
Дата:
Сообщение: Re: Table dublicates values
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Slow query with sub-select