Обсуждение: Adding an "and is not null" on an indexed field slows the query down immensely.

Поиск
Список
Период
Сортировка

Adding an "and is not null" on an indexed field slows the query down immensely.

От
Tim Uckun
Дата:
I have this query it runs reasonably quickly.


SELECT "consolidated_urls".* FROM "consolidated_urls" INNER JOIN "topical_urls"
 ON "consolidated_urls".id = "topical_urls".consolidated_url_id
 WHERE (("topical_urls".domain_id = 157) AND ((topical_urls.hidden = 'f')))
  ORDER BY index_delta DESC LIMIT 10

The explain for this goes like this.

"Limit  (cost=29899.43..29899.46 rows=10 width=1880)"
"  ->  Sort  (cost=29899.43..29906.04 rows=2642 width=1880)"
"        Sort Key: consolidated_urls.index_delta"
"        ->  Nested Loop  (cost=101.29..29842.34 rows=2642 width=1880)"
"              ->  Bitmap Heap Scan on topical_urls
(cost=101.29..7490.32 rows=2642 width=4)"
"                    Recheck Cond: (domain_id = 157)"
"                    Filter: (NOT hidden)"
"                    ->  Bitmap Index Scan on
index_topical_urls_on_domain_id_and_consolidated_url_id
(cost=0.00..100.63 rows=2643 width=0)"
"                          Index Cond: (domain_id = 157)"
"              ->  Index Scan using consolidated_urls_pkey on
consolidated_urls  (cost=0.00..8.45 rows=1 width=1880)"
"                    Index Cond: (consolidated_urls.id =
topical_urls.consolidated_url_id)"


I add one more clause on to it to filter out index_deltas that are not
null and the query becomes unusably slow.

SELECT "consolidated_urls".* FROM "consolidated_urls" INNER JOIN "topical_urls"
 ON "consolidated_urls".id = "topical_urls".consolidated_url_id
 WHERE (("topical_urls".domain_id = 157) AND ((topical_urls.hidden = 'f')))
 AND (index_delta IS NOT NULL) ORDER BY index_delta DESC LIMIT 10

The explain for this is goes like this

"Limit  (cost=0.00..20555.33 rows=10 width=1880)"
"  ->  Nested Loop  (cost=0.00..5430717.58 rows=2642 width=1880)"
"        ->  Index Scan Backward using
index_consolidateds_url_on_index_delta on consolidated_urls
(cost=0.00..5316175.98 rows=15242 width=1880)"
"              Filter: (index_delta IS NOT NULL)"
"        ->  Index Scan using
index_topical_urls_on_domain_id_and_consolidated_url_id on
topical_urls  (cost=0.00..7.50 rows=1 width=4)"
"              Index Cond: ((topical_urls.domain_id = 157) AND
(topical_urls.consolidated_url_id = consolidated_urls.id))"
"              Filter: (NOT topical_urls.hidden)"



The index_delta field is double precision and is indexed.

Any suggestions as to how to make this query run faster?

Re: Adding an "and is not null" on an indexed field slows the query down immensely.

От
pasman pasmański
Дата:
Try :
order by index_delta+1 desc

On 1/15/11, Tim Uckun <timuckun@gmail.com> wrote:
> I have this query it runs reasonably quickly.
>
>
> SELECT "consolidated_urls".* FROM "consolidated_urls" INNER JOIN
> "topical_urls"
>  ON "consolidated_urls".id = "topical_urls".consolidated_url_id
>  WHERE (("topical_urls".domain_id = 157) AND ((topical_urls.hidden = 'f')))
>   ORDER BY index_delta DESC LIMIT 10
>
> The explain for this goes like this.
>
> "Limit  (cost=29899.43..29899.46 rows=10 width=1880)"
> "  ->  Sort  (cost=29899.43..29906.04 rows=2642 width=1880)"
> "        Sort Key: consolidated_urls.index_delta"
> "        ->  Nested Loop  (cost=101.29..29842.34 rows=2642 width=1880)"
> "              ->  Bitmap Heap Scan on topical_urls
> (cost=101.29..7490.32 rows=2642 width=4)"
> "                    Recheck Cond: (domain_id = 157)"
> "                    Filter: (NOT hidden)"
> "                    ->  Bitmap Index Scan on
> index_topical_urls_on_domain_id_and_consolidated_url_id
> (cost=0.00..100.63 rows=2643 width=0)"
> "                          Index Cond: (domain_id = 157)"
> "              ->  Index Scan using consolidated_urls_pkey on
> consolidated_urls  (cost=0.00..8.45 rows=1 width=1880)"
> "                    Index Cond: (consolidated_urls.id =
> topical_urls.consolidated_url_id)"
>
>
> I add one more clause on to it to filter out index_deltas that are not
> null and the query becomes unusably slow.
>
> SELECT "consolidated_urls".* FROM "consolidated_urls" INNER JOIN
> "topical_urls"
>  ON "consolidated_urls".id = "topical_urls".consolidated_url_id
>  WHERE (("topical_urls".domain_id = 157) AND ((topical_urls.hidden = 'f')))
>  AND (index_delta IS NOT NULL) ORDER BY index_delta DESC LIMIT 10
>
> The explain for this is goes like this
>
> "Limit  (cost=0.00..20555.33 rows=10 width=1880)"
> "  ->  Nested Loop  (cost=0.00..5430717.58 rows=2642 width=1880)"
> "        ->  Index Scan Backward using
> index_consolidateds_url_on_index_delta on consolidated_urls
> (cost=0.00..5316175.98 rows=15242 width=1880)"
> "              Filter: (index_delta IS NOT NULL)"
> "        ->  Index Scan using
> index_topical_urls_on_domain_id_and_consolidated_url_id on
> topical_urls  (cost=0.00..7.50 rows=1 width=4)"
> "              Index Cond: ((topical_urls.domain_id = 157) AND
> (topical_urls.consolidated_url_id = consolidated_urls.id))"
> "              Filter: (NOT topical_urls.hidden)"
>
>
>
> The index_delta field is double precision and is indexed.
>
> Any suggestions as to how to make this query run faster?
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>

--
Sent from my mobile device

------------
pasman

Re: Adding an "and is not null" on an indexed field slows the query down immensely.

От
Tim Uckun
Дата:
2011/1/15 pasman pasmański <pasman.p@gmail.com>:
> Try :
> order by index_delta+1 desc
>

I have attached the explain analyze for that below why does this
return instantly?




Limit  (cost=29910.05..29910.07 rows=10 width=1880) (actual
time=42.563..42.563 rows=0 loops=1)
   ->  Sort  (cost=29910.05..29916.65 rows=2642 width=1880) (actual
time=42.558..42.558 rows=0 loops=1)
         Sort Key: ((consolidated_urls.index_delta + 1::double precision))
         Sort Method:  quicksort  Memory: 17kB
         ->  Nested Loop  (cost=105.30..29852.95 rows=2642 width=1880)
(actual time=10.428..10.428 rows=0 loops=1)
               ->  Bitmap Heap Scan on topical_urls
(cost=105.30..7494.33 rows=2642 width=4) (actual time=10.424..10.424
rows=0 loops=1)
                     Recheck Cond: (domain_id = 157)
                     Filter: (NOT hidden)
                     ->  Bitmap Index Scan on
index_topical_urls_on_domain_id_and_consolidated_url_id
(cost=0.00..104.64 rows=2643 width=0) (actual time=10.419..10.419
rows=0 loops=1)
                           Index Cond: (domain_id = 157)
               ->  Index Scan using consolidated_urls_pkey on
consolidated_urls  (cost=0.00..8.45 rows=1 width=1880) (never
executed)
                     Index Cond: (consolidated_urls.id =
topical_urls.consolidated_url_id)
                     Filter: (consolidated_urls.index_delta IS NOT NULL)
 Total runtime: 42.932 ms
(14 rows)

Re: Adding an "and is not null" on an indexed field slows the query down immensely.

От
pasman pasmański
Дата:
I think this is a planner's bug. Can you send these explains to pgsql-bugs ?

On 1/15/11, Tim Uckun <timuckun@gmail.com> wrote:
> 2011/1/15 pasman pasmański <pasman.p@gmail.com>:
>> Try :
>> order by index_delta+1 desc
>>
>
> I have attached the explain analyze for that below why does this
> return instantly?
>
>
>
>
> Limit  (cost=29910.05..29910.07 rows=10 width=1880) (actual
> time=42.563..42.563 rows=0 loops=1)
>    ->  Sort  (cost=29910.05..29916.65 rows=2642 width=1880) (actual
> time=42.558..42.558 rows=0 loops=1)
>          Sort Key: ((consolidated_urls.index_delta + 1::double precision))
>          Sort Method:  quicksort  Memory: 17kB
>          ->  Nested Loop  (cost=105.30..29852.95 rows=2642 width=1880)
> (actual time=10.428..10.428 rows=0 loops=1)
>                ->  Bitmap Heap Scan on topical_urls
> (cost=105.30..7494.33 rows=2642 width=4) (actual time=10.424..10.424
> rows=0 loops=1)
>                      Recheck Cond: (domain_id = 157)
>                      Filter: (NOT hidden)
>                      ->  Bitmap Index Scan on
> index_topical_urls_on_domain_id_and_consolidated_url_id
> (cost=0.00..104.64 rows=2643 width=0) (actual time=10.419..10.419
> rows=0 loops=1)
>                            Index Cond: (domain_id = 157)
>                ->  Index Scan using consolidated_urls_pkey on
> consolidated_urls  (cost=0.00..8.45 rows=1 width=1880) (never
> executed)
>                      Index Cond: (consolidated_urls.id =
> topical_urls.consolidated_url_id)
>                      Filter: (consolidated_urls.index_delta IS NOT NULL)
>  Total runtime: 42.932 ms
> (14 rows)
>

--
Sent from my mobile device

------------
pasman

Re: Adding an "and is not null" on an indexed field slows the query down immensely.

От
Tim Uckun
Дата:
2011/1/16 pasman pasmański <pasman.p@gmail.com>:
> I think this is a planner's bug. Can you send these explains to pgsql-bugs ?
>



Sure. BTW I thought I would change the query a little by putting a AND
index_value .>100 instead of index_delta and it didn't help at all. I
thought maybe using another index would help but it didn't.

So what I did was add NULLS LAST which was more bearable.

This really should be a pretty quick query, there are only twenty
records after all and all criteria fields are indexed.