How to get the optimizer to use an index with multiple fields

Поиск

Список

Период

Сортировка

От	Jim C. Nasby
Тема	How to get the optimizer to use an index with multiple fields
Дата	28 апреля 2003 г. 14:21:51
Msg-id	20030428132143.J66185@flake.decibel.org обсуждение
Ответы	Re: How to get the optimizer to use an index with multiple fields Re: How to get the optimizer to use an index with multiple
Список	pgsql-performance

Дерево обсуждения

select id into temp NewRetires from stats_participant where retire_to>=1
AND retire_date = (SELECT last_date FROM Project_statsrun WHERE
project_id = :ProjectID);

results in a table with 5 values...

explain analyze delete from email_rank where project_id=25 and id in
(select id from NewRetires);

 Index Scan using email_rank__day_rank on email_rank
(cost=0.00..9003741627715.16 rows=45019 width=6) (actual time=408.12..9688.37 rows=3 loops=1)
   Index Cond: (project_id = 25)
   Filter: (subplan)
   SubPlan
     ->  Seq Scan on newretires  (cost=100000000.00..100000020.00 rows=1000 width=4) (actual time=0.01..0.05 rows=5
loops=91834)
 Total runtime: 9689.86 msec

But, there's already an index that would fit the bill here perfectly:

              Table "public.email_rank"
        Column         |  Type   |     Modifiers
-----------------------+---------+--------------------
 project_id            | integer | not null
 id                    | integer | not null
 first_date            | date    | not null
 last_date             | date    | not null
 day_rank              | integer | not null default 0
 day_rank_previous     | integer | not null default 0
 overall_rank          | integer | not null default 0
 overall_rank_previous | integer | not null default 0
 work_today            | bigint  | not null default 0
 work_total            | bigint  | not null default 0
Indexes: email_rank_pkey primary key btree (project_id, id),
         email_rank__day_rank btree (project_id, day_rank),
         email_rank__overall_rank btree (project_id, overall_rank)

Why isn't it using email_rank_pkey instead of using day_rank then a
filter? The original query on sybase (see below) is essentially instant,
because it's using the index of (project_id, id), so it doesn't have to
read the whole table.

stats=> select project_id,count(*) from email_rank group by project_id;
 project_id | count
------------+--------
          5 | 327856
          8 |  28304
         24 |  34622
         25 |  91834
        205 | 331464

Also, changing the WHERE IN to a WHERE EXISTS in the delete is
substantially faster in this case (3.5 seconds as opposed to 9); it
would be nice if the optimizer could rewrite the query on-the-fly. I
started looking into this in the first place because the original query
was taking 6-10 seconds, which seemed too long...

Original query:
DELETE FROM Email_Rank
    WHERE project_id = :ProjectID
        AND id IN  (SELECT id
                        FROM STATS_Participant sp
                        WHERE retire_to >= 1
                            AND retire_date = (SELECT last_date FROM Project_statsrun WHERE project_id = :ProjectID)
                    )
;

I tried changing this to an EXISTS and it takes over a minute. So in
this case, the range of runtimes is ~4 seconds (building the temp table
takes ~0.25 seconds) to over a minute.
--
Jim C. Nasby (aka Decibel!)                    jim@nasby.net
Member: Triangle Fraternity, Sports Car Club of America
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

В списке pgsql-performance по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

How to get the optimizer to use an index with multiple fields