Weird case of wrong index choice

Поиск

Список

Период

Сортировка

От	Claudio Freire
Тема	Weird case of wrong index choice
Дата	4 сентября 2013 г. 00:47:15
Msg-id	CAGTBQpYq7KbYTb1PTxshLZJT5Ntpnfy4nNYBP1HDAGo2=G2Z3A@mail.gmail.com обсуждение исходный текст
Ответы	Re: Weird case of wrong index choice
Список	pgsql-performance

Дерево обсуждения

This is postgres 9.1.9.

I'm getting a very weird case in which a simple range query over a PK
picks the wrong... the very very wrong index.

The interesting thing, is that I've got no idea why PG is so grossly
mis-estimating the number of rows scanned by the wrong plan.

I've got this table that's a bunch of counters grouped by a bunch of
other columns, and a date.

The PK is a simple serial type, and the unique index you'll see is
over (created, expr1, expr2, ... expr2) where created is the date, and
exprN are expressions involving the grouping columns.

So, I've got this query with this very wrong plan:

explain SELECT min(created) < ((date_trunc('day',now()) - '90
days'::interval)) FROM "aggregated_tracks_daily_full" WHERE id BETWEEN
34979048 AND 35179048
;

QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=795.24..795.26 rows=1 width=0)
   InitPlan 1 (returns $0)
     ->  Limit  (cost=0.00..795.24 rows=1 width=8)
           ->  Index Scan using ix_aggregated_tracks_daily_full_unq on
aggregated_tracks_daily_full  (cost=0.00..57875465.87 rows=72777
width=8)
                 Index Cond: (created IS NOT NULL)
                 Filter: ((id >= 34979048) AND (id <= 35179048))
(6 rows)


That plan will scan the entire table, because there is NO row with
created null. I've got no idea why PG is choosing to scan over the
unique index, given that 1) there's 0 rows outside the index
condition, so it'll scan the entire table, and 2) i've analyzed and
vacuum analyzed repeatedly, no chage, and 3) there's the PK index that
works flawless.

The table is very big. So scanning it entriely in random fashion isn't smart.

I can force the right plan like this:

mat=# explain SELECT min(created) < ((date_trunc('day',now()) - '90
days'::interval)) FROM (select id,created FROM
"aggregated_tracks_daily_full" WHERE id BETWEEN 34979048 AND 35179048
ORDER BY id) t;
                                                             QUERY
PLAN

-------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=89416.49..89416.51 rows=1 width=8)
   ->  Index Scan using aggregated_tracks_daily_full_pkey on
aggregated_tracks_daily_full  (cost=0.00..88506.78 rows=72777
width=16)
         Index Cond: ((id >= 34979048) AND (id <= 35179048))
(3 rows)


But i'm wondering why I have to. There's something hinky there.

PS: The point of the query is to know whether there's something to be
"archived" (ie, too old) in that id range.

В списке pgsql-performance по дате отправления:

Предыдущее

От: Kevin Grittner
Дата: 03 сентября 2013 г., 23:34:37
Сообщение: Re: COPY TO and VACUUM

Следующее

От: Milos Babic
Дата: 04 сентября 2013 г., 01:16:55
Сообщение: apply

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Weird case of wrong index choice

Предыдущее

Следующее