Slight change in query leads to unexpected change in query plan

Поиск

Список

Период

Сортировка

От	Jack Orenstein
Тема	Slight change in query leads to unexpected change in query plan
Дата	22 июня 2009 г. 22:05:21
Msg-id	4A3FFDD0.8020809@hds.com обсуждение исходный текст
Ответы	Re: Slight change in query leads to unexpected change in query plan (Sam Mason <sam@samason.me.uk>)
Список	pgsql-general

Дерево обсуждения

I have a table that looks like this:

   create table T(pk int not null, value bytea, ..., primary key(pk))

I want to scan the table in batches of 100. I'll do this by issuing a sequence
of queries like this:

     select *
     from T
     where pk > ?
     and value = ?
     order by pk
     limit 100

After each query, I'll record the last value seen and use that to
drive the next query. The obvious (to me) execution plan is to use the
index, do an index scan, and then filter using the restriction on
value. I have some realistic data (loaded into postgres 8.3.7) and
I've run analyze. I'm not getting a very good execution plan:

ris=# explain
ris-# select *
ris-# from T
ris-# where pk > 1000000000
ris-# and value = 'asdf'::bytea
ris-# order by pk
ris-# limit 100;
                                          QUERY PLAN

---------------------------------------------------------------------------------------------
  Limit  (cost=78352.20..78352.24 rows=16 width=451)
    ->  Sort  (cost=78352.20..78352.24 rows=16 width=451)
          Sort Key: pk
          ->  Bitmap Heap Scan on t  (cost=2091.60..78351.88 rows=16 width=451)
                Recheck Cond: (pk > 1000000000)
                Filter: (value = 'asdf'::bytea)
                ->  Bitmap Index Scan on t_pkey  (cost=0.00..2091.60 rows=91088
width=0)
                      Index Cond: (pk > 1000000000)

But if I remove the value restriction, I get the plan I was hoping for:

ris=# explain
ris-# select *
ris-# from T
ris-# where pk > 1000000000
ris-# order by pk
ris-# limit 100;
                                             QUERY PLAN

--------------------------------------------------------------------------------------------------
  Limit  (cost=0.00..324.99 rows=100 width=451)
    ->  Index Scan using t_pkey on t  (cost=0.00..296027.98 rows=91088 width=451)
          Index Cond: (pk > 1000000000)
(3 rows)

Why is this?

This is an obvious rewrite, e.g.

     select *
     from (select * from T where pk > ? order by pk limit 100) x
     where value = ?

and this produces a good query plan.  But this means that fewer than
100 rows are returned. For reasons too boring to go into, that would
be very inconvenient for my application.

Why does adding the value restriction so radically change the execution plan?

Jack Orenstein

В списке pgsql-general по дате отправления:

Предыдущее

От: Gerry Reno
Дата: 22 июня 2009 г., 21:53:50
Сообщение: Replication

Следующее

От: Mike Toews
Дата: 22 июня 2009 г., 22:05:26
Сообщение: Re: Select ranges based on sequential breaks

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Slight change in query leads to unexpected change in query plan

Предыдущее

Следующее