Discussion: Slight change in query leads to unexpected change in query plan

Slight change in query leads to unexpected change in query plan

From:
Jack Orenstein
Date:
I have a table that looks like this:

   create table T(pk int not null, value bytea, ..., primary key(pk))

I want to scan the table in batches of 100. I'll do this by issuing a sequence
of queries like this:

     select *
     from T
     where pk > ?
     and value = ?
     order by pk
     limit 100

After each query, I'll record the last value seen and use that to
drive the next query. The obvious (to me) execution plan is to use the
index, do an index scan, and then filter using the restriction on
value. I have some realistic data (loaded into postgres 8.3.7) and
I've run analyze. I'm not getting a very good execution plan:

ris=# explain
ris-# select *
ris-# from T
ris-# where pk > 1000000000
ris-# and value = 'asdf'::bytea
ris-# order by pk
ris-# limit 100;
                                          QUERY PLAN

---------------------------------------------------------------------------------------------
  Limit  (cost=78352.20..78352.24 rows=16 width=451)
    ->  Sort  (cost=78352.20..78352.24 rows=16 width=451)
          Sort Key: pk
          ->  Bitmap Heap Scan on t  (cost=2091.60..78351.88 rows=16 width=451)
                Recheck Cond: (pk > 1000000000)
                Filter: (value = 'asdf'::bytea)
                ->  Bitmap Index Scan on t_pkey  (cost=0.00..2091.60 rows=91088 width=0)
                      Index Cond: (pk > 1000000000)

But if I remove the value restriction, I get the plan I was hoping for:

ris=# explain
ris-# select *
ris-# from T
ris-# where pk > 1000000000
ris-# order by pk
ris-# limit 100;
                                             QUERY PLAN

--------------------------------------------------------------------------------------------------
  Limit  (cost=0.00..324.99 rows=100 width=451)
    ->  Index Scan using t_pkey on t  (cost=0.00..296027.98 rows=91088 width=451)
          Index Cond: (pk > 1000000000)
(3 rows)

Why is this?

There is an obvious rewrite, e.g.

     select *
     from (select * from T where pk > ? order by pk limit 100) x
     where value = ?

and this produces a good query plan.  But it means that a batch can
contain fewer than 100 rows even when more matching rows exist, because
the filter on value is applied after the limit. For reasons too boring
to go into, that would be very inconvenient for my application.

Why does adding the value restriction so radically change the execution plan?

Jack Orenstein

Re: Slight change in query leads to unexpected change in query plan

From:
Sam Mason
Date:
On Mon, Jun 22, 2009 at 05:55:28PM -0400, Jack Orenstein wrote:
> ris-# select *
> ris-# from T
> ris-# where pk > 1000000000
> ris-# and value = 'asdf'::bytea
> ris-# order by pk
> ris-# limit 100;

PG thinks that you're going to get 16 rows back matching those
conditions; bitmap heap scans are faster in some cases, and this is
likely to be one of those cases, so PG is optimizing things correctly.

>  Limit  (cost=78352.20..78352.24 rows=16 width=451)

> ris-# select *
> ris-# from T
> ris-# where pk > 1000000000
> ris-# order by pk
> ris-# limit 100;

With this query, PG thinks that you may get 91088 rows back, but because
you've got a LIMIT in there you only need the first 100 of them.  It
will therefore prefer a plan that can stop short, and thus prefers an
index scan.

>  Limit  (cost=0.00..324.99 rows=100 width=451)
>    ->  Index Scan using t_pkey on t  (cost=0.00..296027.98 rows=91088 width=451)


> Why does adding the value restriction so radically change the execution
> plan?

PG doesn't have any cross-column statistics and hence it assumes that pk
and value are uncorrelated.  You may get better results by increasing
the statistics target[1] for those columns, as that will give PG more
information; but if the columns are indeed correlated then that's not
going to help.
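As a concrete sketch of that suggestion, the per-column statistics target can be raised with ALTER TABLE and the table re-analyzed (the target of 1000 here is illustrative; the default in 8.3 is 10):

```sql
-- Collect a finer-grained histogram/MCV list for the filtered column,
-- then re-analyze so the planner sees the new statistics.
ALTER TABLE T ALTER COLUMN value SET STATISTICS 1000;
ANALYZE T;
```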

--
  Sam  http://samason.me.uk/

 [1] http://www.postgresql.org/docs/current/static/sql-altertable.html

Re: Slight change in query leads to unexpected change in query plan

From:
Tom Lane
Date:
Sam Mason <sam@samason.me.uk> writes:
> On Mon, Jun 22, 2009 at 05:55:28PM -0400, Jack Orenstein wrote:
>> Why does adding the value restriction so radically change the execution
>> plan?

> PG doesn't have any cross column statistics and hence it assumes that pk
> and value are uncorrelated.

Even if they are correlated, they aren't necessarily correlated in a
good way; the query plan Jack is hoping for could easily be pessimal.
We routinely see complaints where the planner does what he's hoping for
and gets burnt ...

If the speed of this particular type of query is critical, it might be
worth maintaining a 2-column index on (value, pk).  That would provide
guaranteed good performance since the desired rows form a contiguous run
in the index.
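A sketch of that suggestion (the index name is illustrative): with the equality column value leading and pk second, the qualifying rows are adjacent in the index and already in pk order, so the LIMIT can stop after 100 entries.

```sql
-- Equality column first, range/order column second.
CREATE INDEX t_value_pk ON T (value, pk);

-- The batch query then reads a contiguous run of index entries:
SELECT *
FROM T
WHERE value = 'asdf'::bytea
  AND pk > 1000000000
ORDER BY pk
LIMIT 100;
```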

            regards, tom lane

Re: Slight change in query leads to unexpected change in query plan

From:
Jack Orenstein
Date:
Sam Mason wrote:
> On Mon, Jun 22, 2009 at 05:55:28PM -0400, Jack Orenstein wrote:
>> ris-# select *
>> ris-# from T
>> ris-# where pk > 1000000000
>> ris-# and value = 'asdf'::bytea
>> ris-# order by pk
>> ris-# limit 100;
>
> PG thinks that you're going to get 16 rows back matching those
> conditions; bitmap heap scans are faster in some cases, and this is
> likely to be one of those cases, so PG is optimizing things correctly.

But that's 16 rows at the end. I'm concerned about the earlier processing.
Here's the plan:

   Limit  (cost=78352.20..78352.24 rows=16 width=451)
     ->  Sort  (cost=78352.20..78352.24 rows=16 width=451)
           Sort Key: pk
           ->  Bitmap Heap Scan on t  (cost=2091.60..78351.88 rows=16 width=451)
                 Recheck Cond: (pk > 1000000000)
                 Filter: (value = 'asdf'::bytea)
                 ->  Bitmap Index Scan on t_pkey  (cost=0.00..2091.60 rows=91088 width=0)
                       Index Cond: (pk > 1000000000)

If I'm reading this right, the optimizer estimates that the bitmap index scan
will identify 91088 rows, which all have to be retrieved from the heap for
evaluation of the restriction on value. The row width is 451, so even if the
rows are packed into 8k pages perfectly, that's ~5000 pages that have to be read.

If a plain (non-bitmap) index scan were used, with the value restriction
checked at the end, we know the optimizer thinks that kind of scan is
cheap, because here is the plan it picks without the value restriction:

   Limit  (cost=0.00..324.99 rows=100 width=451)
     ->  Index Scan using t_pkey on t  (cost=0.00..296027.98 rows=91088 width=451)
           Index Cond: (pk > 1000000000)

Adding the value restriction at the top of this query plan wouldn't increase the
cost very much.

So given that estimate of 91088 rows coming out of the bitmap index scan, I
still don't understand the optimizer's reasoning. Am I reading the plan incorrectly?

Jack

>
>>  Limit  (cost=78352.20..78352.24 rows=16 width=451)
>
>> ris-# select *
>> ris-# from T
>> ris-# where pk > 1000000000
>> ris-# order by pk
>> ris-# limit 100;
>
> With this query, PG thinks that you may get 91088 rows back, but because
> you've got a LIMIT in there you only need the first 100 of them.  It
> will therefore prefer a plan that can stop short, and thus prefers
> an index scan.
>
>>  Limit  (cost=0.00..324.99 rows=100 width=451)
>>    ->  Index Scan using t_pkey on t  (cost=0.00..296027.98 rows=91088 width=451)
>
>
>> Why does adding the value restriction so radically change the execution
>> plan?
>
> PG doesn't have any cross-column statistics and hence it assumes that pk
> and value are uncorrelated.  You may get better results by increasing
> the statistics target[1] for those columns, as that will give PG more
> information; but if the columns are indeed correlated then that's not
> going to help.
>


Re: Slight change in query leads to unexpected change in query plan

From:
Tom Lane
Date:
Jack Orenstein <jack.orenstein@hds.com> writes:
>    Limit  (cost=0.00..324.99 rows=100 width=451)
>      ->  Index Scan using t_pkey on t  (cost=0.00..296027.98 rows=91088 width=451)
>            Index Cond: (pk > 1000000000)

> Adding the value restriction at the top of this query plan wouldn't increase the
> cost very much.

You're missing the point: with the value restriction in place, it's
estimating that it will probably have to scan all 91000 rows (because
there are fewer than 100 satisfying the value restriction).  And that
is going to cost somewhere north of 296027 cost units --- the cost
shown, plus 91000 invocations of the value-restriction check.
Which is considerably more than the cost of the other plan.
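Spelling that comparison out with the costs from the two plans above (where c is whatever the planner charges per invocation of the bytea comparison):

```latex
\text{index-scan plan: } \; 296027.98 + 91088\,c \;>\; 296027.98
\qquad\text{vs.}\qquad
\text{bitmap plan: } \; 78352.24
```

The LIMIT offers no discount on the index-scan plan because the planner does not expect to find 100 matches before the scan runs out of rows, so the bitmap plan wins by a wide margin.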

            regards, tom lane

Re: Slight change in query leads to unexpected change in query plan

From:
Jack Orenstein
Date:
Tom Lane wrote:
> Jack Orenstein <jack.orenstein@hds.com> writes:
>>    Limit  (cost=0.00..324.99 rows=100 width=451)
>>      ->  Index Scan using t_pkey on t  (cost=0.00..296027.98 rows=91088 width=451)
>>            Index Cond: (pk > 1000000000)
>
>> Adding the value restriction at the top of this query plan wouldn't increase the
>> cost very much.
>
> You're missing the point: with the value restriction in place, it's
> estimating that it will probably have to scan all 91000 rows (because
> there are fewer than 100 satisfying the value restriction).  And that
> is going to cost somewhere north of 296027 cost units --- the cost
> shown, plus 91000 invocations of the value-restriction check.
> Which is considerably more than the cost of the other plan.

I see -- the optimizer is calculating that it will have to examine a very large
fraction of the rows.

Actually, pk and value are highly correlated. If a row gets past the index scan,
odds are very high that the value predicate will evaluate to true. So I am sure
that the index scan is the right way to go. I'm just trying to convince the
optimizer of this.

One thing I am considering is introducing a function with high execution cost.
E.g., if I do this:

     create function return_input(bytea) returns bytea
     as '
     begin
         return $1;
     end;
     ' language 'plpgsql' cost 10000;

     explain
     select *
     from t
     where pk > 1000000
     and return_input(value) = 'asdf'::bytea
     order by pk
     limit 100;

then I get the plan I want.

  Limit  (cost=0.00..563490.32 rows=100 width=451)
    ->  Index Scan using t_pkey on t  (cost=0.00..34226402.07 rows=6074 width=451)
          Index Cond: (pk > 1000000)
          Filter: (return_input(value) = 'asdf'::bytea)


Is there a more elegant way of forcing the plan that I want?

Jack