Обсуждение: BUG #9135: PostgreSQL doesn't want use index scan instead of (index scan+sort+limit)

Поиск
Список
Период
Сортировка

BUG #9135: PostgreSQL doesn't want use index scan instead of (index scan+sort+limit)

От
maxim.boguk@gmail.com
Дата:
The following bug has been logged on the website:

Bug reference:      9135
Logged by:          Maxim Boguk
Email address:      maxim.boguk@gmail.com
PostgreSQL version: 9.3.2
Operating system:   Linux
Description:

Hi,

One of my customers have very curious situation with simple query and index
usage. I tried different ideas but it doesn't work anyway and now I out of
ideas. It's looks like a bug if I not missing something.

Detail:
4GB liexWebmasterProducts table with interesting fields:

 lwpid                         | integer                     | not null
default nextval('liexwebmasterproducts_lwpid_seq'::regclass)
 lwpname                       | text                        |
...
 lwpwebsiteid                  | integer                     |
...
 lwpnotforsale                 | boolean                     | not null
...
 lwpcreatedate                 | timestamp without time zone | not null
default now()
...

Index on the last three fields defined as:
    "i_liexwebmasterproducts_2" btree (lwpwebsiteid, lwpnotforsale,
lwpcreatedate)

Target query and plan:
select *
from liexWebmasterProducts this_
where
this_.lwpWebsiteId=5935
and this_.lwpnotForSale=FALSE
order by this_.lwpCreateDate desc limit 1;

 Limit  (cost=122.18..122.19 rows=1 width=902) (actual time=13.505..13.506
rows=1 loops=1)
   ->  Sort  (cost=122.18..124.57 rows=953 width=902) (actual
time=13.503..13.503 rows=1 loops=1)
         Sort Key: lwpcreatedate
         Sort Method: top-N heapsort  Memory: 27kB
         ->  Index Scan using i_liexwebmasterproducts_2 on
liexwebmasterproducts this_  (cost=0.43..117.42 rows=953 width=902) (actual
time=0.171..10.429 rows=1674 loops=1)
               Index Cond: ((lwpwebsiteid = 5935) AND (lwpnotforsale =
false))
               Filter: (NOT lwpnotforsale)
 Total runtime: 13.626 ms


I have no idea why Postgresql doesn't want use simple index scan over 3
fields...
set enable_sort to 0;
have no effect:

 Limit  (cost=10000000119.90..10000000119.90 rows=1 width=902) (actual
time=6.591..6.592 rows=1 loops=1)
   ->  Sort  (cost=10000000119.90..10000000122.24 rows=935 width=902)
(actual time=6.588..6.588 rows=1 loops=1)
         Sort Key: lwpcreatedate
         Sort Method: top-N heapsort  Memory: 27kB
         ->  Index Scan using i_liexwebmasterproducts_2 on
liexwebmasterproducts this_  (cost=0.43..115.22 rows=935 width=902) (actual
time=0.050..3.733 rows=1673 loops=1)
               Index Cond: ((lwpwebsiteid = 5935) AND (lwpnotforsale =
false))
               Filter: (NOT lwpnotforsale)
 Total runtime: 6.670 ms


It seems somehow related to the: "Filter: (NOT lwpnotforsale)" part of the
query which look like redundant, but it's my pure guessing.

Reindexing the index, vacuum analyze table - provide zero effect on the
plan.

Generating whole new subset and table via:
shop=# create table test as select lwpid,lwpwebsiteid, lwpnotforsale,
lwpcreatedate from liexwebmasterproducts;
SELECT 6799176
shop=#  create index CONCURRENTLY test_index_1 on test(lwpwebsiteid,
lwpnotforsale, lwpcreatedate);
CREATE INDEX
shop=# vacuum analyze test;
VACUUM

Have no effect as well (plan over test table stay the same).

Changing order of the two first fields in index via:
create index CONCURRENTLY test_index_2 on test(lwpnotforsale, lwpwebsiteid,
wpcreatedate);
have no effect on the plan too.

Kindly Regards,
Maksym

Re: BUG #9135: PostgreSQL doesn't want use index scan instead of (index scan+sort+limit)

От
Harry Rossignol
Дата:
I would try -

select *  from liexWebmasterProducts this_  where
this_.lwpWebsiteI_.lwpnotForSale
lwpWebsiteId,.lwpCreateDate desc limit 1;




2/6/2014 8:55 PM, maxim.boguk@gmail.com wrote:
> The following bug has been logged on the website:
>
> Bug reference:      9135
> Logged by:          Maxim Boguk
> Email address:      maxim.boguk@gmail.com
> PostgreSQL version: 9.3.2
> Operating system:   Linux
> Description:
>
> Hi,
>
> One of my customers have very curious situation with simple query and index
> usage. I tried different ideas but it doesn't work anyway and now I out of
> ideas. It's looks like a bug if I not missing something.
>
> Detail:
> 4GB liexWebmasterProducts table with interesting fields:
>
>   lwpid                         | integer                     | not null
> default nextval('liexwebmasterproducts_lwpid_seq'::regclass)
>   lwpname                       | text                        |
> ...
>   lwpwebsiteid                  | integer                     |
> ...
>   lwpnotforsale                 | boolean                     | not null
> ...
>   lwpcreatedate                 | timestamp without time zone | not null
> default now()
> ...
>
> Index on the last three fields defined as:
>      "i_liexwebmasterproducts_2" btree (lwpwebsiteid, lwpnotforsale,
> lwpcreatedate)
>
> Target query and plan:
> select *
> from liexWebmasterProducts this_
> where
> this_.lwpWebsiteId=5935
> and this_.lwpnotForSale=FALSE
> order by this_.lwpCreateDate desc limit 1;
>
>   Limit  (cost=122.18..122.19 rows=1 width=902) (actual time=13.505..13.506
> rows=1 loops=1)
>     ->  Sort  (cost=122.18..124.57 rows=953 width=902) (actual
> time=13.503..13.503 rows=1 loops=1)
>           Sort Key: lwpcreatedate
>           Sort Method: top-N heapsort  Memory: 27kB
>           ->  Index Scan using i_liexwebmasterproducts_2 on
> liexwebmasterproducts this_  (cost=0.43..117.42 rows=953 width=902) (actual
> time=0.171..10.429 rows=1674 loops=1)
>                 Index Cond: ((lwpwebsiteid = 5935) AND (lwpnotforsale =
> false))
>                 Filter: (NOT lwpnotforsale)
>   Total runtime: 13.626 ms
>
>
> I have no idea why Postgresql doesn't want use simple index scan over 3
> fields...
> set enable_sort to 0;
> have no effect:
>
>   Limit  (cost=10000000119.90..10000000119.90 rows=1 width=902) (actual
> time=6.591..6.592 rows=1 loops=1)
>     ->  Sort  (cost=10000000119.90..10000000122.24 rows=935 width=902)
> (actual time=6.588..6.588 rows=1 loops=1)
>           Sort Key: lwpcreatedate
>           Sort Method: top-N heapsort  Memory: 27kB
>           ->  Index Scan using i_liexwebmasterproducts_2 on
> liexwebmasterproducts this_  (cost=0.43..115.22 rows=935 width=902) (actual
> time=0.050..3.733 rows=1673 loops=1)
>                 Index Cond: ((lwpwebsiteid = 5935) AND (lwpnotforsale =
> false))
>                 Filter: (NOT lwpnotforsale)
>   Total runtime: 6.670 ms
>
>
> It seems somehow related to the: "Filter: (NOT lwpnotforsale)" part of the
> query which look like redundant, but it's my pure guessing.
>
> Reindexing the index, vacuum analyze table - provide zero effect on the
> plan.
>
> Generating whole new subset and table via:
> shop=# create table test as select lwpid,lwpwebsiteid, lwpnotforsale,
> lwpcreatedate from liexwebmasterproducts;
> SELECT 6799176
> shop=#  create index CONCURRENTLY test_index_1 on test(lwpwebsiteid,
> lwpnotforsale, lwpcreatedate);
> CREATE INDEX
> shop=# vacuum analyze test;
> VACUUM
>
> Have no effect as well (plan over test table stay the same).
>
> Changing order of the two first fields in index via:
> create index CONCURRENTLY test_index_2 on test(lwpnotforsale, lwpwebsiteid,
> wpcreatedate);
> have no effect on the plan too.
>
> Kindly Regards,
> Maksym
>
>
>
>
>

Re: BUG #9135: PostgreSQL doesn't want use index scan instead of (index scan+sort+limit)

От
Harry Rossignol
Дата:
On 2/6/2014 9:13 PM, Harry Rossignol wrote:
> I would try -
>
> select *  from liexWebmasterProducts this_  where
> this_.lwpWebsiteI_.lwpnotForSale
> lwpWebsiteId,.lwpCreateDate desc
ORDER BY lwpWebsiteI,_.lwpnotForSale , lwpWebsiteId,.lwpCreateDate
> Limit 1;
>
>
>
>
> 2/6/2014 8:55 PM, maxim.boguk@gmail.com wrote:
>> The following bug has been logged on the website:
>>
>> Bug reference:      9135
>> Logged by:          Maxim Boguk
>> Email address:      maxim.boguk@gmail.com
>> PostgreSQL version: 9.3.2
>> Operating system:   Linux
>> Description:
>>
>> Hi,
>>
>> One of my customers have very curious situation with simple query and
>> index
>> usage. I tried different ideas but it doesn't work anyway and now I
>> out of
>> ideas. It's looks like a bug if I not missing something.
>>
>> Detail:
>> 4GB liexWebmasterProducts table with interesting fields:
>>
>>   lwpid                         | integer                     | not null
>> default nextval('liexwebmasterproducts_lwpid_seq'::regclass)
>>   lwpname                       | text                        |
>> ...
>>   lwpwebsiteid                  | integer                     |
>> ...
>>   lwpnotforsale                 | boolean                     | not null
>> ...
>>   lwpcreatedate                 | timestamp without time zone | not null
>> default now()
>> ...
>>
>> Index on the last three fields defined as:
>>      "i_liexwebmasterproducts_2" btree (lwpwebsiteid, lwpnotforsale,
>> lwpcreatedate)
>>
>> Target query and plan:
>> select *
>> from liexWebmasterProducts this_
>> where
>> this_.lwpWebsiteId=5935
>> and this_.lwpnotForSale=FALSE
>> order by this_.lwpCreateDate desc limit 1;
>>
>>   Limit  (cost=122.18..122.19 rows=1 width=902) (actual
>> time=13.505..13.506
>> rows=1 loops=1)
>>     ->  Sort  (cost=122.18..124.57 rows=953 width=902) (actual
>> time=13.503..13.503 rows=1 loops=1)
>>           Sort Key: lwpcreatedate
>>           Sort Method: top-N heapsort  Memory: 27kB
>>           ->  Index Scan using i_liexwebmasterproducts_2 on
>> liexwebmasterproducts this_  (cost=0.43..117.42 rows=953 width=902)
>> (actual
>> time=0.171..10.429 rows=1674 loops=1)
>>                 Index Cond: ((lwpwebsiteid = 5935) AND (lwpnotforsale =
>> false))
>>                 Filter: (NOT lwpnotforsale)
>>   Total runtime: 13.626 ms
>>
>>
>> I have no idea why Postgresql doesn't want use simple index scan over 3
>> fields...
>> set enable_sort to 0;
>> have no effect:
>>
>>   Limit  (cost=10000000119.90..10000000119.90 rows=1 width=902) (actual
>> time=6.591..6.592 rows=1 loops=1)
>>     ->  Sort  (cost=10000000119.90..10000000122.24 rows=935 width=902)
>> (actual time=6.588..6.588 rows=1 loops=1)
>>           Sort Key: lwpcreatedate
>>           Sort Method: top-N heapsort  Memory: 27kB
>>           ->  Index Scan using i_liexwebmasterproducts_2 on
>> liexwebmasterproducts this_  (cost=0.43..115.22 rows=935 width=902)
>> (actual
>> time=0.050..3.733 rows=1673 loops=1)
>>                 Index Cond: ((lwpwebsiteid = 5935) AND (lwpnotforsale =
>> false))
>>                 Filter: (NOT lwpnotforsale)
>>   Total runtime: 6.670 ms
>>
>>
>> It seems somehow related to the: "Filter: (NOT lwpnotforsale)" part
>> of the
>> query which look like redundant, but it's my pure guessing.
>>
>> Reindexing the index, vacuum analyze table - provide zero effect on the
>> plan.
>>
>> Generating whole new subset and table via:
>> shop=# create table test as select lwpid,lwpwebsiteid, lwpnotforsale,
>> lwpcreatedate from liexwebmasterproducts;
>> SELECT 6799176
>> shop=#  create index CONCURRENTLY test_index_1 on test(lwpwebsiteid,
>> lwpnotforsale, lwpcreatedate);
>> CREATE INDEX
>> shop=# vacuum analyze test;
>> VACUUM
>>
>> Have no effect as well (plan over test table stay the same).
>>
>> Changing order of the two first fields in index via:
>> create index CONCURRENTLY test_index_2 on test(lwpnotforsale,
>> lwpwebsiteid,
>> wpcreatedate);
>> have no effect on the plan too.
>>
>> Kindly Regards,
>> Maksym
>>
>>
>>
>>
>>
>
>
>

Re: BUG #9135: PostgreSQL doesn't want use index scan instead of (index scan+sort+limit)

От
Tom Lane
Дата:
maxim.boguk@gmail.com writes:
> Index on the last three fields defined as:
>     "i_liexwebmasterproducts_2" btree (lwpwebsiteid, lwpnotforsale, lwpcreatedate)

> Target query and plan:
> select *
> from liexWebmasterProducts this_
> where
> this_.lwpWebsiteId=5935
> and this_.lwpnotForSale=FALSE
> order by this_.lwpCreateDate desc limit 1;

>  Limit  (cost=122.18..122.19 rows=1 width=902) (actual time=13.505..13.506
> rows=1 loops=1)
>    ->  Sort  (cost=122.18..124.57 rows=953 width=902) (actual
> time=13.503..13.503 rows=1 loops=1)
>          Sort Key: lwpcreatedate
>          Sort Method: top-N heapsort  Memory: 27kB
>          ->  Index Scan using i_liexwebmasterproducts_2 on
> liexwebmasterproducts this_  (cost=0.43..117.42 rows=953 width=902) (actual
> time=0.171..10.429 rows=1674 loops=1)
>                Index Cond: ((lwpwebsiteid = 5935) AND (lwpnotforsale =
> false))
>                Filter: (NOT lwpnotforsale)
>  Total runtime: 13.626 ms

As a workaround you could do

select *
from liexWebmasterProducts this_
where
this_.lwpWebsiteId=5935
and this_.lwpnotForSale=FALSE
order by this_.lwpnotForSale desc, this_.lwpCreateDate desc limit 1;

The problem is that "this_.lwpnotForSale=FALSE" is canonicalized into
"NOT this_.lwpnotForSale", and then the ORDER BY machinery fails to
realize that that makes the index column a no-op for ordering purposes.
It does work as you're expecting for index columns of non-boolean types.

I'll see about fixing this, but considering that it's worked like that
since about 8.1 without complaints, I don't think I'll risk back-patching
the change.

            regards, tom lane

Re: BUG #9135: PostgreSQL doesn't want use index scan instead of (index scan+sort+limit)

От
Sergey Konoplev
Дата:
On Fri, Feb 7, 2014 at 12:40 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> As a workaround you could do
>
> select *
> from liexWebmasterProducts this_
> where
> this_.lwpWebsiteId=5935
> and this_.lwpnotForSale=FALSE
> order by this_.lwpnotForSale desc, this_.lwpCreateDate desc limit 1;
>
> The problem is that "this_.lwpnotForSale=FALSE" is canonicalized into
> "NOT this_.lwpnotForSale", and then the ORDER BY machinery fails to
> realize that that makes the index column a no-op for ordering purposes.
> It does work as you're expecting for index columns of non-boolean types.
>
> I'll see about fixing this, but considering that it's worked like that
> since about 8.1 without complaints, I don't think I'll risk back-patching
> the change.

+1 for fixing this. From my practice people face this issue quite
often. In the most of the cases it can be solved by just creating a
partial index based on boolean condition, but time from time it can
not, some time bringing a huge head ache.

--
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray.ru@gmail.com