Обсуждение: Partial index slower than regular index

Поиск
Список
Период
Сортировка

Partial index slower than regular index

От
Thom Brown
Дата:
I'm using 9.1dev.

Could someone explain the following behaviour?

-- create a test table
CREATE TABLE indextest (id serial, stuff text);

-- insert loads of values with intermittent sets of less common values
INSERT INTO indextest (stuff) SELECT 'meow' FROM generate_series (1,1000000);
INSERT INTO indextest (stuff) SELECT 'bark' FROM generate_series (1,1000);
INSERT INTO indextest (stuff) SELECT 'meow' FROM generate_series (1,1000000);
INSERT INTO indextest (stuff) SELECT 'bark' FROM generate_series (1,1000);
INSERT INTO indextest (stuff) SELECT 'meow' FROM generate_series (1,1000000);
INSERT INTO indextest (stuff) SELECT 'bark' FROM generate_series (1,1000);
INSERT INTO indextest (stuff) SELECT 'meow' FROM generate_series (1,1000000);
INSERT INTO indextest (stuff) SELECT 'bark' FROM generate_series (1,1000);
INSERT INTO indextest (stuff) SELECT 'meow' FROM generate_series (1,1000000);
INSERT INTO indextest (stuff) SELECT 'bark' FROM generate_series (1,1000);
INSERT INTO indextest (stuff) SELECT 'meow' FROM generate_series (1,1000000);
INSERT INTO indextest (stuff) SELECT 'bark' FROM generate_series (1,1000);
INSERT INTO indextest (stuff) SELECT 'meow' FROM generate_series (1,1000000);
INSERT INTO indextest (stuff) SELECT 'bark' FROM generate_series (1,1000);
INSERT INTO indextest (stuff) SELECT 'meow' FROM generate_series (1,1000000);
INSERT INTO indextest (stuff) SELECT 'bark' FROM generate_series (1,1000);

-- create regular index
CREATE INDEX indextest_stuff ON indextest(stuff);

-- update table stats
ANALYZE indextest;

postgres=# explain analyze select * from indextest where stuff = 'bark';
                                                            QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------
 Index Scan using indextest_stuff on indextest  (cost=0.00..485.09
rows=9076 width=9) (actual time=0.142..3.533 rows=8000 loops=1)
   Index Cond: (stuff = 'bark'::text)
 Total runtime: 4.248 ms
(3 rows)

This is very fast.  Now if I drop the index and add a partial index
with the conditions being tested.

DROP INDEX indextest_stuff;

CREATE INDEX indextest_stuff ON indextest(stuff) WHERE stuff = 'bark';

postgres=# explain analyze select * from indextest where stuff = 'bark';
                                                    QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
 Seq Scan on indextest  (cost=0.00..143386.48 rows=5606 width=9)
(actual time=164.321..1299.794 rows=8000 loops=1)
   Filter: (stuff = 'bark'::text)
 Total runtime: 1300.267 ms
(3 rows)

The index doesn't get used.  There's probably a logical explanation,
which is what I'm curious about.

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company1

Re: Partial index slower than regular index

От
Kenneth Marshall
Дата:
On Tue, Apr 05, 2011 at 11:35:29PM +0100, Thom Brown wrote:
> I'm using 9.1dev.
>
> Could someone explain the following behaviour?
>
> -- create a test table
> CREATE TABLE indextest (id serial, stuff text);
>
> -- insert loads of values with intermittent sets of less common values
> INSERT INTO indextest (stuff) SELECT 'meow' FROM generate_series (1,1000000);
> INSERT INTO indextest (stuff) SELECT 'bark' FROM generate_series (1,1000);
> INSERT INTO indextest (stuff) SELECT 'meow' FROM generate_series (1,1000000);
> INSERT INTO indextest (stuff) SELECT 'bark' FROM generate_series (1,1000);
> INSERT INTO indextest (stuff) SELECT 'meow' FROM generate_series (1,1000000);
> INSERT INTO indextest (stuff) SELECT 'bark' FROM generate_series (1,1000);
> INSERT INTO indextest (stuff) SELECT 'meow' FROM generate_series (1,1000000);
> INSERT INTO indextest (stuff) SELECT 'bark' FROM generate_series (1,1000);
> INSERT INTO indextest (stuff) SELECT 'meow' FROM generate_series (1,1000000);
> INSERT INTO indextest (stuff) SELECT 'bark' FROM generate_series (1,1000);
> INSERT INTO indextest (stuff) SELECT 'meow' FROM generate_series (1,1000000);
> INSERT INTO indextest (stuff) SELECT 'bark' FROM generate_series (1,1000);
> INSERT INTO indextest (stuff) SELECT 'meow' FROM generate_series (1,1000000);
> INSERT INTO indextest (stuff) SELECT 'bark' FROM generate_series (1,1000);
> INSERT INTO indextest (stuff) SELECT 'meow' FROM generate_series (1,1000000);
> INSERT INTO indextest (stuff) SELECT 'bark' FROM generate_series (1,1000);
>
> -- create regular index
> CREATE INDEX indextest_stuff ON indextest(stuff);
>
> -- update table stats
> ANALYZE indextest;
>
> postgres=# explain analyze select * from indextest where stuff = 'bark';
>                                                             QUERY PLAN
>
-----------------------------------------------------------------------------------------------------------------------------------
>  Index Scan using indextest_stuff on indextest  (cost=0.00..485.09
> rows=9076 width=9) (actual time=0.142..3.533 rows=8000 loops=1)
>    Index Cond: (stuff = 'bark'::text)
>  Total runtime: 4.248 ms
> (3 rows)
>
> This is very fast.  Now if I drop the index and add a partial index
> with the conditions being tested.
>
> DROP INDEX indextest_stuff;
>
> CREATE INDEX indextest_stuff ON indextest(stuff) WHERE stuff = 'bark';
>
> postgres=# explain analyze select * from indextest where stuff = 'bark';
>                                                     QUERY PLAN
> -------------------------------------------------------------------------------------------------------------------
>  Seq Scan on indextest  (cost=0.00..143386.48 rows=5606 width=9)
> (actual time=164.321..1299.794 rows=8000 loops=1)
>    Filter: (stuff = 'bark'::text)
>  Total runtime: 1300.267 ms
> (3 rows)
>
> The index doesn't get used.  There's probably a logical explanation,
> which is what I'm curious about.
>

The stats seem off. Are you certain that an analyze has run?

Cheers,
Ken

Re: Partial index slower than regular index

От
Scott Marlowe
Дата:
On Tue, Apr 5, 2011 at 4:35 PM, Thom Brown <thom@linux.com> wrote:
> I'm using 9.1dev.
SNIP

> DROP INDEX indextest_stuff;
>
> CREATE INDEX indextest_stuff ON indextest(stuff) WHERE stuff = 'bark';
>
> postgres=# explain analyze select * from indextest where stuff = 'bark';
>                                                    QUERY PLAN
> -------------------------------------------------------------------------------------------------------------------
>  Seq Scan on indextest  (cost=0.00..143386.48 rows=5606 width=9)
> (actual time=164.321..1299.794 rows=8000 loops=1)
>   Filter: (stuff = 'bark'::text)
>  Total runtime: 1300.267 ms
> (3 rows)
>
> The index doesn't get used.  There's probably a logical explanation,
> which is what I'm curious about.

Works fine for me:

explain analyze select * from indextest where stuff = 'bark';
                                                             QUERY
PLAN

------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using indextest_stuff on indextest  (cost=0.00..837.01
rows=13347 width=9) (actual time=0.226..6.073 rows=8000 loops=1)
   Index Cond: (stuff = 'bark'::text)
 Total runtime: 7.527 ms

Even with a random_page_cost = 4 it works.  Running 8.3.13 btw.

Re: Partial index slower than regular index

От
Mark Kirkwood
Дата:
On 06/04/11 11:31, Scott Marlowe wrote:
> On Tue, Apr 5, 2011 at 4:35 PM, Thom Brown<thom@linux.com>  wrote:
>> I'm using 9.1dev.
> SNIP
>
>> DROP INDEX indextest_stuff;
>>
>> CREATE INDEX indextest_stuff ON indextest(stuff) WHERE stuff = 'bark';
>>
>> postgres=# explain analyze select * from indextest where stuff = 'bark';
>>                                                     QUERY PLAN
>> -------------------------------------------------------------------------------------------------------------------
>>   Seq Scan on indextest  (cost=0.00..143386.48 rows=5606 width=9)
>> (actual time=164.321..1299.794 rows=8000 loops=1)
>>    Filter: (stuff = 'bark'::text)
>>   Total runtime: 1300.267 ms
>> (3 rows)
>>
>> The index doesn't get used.  There's probably a logical explanation,
>> which is what I'm curious about.
> Works fine for me:
>
> explain analyze select * from indextest where stuff = 'bark';
>                                                               QUERY
> PLAN
>
------------------------------------------------------------------------------------------------------------------------------------
>   Index Scan using indextest_stuff on indextest  (cost=0.00..837.01
> rows=13347 width=9) (actual time=0.226..6.073 rows=8000 loops=1)
>     Index Cond: (stuff = 'bark'::text)
>   Total runtime: 7.527 ms
>
> Even with a random_page_cost = 4 it works.  Running 8.3.13 btw.
>

I reproduce what Thom sees - using 9.1dev with default config settings.
Even cranking up effective_cache_size does not encourage the partial
index to be used.

Mark


Re: Partial index slower than regular index

От
Mark Kirkwood
Дата:
On 06/04/11 11:40, Mark Kirkwood wrote:
> On 06/04/11 11:31, Scott Marlowe wrote:
>> On Tue, Apr 5, 2011 at 4:35 PM, Thom Brown<thom@linux.com>  wrote:
>>> I'm using 9.1dev.
>> SNIP
>>
>>> DROP INDEX indextest_stuff;
>>>
>>> CREATE INDEX indextest_stuff ON indextest(stuff) WHERE stuff = 'bark';
>>>
>>> postgres=# explain analyze select * from indextest where stuff =
>>> 'bark';
>>>                                                     QUERY PLAN
>>> -------------------------------------------------------------------------------------------------------------------

>>>
>>>   Seq Scan on indextest  (cost=0.00..143386.48 rows=5606 width=9)
>>> (actual time=164.321..1299.794 rows=8000 loops=1)
>>>    Filter: (stuff = 'bark'::text)
>>>   Total runtime: 1300.267 ms
>>> (3 rows)
>>>
>>> The index doesn't get used.  There's probably a logical explanation,
>>> which is what I'm curious about.
>> Works fine for me:
>>
>> explain analyze select * from indextest where stuff = 'bark';
>>                                                               QUERY
>> PLAN
>>
------------------------------------------------------------------------------------------------------------------------------------

>>
>>   Index Scan using indextest_stuff on indextest  (cost=0.00..837.01
>> rows=13347 width=9) (actual time=0.226..6.073 rows=8000 loops=1)
>>     Index Cond: (stuff = 'bark'::text)
>>   Total runtime: 7.527 ms
>>
>> Even with a random_page_cost = 4 it works.  Running 8.3.13 btw.
>>
>
> I reproduce what Thom sees - using 9.1dev with default config
> settings. Even cranking up effective_cache_size does not encourage the
> partial index to be used.
>
>


However trying with 9.0 gives me the (expected) same 8.3 behaviour:


test=# CREATE INDEX indextest_stuff ON indextest(stuff) WHERE stuff =
'bark';
CREATE INDEX

test=# explain analyze select * from indextest where stuff = 'bark';
                                                             QUERY PLAN
--------------------------------------------------------------------------------------------
  Index Scan using indextest_stuff on indextest  (cost=0.00..284.20
rows=5873 width=9)
                                                 (actual
time=0.276..9.621 rows=8000 loops=1)
    Index Cond: (stuff = 'bark'::text)
  Total runtime: 16.621 ms
(3 rows)


regards

Mark



Re: Partial index slower than regular index

От
Tom Lane
Дата:
Thom Brown <thom@linux.com> writes:
> The index doesn't get used.  There's probably a logical explanation,
> which is what I'm curious about.

Er ... it's broken?

It looks like the index predicate expression isn't getting the right
collation assigned, so predtest.c decides the query doesn't imply the
index's predicate.  Too tired to look into exactly why right now, but
it's clearly bound up in all the recent collation changes.

            regards, tom lane

Re: Partial index slower than regular index

От
Thom Brown
Дата:
On 6 April 2011 05:44, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thom Brown <thom@linux.com> writes:
>> The index doesn't get used.  There's probably a logical explanation,
>> which is what I'm curious about.
>
> Er ... it's broken?
>
> It looks like the index predicate expression isn't getting the right
> collation assigned, so predtest.c decides the query doesn't imply the
> index's predicate.  Too tired to look into exactly why right now, but
> it's clearly bound up in all the recent collation changes.

Testing it again with very explicit collations, it still has issues:

CREATE INDEX indextest_stuff ON indextest(stuff COLLATE "en_GB.UTF-8")
WHERE stuff COLLATE "en_GB.UTF-8" = 'bark' COLLATE "en_GB.UTF-8";

postgres=# explain analyze select * from indextest where stuff collate
"en_GB.UTF-8" = 'bark' collate "en_GB.UTF-8";
                                                    QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
 Seq Scan on indextest  (cost=0.00..143387.00 rows=8312 width=9)
(actual time=163.759..1308.316 rows=8000 loops=1)
   Filter: ((stuff)::text = 'bark'::text COLLATE "en_GB.UTF-8")
 Total runtime: 1308.821 ms
(3 rows)

But I'm possibly missing the point here.

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Partial index slower than regular index

От
Thom Brown
Дата:
On 6 April 2011 00:02, Kenneth Marshall <ktm@rice.edu> wrote:
> The stats seem off. Are you certain that an analyze has run?
>
> Cheers,
> Ken
>

Yes, an ANALYZE was definitely run against the table.

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Partial index slower than regular index

От
Tom Lane
Дата:
Thom Brown <thom@linux.com> writes:
> On 6 April 2011 05:44, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> It looks like the index predicate expression isn't getting the right
>> collation assigned, so predtest.c decides the query doesn't imply the
>> index's predicate. �Too tired to look into exactly why right now, but
>> it's clearly bound up in all the recent collation changes.

> Testing it again with very explicit collations, it still has issues:

Yeah, any sort of collation-sensitive operator in an index WHERE clause
was just plain broken.  Fixed now.

            regards, tom lane

Re: Partial index slower than regular index

От
Thom Brown
Дата:
On 7 April 2011 07:37, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thom Brown <thom@linux.com> writes:
>> On 6 April 2011 05:44, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> It looks like the index predicate expression isn't getting the right
>>> collation assigned, so predtest.c decides the query doesn't imply the
>>> index's predicate.  Too tired to look into exactly why right now, but
>>> it's clearly bound up in all the recent collation changes.
>
>> Testing it again with very explicit collations, it still has issues:
>
> Yeah, any sort of collation-sensitive operator in an index WHERE clause
> was just plain broken.  Fixed now.

Thanks Tom.

You said in the commit message that an initdb isn't required, but is
there anything else since 20th March that would cause cluster files to
break compatibility?  I'm now getting the following message:

toucan:postgresql thom$ pg_ctl start
server starting
toucan:postgresql thom$ FATAL:  database files are incompatible with server
DETAIL:  The database cluster was initialized with CATALOG_VERSION_NO
201103201, but the server was compiled with CATALOG_VERSION_NO
201104051.
HINT:  It looks like you need to initdb.

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Partial index slower than regular index

От
Thom Brown
Дата:
On 7 April 2011 08:10, Thom Brown <thom@linux.com> wrote:
> On 7 April 2011 07:37, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Thom Brown <thom@linux.com> writes:
>>> On 6 April 2011 05:44, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>> It looks like the index predicate expression isn't getting the right
>>>> collation assigned, so predtest.c decides the query doesn't imply the
>>>> index's predicate.  Too tired to look into exactly why right now, but
>>>> it's clearly bound up in all the recent collation changes.
>>
>>> Testing it again with very explicit collations, it still has issues:
>>
>> Yeah, any sort of collation-sensitive operator in an index WHERE clause
>> was just plain broken.  Fixed now.
>
> Thanks Tom.
>
> You said in the commit message that an initdb isn't required, but is
> there anything else since 20th March that would cause cluster files to
> break compatibility?  I'm now getting the following message:
>
> toucan:postgresql thom$ pg_ctl start
> server starting
> toucan:postgresql thom$ FATAL:  database files are incompatible with server
> DETAIL:  The database cluster was initialized with CATALOG_VERSION_NO
> 201103201, but the server was compiled with CATALOG_VERSION_NO
> 201104051.
> HINT:  It looks like you need to initdb.

Nevermind.  This was caused by "Add casts from int4 and int8 to numeric.".

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company