Обсуждение: 9.5alpha1: Partial index not used
Consider this table:
wdsold=3D> \d concept
Table "public.concept"
Column | Type | Mod=
ifiers =20
-------------------+-----------------------------+-------------------------=
-----------------------------
id | integer | not null default nextval=
('concept_id_seq'::regclass)
canonicalname | character varying |=20
description | character varying |=20
start | boolean |=20
hidden | boolean |=20
sortorder | integer |=20
valid_from | timestamp without time zone | not null default now()
from_job_queue_id | integer |=20
Indexes:
"concept_pkey" PRIMARY KEY, btree (id)
"concept_canonicalname_idx" btree (canonicalname)
"concept_start_idx" btree (start) WHERE start IS NOT NULL
Referenced by:
TABLE "facttablemetadata" CONSTRAINT "facttablemetadata_dimension_fkey"=
FOREIGN KEY (dimension) REFERENCES concept(id)
TABLE "facttablemetadata" CONSTRAINT "facttablemetadata_member_fkey" FO=
REIGN KEY (member) REFERENCES concept(id)
TABLE "relation" CONSTRAINT "relation_child_fkey" FOREIGN KEY (child) R=
EFERENCES concept(id) DEFERRABLE
TABLE "relation" CONSTRAINT "relation_parent_fkey" FOREIGN KEY (parent)=
REFERENCES concept(id) DEFERRABLE
TABLE "term" CONSTRAINT "term_concept_id_fkey" FOREIGN KEY (concept_id)=
REFERENCES concept(id) DEFERRABLE
wdsold=3D> select start, count(*) from concept group by start order by star=
t;
start | count =20
-------+---------
t | 3
| 3431866
(2 rows)
and this query:
select id as IdValue, canonicalname as Description, null as IsLeaf from pub=
lic.concept where start;
Clearly this should be able to use the partial index (if start is true
it is also not null) and since there are only 3 out of 3 million rows in
result it would also be beneficial (and PostgreSQL 9.1 did use the
index).
However, it PostgreSQL 9.5 doesn't use it:
wdsold=3D> explain analyze select id as IdValue, canonicalname as Descripti=
on, null as IsLeaf from public.concept where start;
QUERY PLAN =
=20
---------------------------------------------------------------------------=
-----------------------------
Seq Scan on concept (cost=3D0.00..81659.03 rows=3D1 width=3D27) (actual t=
ime=3D0.026..955.889 rows=3D3 loops=3D1)
Filter: start
Rows Removed by Filter: 3431866
Planning time: 0.193 ms
Execution time: 955.926 ms
(5 rows)
Even if I try to force it:
wdsold=3D> set enable_seqscan to off;
SET
Time: 0.540 ms
wdsold=3D> explain analyze select id as IdValue, canonicalname as Descripti=
on, null as IsLeaf from public.concept where start;
QUERY PLAN =
=20
---------------------------------------------------------------------------=
---------------------------------------------
Seq Scan on concept (cost=3D10000000000.00..10000099078.69 rows=3D1 width=
=3D27) (actual time=3D0.014..948.738 rows=3D3 loops=3D1)
Filter: start
Rows Removed by Filter: 3431866
Planning time: 0.060 ms
Execution time: 948.777 ms
(5 rows)
So it obviously thinks that it can't use the index.
However, if I create a full index:
wdsold=3D> create index on concept(start);
CREATE INDEX
Time: 5899.635 ms
wdsold=3D> explain analyze select id as IdValue, canonicalname as Descripti=
on, null as IsLeaf from public.concept where start;
QUERY PLAN =
=20
---------------------------------------------------------------------------=
-------------------------------------------------
Index Scan using concept_start_idx on concept (cost=3D0.43..3.05 rows=3D1=
width=3D27) (actual time=3D0.501..0.535 rows=3D3 loops=3D1)
Index Cond: (start =3D true)
Filter: start
Planning time: 0.731 ms
Execution time: 0.577 ms
(5 rows)
it is used, and also if I create a partial index just on true values:
wdsold=3D> create index on concept(start) where start;
CREATE INDEX
Time: 937.267 ms
wdsold=3D> explain analyze select id as IdValue, canonicalname as Descripti=
on, null as IsLeaf from public.concept where start;
QUERY PLAN =
=20
---------------------------------------------------------------------------=
-------------------------------------------------
Index Scan using concept_start_idx on concept (cost=3D0.13..2.75 rows=3D1=
width=3D27) (actual time=3D0.028..0.033 rows=3D3 loops=3D1)
Index Cond: (start =3D true)
Planning time: 0.499 ms
Execution time: 0.073 ms
(4 rows)
it is also used.=20
So I think the problem is that PostgreSQL 9.5alpha1 doesn't consider
true to be a subset of the non-null values for the purpose of index
selection.
hp
--=20
_ | Peter J. Holzer | I want to forget all about both belts and
|_|_) | | suspenders; instead, I want to buy pants=20
| | | hjp@hjp.at | that actually fit.
__/ | http://www.hjp.at/ | -- http://noncombatant.org/
"Peter J. Holzer" <hjp-pgsql@hjp.at> writes:
> Consider this table:
> ...
> "concept_start_idx" btree (start) WHERE start IS NOT NULL
> and this query:
> select id as IdValue, canonicalname as Description, null as IsLeaf from public.concept where start;
> Clearly this should be able to use the partial index (if start is true
> it is also not null)
As you surmise, there's no proof rule for that.
> and since there are only 3 out of 3 million rows in
> result it would also be beneficial (and PostgreSQL 9.1 did use the
> index).
... and there wasn't in 9.1 either. I get a seqscan from examples like
this in every branch back to 8.3, which is as far back as I can test
conveniently.
This is the exact test case I'm using:
create table foo as select null::bool as start, generate_series(1,100000) as x;
update foo set start = true where x < 10;
create index foos on foo (start) where start is not null;
analyze foo;
explain select * from foo where start;
explain select * from foo where start = true;
explain select * from foo where start is not null;
Only the last case produces use of the index. I agree that it'd be better
if they all did, but I'm disinclined to consider it a bug fix unless you
can show a specific case in which there's a performance regression from
older releases.
regards, tom lane
On 2015-07-31 20:03:41 -0400, Tom Lane wrote: > "Peter J. Holzer" <hjp-pgsql@hjp.at> writes: > > Consider this table: > > ... > > "concept_start_idx" btree (start) WHERE start IS NOT NULL >=20 > > and this query: >=20 > > select id as IdValue, canonicalname as Description, null as IsLeaf from= public.concept where start; >=20 > > Clearly this should be able to use the partial index (if start is true > > it is also not null) >=20 > As you surmise, there's no proof rule for that. >=20 > > and since there are only 3 out of 3 million rows in result it would > > also be beneficial (and PostgreSQL 9.1 did use the index). >=20 > ... and there wasn't in 9.1 either. I get a seqscan from examples like > this in every branch back to 8.3, which is as far back as I can test > conveniently. This is weird. I do remember that I tested various indexes until I found one which was actually used on the development server (which was 9.1 at the time and upgraded to 9.5 recently). However, on the test system (still on 9.1) I can't get postgres to use the index either.=20 So I must assume that I'm either completely misremembering or that I changed the index after that for some reason I don't remember. Sorry, my bad. > Only the last case produces use of the index. I agree that it'd be better > if they all did, but I'm disinclined to consider it a bug fix unless you > can show a specific case in which there's a performance regression from > older releases. "grossly incorrect plan choices are cause for a bug report" (from http://www.postgresql.org/docs/9.5/static/indexes-partial.html) ;-) Yeah, not a regression, and even though I consider that "grossly incorrect", not high on my priority list (since the "workaround" is arguably "more correct" in my case). So please consider it a feature request instead of a bug report. hp --=20 _ | Peter J. Holzer | I want to forget all about both belts and |_|_) | | suspenders; instead, I want to buy pants=20 | | | hjp@hjp.at | that actually fit. __/ | http://www.hjp.at/ | -- http://noncombatant.org/
"Peter J. Holzer" <hjp-pgsql@hjp.at> writes: > On 2015-07-31 20:03:41 -0400, Tom Lane wrote: >> Only the last case produces use of the index. I agree that it'd be better >> if they all did, but I'm disinclined to consider it a bug fix unless you >> can show a specific case in which there's a performance regression from >> older releases. > "grossly incorrect plan choices are cause for a bug report" (from > http://www.postgresql.org/docs/9.5/static/indexes-partial.html) ;-) Sorry, I phrased that poorly. I meant that I'm disinclined to change this behavior in stable branches unless it's a regression. I agree that it's reasonable to fix it in 9.5/HEAD, and I've now done so. regards, tom lane
On 2015-08-01 14:33:50 -0400, Tom Lane wrote: > "Peter J. Holzer" <hjp-pgsql@hjp.at> writes: > > On 2015-07-31 20:03:41 -0400, Tom Lane wrote: > >> I agree that it'd be better if they all did, but I'm disinclined to > >> consider it a bug fix unless you can show a specific case in which > >> there's a performance regression from older releases. >=20 > > "grossly incorrect plan choices are cause for a bug report" (from > > http://www.postgresql.org/docs/9.5/static/indexes-partial.html) ;-) >=20 > Sorry, I phrased that poorly. I meant that I'm disinclined to change > this behavior in stable branches unless it's a regression. I agree > that it's reasonable to fix it in 9.5/HEAD, and I've now done so. Cool, thanks! hp --=20 _ | Peter J. Holzer | I want to forget all about both belts and |_|_) | | suspenders; instead, I want to buy pants=20 | | | hjp@hjp.at | that actually fit. __/ | http://www.hjp.at/ | -- http://noncombatant.org/