Обсуждение: indexes not being used!
We have a query that is causing performance problems. The indexes do not appear to be used despite the fact that they exist. I dropped the table and recreated from scratch. I reindexed as well and still no luck. We vacuum analyze the system often. I will do my best to list all that is happening. Any assistance would be greatly appreciated. If you can offer some insight as to what the explains are telling me that would be so helpful. Is there some documentation somewhere that discusses such results?
Thanks a lot!
Jodi Kanter
This is the table:
genex=# \d groupref
Table "groupref"
Column | Type | Modifiers
-------------+---------+-----------
us_fk | integer | not null
gs_fk | integer | not null
ref_fk | integer | not null
permissions | integer |
Indexes: groupref_gs_fk_ind,
groupref_permissions,
groupref_us_fk_ind
Unique keys: groupref_ref_fk_ind
Table "groupref"
Column | Type | Modifiers
-------------+---------+-----------
us_fk | integer | not null
gs_fk | integer | not null
ref_fk | integer | not null
permissions | integer |
Indexes: groupref_gs_fk_ind,
groupref_permissions,
groupref_us_fk_ind
Unique keys: groupref_ref_fk_ind
This is the original query and it's explain results:
genex=# explain select am_pk, smp_fk, am_comments, hybridization_name from arraymeasurement, groupref, grouplink where (groupref.ref_fk=arraymeasurement.am_pk and ((groupref.us_fk=1 and groupref.us_fk=grouplink.us_fk and grouplink.gs_fk=groupref.gs_fk and (groupref.permissions&128)>0) or (groupref.gs_fk=grouplink.gs_fk and grouplink.us_fk=1 and (groupref.permissions&16)>0 )));
NOTICE: QUERY PLAN:
NOTICE: QUERY PLAN:
Nested Loop (cost=5.88..350.58 rows=1 width=55)
-> Hash Join (cost=5.88..47.57 rows=42 width=47)
-> Seq Scan on groupref (cost=0.00..38.83 rows=313 width=16)
-> Hash (cost=5.50..5.50 rows=150 width=31)
-> Seq Scan on arraymeasurement (cost=0.00..5.50 rows=150 width=31)
-> Seq Scan on grouplink (cost=0.00..1.78 rows=78 width=8)
-> Hash Join (cost=5.88..47.57 rows=42 width=47)
-> Seq Scan on groupref (cost=0.00..38.83 rows=313 width=16)
-> Hash (cost=5.50..5.50 rows=150 width=31)
-> Seq Scan on arraymeasurement (cost=0.00..5.50 rows=150 width=31)
-> Seq Scan on grouplink (cost=0.00..1.78 rows=78 width=8)
Why is it ignoring indexes? There is a double join between groupref and grouplink. I thought this could be the trouble but you'll se below that I started to simplify the query below and still have trouble.
Next I eliminated a join to the main data table and got this:
genex=# explain select ref_fk from groupref, grouplink where ((groupref.us_fk=1 and groupref.us_fk=grouplink.us_fk and grouplink.gs_fk=groupref.gs_fk and (groupref.permissions&128)>0) or (groupref.gs_fk=grouplink.gs_fk and grouplink.us_fk=1 and (groupref.permissions&16)>0 ));
NOTICE: QUERY PLAN:
NOTICE: QUERY PLAN:
Nested Loop (cost=0.00..2303.34 rows=1 width=24)
-> Seq Scan on groupref (cost=0.00..38.83 rows=313 width=16)
-> Seq Scan on grouplink (cost=0.00..1.78 rows=78 width=8)
-> Seq Scan on groupref (cost=0.00..38.83 rows=313 width=16)
-> Seq Scan on grouplink (cost=0.00..1.78 rows=78 width=8)
I thought that the OR part of the statement might be causing trouble so I eliminated it and got:
genex=# explain select ref_fk from groupref, grouplink where (groupref.us_fk=1 and groupref.us_fk=grouplink.us_fk and grouplink.gs_fk=groupref.gs_fk and (groupref.permissions&128)>0);
NOTICE: QUERY PLAN:
NOTICE: QUERY PLAN:
Hash Join (cost=18.71..20.89 rows=1 width=20)
-> Seq Scan on grouplink (cost=0.00..1.78 rows=78 width=8)
-> Hash (cost=18.71..18.71 rows=2 width=12)
-> Index Scan using groupref_us_fk_ind on groupref (cost=0.00..18.71 rows=2 width=12)
-> Seq Scan on grouplink (cost=0.00..1.78 rows=78 width=8)
-> Hash (cost=18.71..18.71 rows=2 width=12)
-> Index Scan using groupref_us_fk_ind on groupref (cost=0.00..18.71 rows=2 width=12)
Now an index is being used??
I did a vaccum analyze again and ran the same query. The explain results changed to this:
Merge Join (cost=34.96..35.86 rows=32 width=20)
-> Sort (cost=30.72..30.72 rows=95 width=12)
-> Seq Scan on groupref (cost=0.00..27.62 rows=95 width=12)
-> Sort (cost=4.23..4.23 rows=78 width=8)
-> Seq Scan on grouplink (cost=0.00..1.78 rows=78 width=8)
-> Sort (cost=30.72..30.72 rows=95 width=12)
-> Seq Scan on groupref (cost=0.00..27.62 rows=95 width=12)
-> Sort (cost=4.23..4.23 rows=78 width=8)
-> Seq Scan on grouplink (cost=0.00..1.78 rows=78 width=8)
I backed off to a very simple query and still cannot get the indexes working.
genex=# explain select ref_fk from groupref where (groupref.us_fk=1 and (groupref.permissions&128)>0);
NOTICE: QUERY PLAN:
NOTICE: QUERY PLAN:
Seq Scan on groupref (cost=0.00..27.62 rows=95 width=4)
EXPLAIN
genex=# explain select ref_fk from groupref where (groupref.us_fk=1);
NOTICE: QUERY PLAN:
genex=# explain select ref_fk from groupref where (groupref.us_fk=1);
NOTICE: QUERY PLAN:
Seq Scan on groupref (cost=0.00..22.01 rows=284 width=4)
_______________________________
Jodi L Kanter
BioInformatics Database Administrator
University of Virginia
(434) 924-2846
jkanter@virginia.edu
Jodi Kanter wrote: > We have a query that is causing performance problems. The indexes do > not appear to be used despite the fact that they exist. I dropped the > table and recreated from scratch. I reindexed as well and still no > luck. We vacuum analyze the system often. I will do my best to list > all that is happening. Any assistance would be greatly appreciated. > If you can offer some insight as to what the explains are telling me > that would be so helpful. Is there some documentation somewhere that > discusses such results? Thanks a lot! Jodi Kanter It's hard to tell from what you did post, but if the query returns a significant portion of the table then a seq scan is faster, and is properly picked by the optimizer. Please post EXPLAIN ANALYZE results. (and actually, this thread probably should be on the SQL or the PERFORM lists, not this one) Joe
I apologize if this is the wrong list. I have posted explain analyzes below. If you have time and can assist I would appreciate it. I will look into the other lists that you mentioned. Thanks Jodi explain analyze select am_pk, smp_fk, am_comments, hybridization_name from arraymeasurement, groupref, grouplink where (groupref.ref_fk=arraymeasurement.am_pk and ((groupref.us_fk=1 and groupref.us_fk=grouplink.us_fk and grouplink.gs_fk=groupref.gs_fk and (groupref.permissions&128)>0) or (groupref.gs_fk=grouplink.gs_fk and grouplink.us_fk=1 and (groupref.permissions&16)>0 ))); NOTICE: QUERY PLAN: Nested Loop (cost=5.88..350.58 rows=1 width=55) (actual time=2.37..73.58 rows=43 loops=1) -> Hash Join (cost=5.88..47.57 rows=42 width=47) (actual time=2.28..12.12 rows=43 loops=1) -> Seq Scan on groupref (cost=0.00..38.83 rows=313 width=16) (actual time=0.05..8.32 rows=275 loops=1) -> Hash (cost=5.50..5.50 rows=150 width=31) (actual time=2.06..2.06 rows=0 loops=1) -> Seq Scan on arraymeasurement (cost=0.00..5.50 rows=150 width=31) (actual time=0.04..1.45 rows=150 loops=1) -> Seq Scan on grouplink (cost=0.00..1.78 rows=78 width=8) (actual time=0.01..0.44 rows=78 loops=43) Total runtime: 74.00 msec explain analyze select ref_fk from groupref, grouplink where ((groupref.us_fk=1 and groupref.us_fk=grouplink.us_fk and grouplink.gs_fk=groupref.gs_fk and (groupref.permissions&128)>0) or (groupref.gs_fk=grouplink.gs_fk and grouplink.us_fk=1 and (groupref.permissions&16)>0 )); NOTICE: QUERY PLAN: Nested Loop (cost=0.00..2303.34 rows=1 width=24) (actual time=0.15..340.53 rows=272 loops=1) -> Seq Scan on groupref (cost=0.00..38.83 rows=313 width=16) (actual time=0.07..9.12 rows=275 loops=1) -> Seq Scan on grouplink (cost=0.00..1.78 rows=78 width=8) (actual time=0.01..0.42 rows=78 loops=275) Total runtime: 341.30 msec explain analyze select ref_fk from groupref, grouplink where (groupref.us_fk=1 and groupref.us_fk=grouplink.us_fk and grouplink.gs_fk=groupref.gs_fk and (groupref.permissions&128)>0); NOTICE: QUERY PLAN: Merge Join (cost=34.96..35.86 rows=32 width=20) (actual time=8.28..12.60 rows=252 loops=1) -> Sort (cost=30.72..30.72 rows=95 width=12) (actual time=7.17..7.59 rows=252 loops=1) -> Seq Scan on groupref (cost=0.00..27.62 rows=95 width=12) (actual time=0.05..4.87 rows=252 loops=1) -> Sort (cost=4.23..4.23 rows=78 width=8) (actual time=1.08..1.56 rows=299 loops=1) -> Seq Scan on grouplink (cost=0.00..1.78 rows=78 width=8) (actual time=0.02..0.46 rows=78 loops=1) Total runtime: 13.30 msec ----- Original Message ----- From: "Joe Conway" <mail@joeconway.com> To: "Jodi Kanter" <jkanter@virginia.edu> Cc: "Postgres Admin List" <pgsql-admin@postgresql.org> Sent: Friday, March 21, 2003 10:44 AM Subject: Re: [ADMIN] indexes not being used! > Jodi Kanter wrote: > > We have a query that is causing performance problems. The indexes do > > not appear to be used despite the fact that they exist. I dropped the > > table and recreated from scratch. I reindexed as well and still no > > luck. We vacuum analyze the system often. I will do my best to list > > all that is happening. Any assistance would be greatly appreciated. > > If you can offer some insight as to what the explains are telling me > > that would be so helpful. Is there some documentation somewhere that > > discusses such results? Thanks a lot! Jodi Kanter > > It's hard to tell from what you did post, but if the query returns a > significant portion of the table then a seq scan is faster, and is > properly picked by the optimizer. Please post EXPLAIN ANALYZE results. > > (and actually, this thread probably should be on the SQL or the PERFORM > lists, not this one) > > Joe > > >
Joe Conway <mail@joeconway.com> writes: > It's hard to tell from what you did post, but if the query returns a > significant portion of the table then a seq scan is faster, and is > properly picked by the optimizer. Please post EXPLAIN ANALYZE results. And do a VACUUM ANALYZE beforehand. The explain estimates look like the planner thinks all these tables are tiny. > (and actually, this thread probably should be on the SQL or the PERFORM > lists, not this one) Right. regards, tom lane
On Friday 21 March 2003 17:07, Jodi Kanter wrote: Hi, maybe you try to change the parameter in postgresql.conf and try again. There is also a way to set it at the psql command interface enable_seqscan = false regards -andreas -- Andreas Schmitz - Phone +49 201 8501 318 Cityweb-Technik-Service-Gesellschaft mbH Friedrichstr. 12 - Fax +49 201 8501 104 45128 Essen - email a.schmitz@cityweb.de
Jodi Kanter <jkanter@virginia.edu> writes: > I apologize if this is the wrong list. I have posted explain analyzes below. Given the small size of the tables, I think the planner is doing the right thing not to use indexes. Seqscans are probably less I/O until the tables get a lot bigger. The bitwise-and tests are not indexable anyway... you might want to reconsider that data representation, if you were hoping that the bit conditions would be useful for indexing. regards, tom lane
Tom Lane wrote: > Jodi Kanter <jkanter@virginia.edu> writes: > >>I apologize if this is the wrong list. I have posted explain analyzes below. > > > Given the small size of the tables, I think the planner is doing the > right thing not to use indexes. Seqscans are probably less I/O until > the tables get a lot bigger. If you want to convince yourself of this, try EXPLAIN ANALYZE on your query with enable_seqscan set to off (discard the first result however, because in either case some caching will happen). For example: regression=# select * from foo; f0 | f1 | f2 ----+------+------- 1 | cat1 | 1.21 2 | cat1 | 1.24 3 | cat1 | 1.18 4 | cat1 | 1.26 5 | cat1 | 1.15 6 | cat2 | 1.15 7 | cat2 | 1.26 8 | cat2 | 1.32 9 | cat2 | 1.3 10 | cat3 | 3.333 (10 rows) regression=# VACUUM ANALYZE; VACUUM regression=# create index foo_idx on foo(f0); CREATE INDEX regression=# explain analyze select * from foo where f0 = 1; QUERY PLAN -------------------------------------------------------------------------------------------- Seq Scan on foo (cost=0.00..1.12 rows=2 width=20) (actual time=0.03..0.05 rows=1 loops=1) Filter: (f0 = 1) Total runtime: 0.22 msec (3 rows) regression=# set enable_seqscan to off; SET regression=# explain analyze select * from foo where f0 = 1; QUERY PLAN ------------------------------------------------------------------------------------------------------------ Index Scan using foo_idx on foo (cost=0.00..3.01 rows=2 width=20) (actual time=0.06..0.07 rows=1 loops=1) Index Cond: (f0 = 1) Total runtime: 0.20 msec (3 rows) regression=# set enable_seqscan to on; SET regression=# explain analyze select * from foo where f0 = 1; QUERY PLAN -------------------------------------------------------------------------------------------- Seq Scan on foo (cost=0.00..1.12 rows=2 width=20) (actual time=0.03..0.05 rows=1 loops=1) Filter: (f0 = 1) Total runtime: 0.14 msec (3 rows) HTH, Joe
We just finished doing this and the queries were slower with the indexes being used. Apparently the optimizer knows what it is doing. The root of the problem goes back to a very slow link on a web page. Of course the programmer has stated that the problem is on the postgres side but I have yet to find one. Thanks Jodi ----- Original Message ----- From: "Joe Conway" <mail@joeconway.com> To: "Tom Lane" <tgl@sss.pgh.pa.us> Cc: "Jodi Kanter" <jkanter@virginia.edu>; "Postgres Admin List" <pgsql-admin@postgresql.org> Sent: Friday, March 21, 2003 1:12 PM Subject: Re: [ADMIN] indexes not being used! > Tom Lane wrote: > > Jodi Kanter <jkanter@virginia.edu> writes: > > > >>I apologize if this is the wrong list. I have posted explain analyzes below. > > > > > > Given the small size of the tables, I think the planner is doing the > > right thing not to use indexes. Seqscans are probably less I/O until > > the tables get a lot bigger. > > If you want to convince yourself of this, try EXPLAIN ANALYZE on your > query with enable_seqscan set to off (discard the first result however, > because in either case some caching will happen). For example: > > regression=# select * from foo; > f0 | f1 | f2 > ----+------+------- > 1 | cat1 | 1.21 > 2 | cat1 | 1.24 > 3 | cat1 | 1.18 > 4 | cat1 | 1.26 > 5 | cat1 | 1.15 > 6 | cat2 | 1.15 > 7 | cat2 | 1.26 > 8 | cat2 | 1.32 > 9 | cat2 | 1.3 > 10 | cat3 | 3.333 > (10 rows) > > regression=# VACUUM ANALYZE; > VACUUM > regression=# create index foo_idx on foo(f0); > CREATE INDEX > > regression=# explain analyze select * from foo where f0 = 1; > QUERY PLAN > -------------------------------------------------------------------------- ------------------ > Seq Scan on foo (cost=0.00..1.12 rows=2 width=20) (actual > time=0.03..0.05 rows=1 loops=1) > Filter: (f0 = 1) > Total runtime: 0.22 msec > (3 rows) > regression=# set enable_seqscan to off; > SET > regression=# explain analyze select * from foo where f0 = 1; > QUERY PLAN > -------------------------------------------------------------------------- ---------------------------------- > Index Scan using foo_idx on foo (cost=0.00..3.01 rows=2 width=20) > (actual time=0.06..0.07 rows=1 loops=1) > Index Cond: (f0 = 1) > Total runtime: 0.20 msec > (3 rows) > > regression=# set enable_seqscan to on; > SET > regression=# explain analyze select * from foo where f0 = 1; > QUERY PLAN > -------------------------------------------------------------------------- ------------------ > Seq Scan on foo (cost=0.00..1.12 rows=2 width=20) (actual > time=0.03..0.05 rows=1 loops=1) > Filter: (f0 = 1) > Total runtime: 0.14 msec > (3 rows) > > HTH, > > Joe > >