Обсуждение: how to enforce index sub-select over filter+seqscan
Hello! I have this table: create table test ( s1 varchar(255), s2 varchar(255), i1 integer, i2 integer, ... over 100 other fields ); table contains over 8 million records there's these indexes: create index is1 on test (s1); create index is2 on test (s2); create index ii1 on test (i1); create index ii2 on test (i2); create index ii3 on test (i1, i2); and then i run this query: select * from ( select * from test where is1 = 'aa' or is2 = 'aa' ) where is1 = 1 or (is1 = 1 and is2 = 1) or (is1 = 2 and is2 = 2) or (is1 = 3 and is2 = 3) where part of outer query can have different count of "or (is1 = N and is2 = M)" expressions, lets name this number X. When X is low planner chooses index scan using is1 and is2, then BitmapAnd that with index scan using ii1, ii2 or ii3. But when X is big enough (> 15) planner chooses seqscan and filter on i1, i2, s1, s2. Seqscan is very slow and I want to avoid it. Subquery is very fast and i don't know why postgres chooses that plan. I know I can set enable_seqscan = off. Is there other ways to enforce index usage? postgres pg_class have right estimate of rowcount. -- A: Because it messes up the order in which people normally read text. Q: Why is top-posting such a bad thing? A: Top-posting. Q: What is the most annoying thing in e-mail?
Dmitry Teslenko <dteslenko@gmail.com> wrote: > Seqscan is very slow and I want to avoid it. Subquery is very fast > and i don't know why postgres chooses that plan. > > I know I can set enable_seqscan = off. > Is there other ways to enforce index usage? If you come at it from that angle, you probably won't get the best resolution. PostgreSQL can see the alternative plans, and develops estimated costs of running each. It uses the one that it thinks will be fastest. If it's wrong, there's probably something wrong with the statistics it uses for estimating, or with the costing information. (There are some cases where it's not able to accurately estimate costs even if these are right, but let's check the more common cases first.) Please provide a little more information, like PostgreSQL version, the postgresql.conf contents (excluding comments), OS, hardware, and the EXPLAIN ANALYZE output of the query with and without enable_seqscan = off. Other useful ideas here: http://wiki.postgresql.org/wiki/SlowQueryQuestions -Kevin
On Thu, Sep 23, 2010 at 10:26 AM, Dmitry Teslenko <dteslenko@gmail.com> wrote: > Hello! > > I have this table: > > create table test ( > s1 varchar(255), > s2 varchar(255), > i1 integer, > i2 integer, > > ... over 100 other fields > > ); > > table contains over 8 million records > > there's these indexes: > > create index is1 on test (s1); > create index is2 on test (s2); > create index ii1 on test (i1); > create index ii2 on test (i2); > create index ii3 on test (i1, i2); > > and then i run this query: > > select > * > from ( > select * > from test > where > is1 = 'aa' or is2 = 'aa' > ) > where > is1 = 1 > or (is1 = 1 > and is2 = 1) > or (is1 = 2 > and is2 = 2) > or (is1 = 3 > and is2 = 3) hm, I think you meant to say: s1 = 'aa' or s2 = 'aa', i1 = 1 ... etc. details are important! Consider taking the combination of 'correct' pair of i1 and i2 and building a table with 'values' and joining to that: select * from test join ( values (2,2), (3,3), ... ) q(i1, i2) using(i1,i2) where s1 = 'aa' or s2 = 'aa' or i1=1 merlin
On Thu, Sep 23, 2010 at 10:26 AM, Dmitry Teslenko <dteslenko@gmail.com> wrote: > I know I can set enable_seqscan = off. > Is there other ways to enforce index usage? Not really, but I suspect random_page_cost and seq_page_cost might help the planner make better decisions. Is your data by any chance mostly cached in memory? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company