Обсуждение: IN operator causes sequential scan (vs. multiple OR expressions)
PostgreSQL version: 8.2.1 OS: Windows Server 2003 I have a relatively simple query where the planner chooses a sequential scan when using the IN operator but chooses an index scan when using logically equivalent multiple OR expressions. Here is the table structure and the two versions of the query: CREATE TABLE pool_sample ( id integer NOT NULL, state character varying(25) NOT NULL, not_pooled_reason character varying(25) NOT NULL, "comment" character varying(255), CONSTRAINT "pk_poolSample_id" PRIMARY KEY (id) ) WITHOUT OIDS; CREATE INDEX "idx_poolSample_state" ON pool_sample USING btree (state); The following query uses a sequential scan (i.e. filter) on the "state" column and takes about 5 seconds to execute (up to ~45 seconds with an "empty" cache): SELECT * FROM pool_sample ps WHERE ps.state IN ('PENDING_REPOOL_REVIEW', 'READY_FOR_REPOOL'); This version of the query uses an index scan on "state" and takes about 50 milliseconds: SELECT * FROM pool_sample ps WHERE ps.state = 'PENDING_REPOOL_REVIEW' OR ps.state = 'READY_FOR_REPOOL'; There are over 10 million rows in the pool_sample table and 518 rows meet the given criteria. In the first query, the planner estimates that nearly 10 million rows will be returned (i.e. almost all rows in the table). In the second query, the planner estimates 6830 rows, which seems close enough for the purposes of planning. If I explicitly cast the state column to text, the IN query uses an index scan and performs just as well as the multiple OR version: SELECT * FROM pool_sample ps WHERE ps.state::text IN ('PENDING_REPOOL_REVIEW', 'READY_FOR_REPOOL'); So it would appear that the planner automatically casts the state column to text within an OR expression but does not perform the cast in an IN expression. Our SQL is generated from an O/R mapper, so it's non-trivial (or at least undesirable) to hand tune every query like this with an explicit type cast. The only option I've come up with is to define the state column as text in the first place, thus avoiding the need to cast. Would this work? Are there any other/better options? Thanks, -Ryan
Ryan Holmes <ryan@hyperstep.com> writes: > I have a relatively simple query where the planner chooses a > sequential scan when using the IN operator but chooses an index scan > when using logically equivalent multiple OR expressions. EXPLAIN ANALYZE for both, please? If you set enable_seqscan = off, does that force an indexscan, and if so what does EXPLAIN ANALYZE show in that case? regards, tom lane
On Jan 27, 2007, at 3:53 PM, Tom Lane wrote: > Ryan Holmes <ryan@hyperstep.com> writes: >> I have a relatively simple query where the planner chooses a >> sequential scan when using the IN operator but chooses an index scan >> when using logically equivalent multiple OR expressions. > > EXPLAIN ANALYZE for both, please? > > If you set enable_seqscan = off, does that force an indexscan, and > if so > what does EXPLAIN ANALYZE show in that case? > > regards, tom lane Wow, I didn't expect such a quick response -- thank you! Note: I rebuilt the state column index and ran a VACUUM ANALYZE since my original post, so the planner's "rows" estimate is now different than the 6830 I mentioned. The planner estimate is actually *less* accurate now, but still in the ballpark relatively speaking. Here is the EXPLAIN ANALYZE for both queries with enable_seqscan = on : EXPLAIN ANALYZE SELECT * FROM pool_sample ps WHERE ps.state = 'PENDING_REPOOL_REVIEW' OR ps.state = 'READY_FOR_REPOOL'; Bitmap Heap Scan on pool_sample ps (cost=985.51..61397.50 rows=50022 width=539) (actual time=13.560..39.377 rows=518 loops=1) Recheck Cond: (((state)::text = 'PENDING_REPOOL_REVIEW'::text) OR ((state)::text = 'READY_FOR_REPOOL'::text)) -> BitmapOr (cost=985.51..985.51 rows=50084 width=0) (actual time=9.628..9.628 rows=0 loops=1) -> Bitmap Index Scan on "idx_poolSample_state" (cost=0.00..480.25 rows=25042 width=0) (actual time=0.062..0.062 rows=4 loops=1) Index Cond: ((state)::text = 'PENDING_REPOOL_REVIEW'::text) -> Bitmap Index Scan on "idx_poolSample_state" (cost=0.00..480.25 rows=25042 width=0) (actual time=9.563..9.563 rows=514 loops=1) Index Cond: ((state)::text = 'READY_FOR_REPOOL'::text) Total runtime: 39.722 ms EXPLAIN ANALYZE SELECT * FROM pool_sample ps WHERE ps.state IN ('PENDING_REPOOL_REVIEW', 'READY_FOR_REPOOL'); Seq Scan on pool_sample ps (cost=0.00..331435.92 rows=9667461 width=539) (actual time=1060.472..47584.542 rows=518 loops=1) Filter: ((state)::text = ANY (('{PENDING_REPOOL_REVIEW,READY_FOR_REPOOL}'::character varying [])::text[])) Total runtime: 47584.698 ms And now with enable_seqscan = off: EXPLAIN ANALYZE SELECT * FROM pool_sample ps WHERE ps.state = 'PENDING_REPOOL_REVIEW' OR ps.state = 'READY_FOR_REPOOL'; Bitmap Heap Scan on pool_sample ps (cost=985.51..61397.50 rows=50022 width=539) (actual time=0.324..0.601 rows=518 loops=1) Recheck Cond: (((state)::text = 'PENDING_REPOOL_REVIEW'::text) OR ((state)::text = 'READY_FOR_REPOOL'::text)) -> BitmapOr (cost=985.51..985.51 rows=50084 width=0) (actual time=0.287..0.287 rows=0 loops=1) -> Bitmap Index Scan on "idx_poolSample_state" (cost=0.00..480.25 rows=25042 width=0) (actual time=0.109..0.109 rows=4 loops=1) Index Cond: ((state)::text = 'PENDING_REPOOL_REVIEW'::text) -> Bitmap Index Scan on "idx_poolSample_state" (cost=0.00..480.25 rows=25042 width=0) (actual time=0.176..0.176 rows=514 loops=1) Index Cond: ((state)::text = 'READY_FOR_REPOOL'::text) Total runtime: 0.779 ms EXPLAIN ANALYZE SELECT * FROM pool_sample ps WHERE ps.state IN ('PENDING_REPOOL_REVIEW', 'READY_FOR_REPOOL'); Bitmap Heap Scan on pool_sample ps (cost=150808.51..467822.04 rows=9667461 width=539) (actual time=0.159..0.296 rows=518 loops=1) Recheck Cond: ((state)::text = ANY (('{PENDING_REPOOL_REVIEW,READY_FOR_REPOOL}'::character varying [])::text[])) -> Bitmap Index Scan on "idx_poolSample_state" (cost=0.00..148391.65 rows=9667461 width=0) (actual time=0.148..0.148 rows=518 loops=1) Index Cond: ((state)::text = ANY (('{PENDING_REPOOL_REVIEW,READY_FOR_REPOOL}'::character varying [])::text[])) Total runtime: 0.445 ms So, yes, disabling seqscan does force an index scan for the IN version. My question now is, how do I get PostgreSQL to make the "right" decision without disabling seqscan? Here are the non-default resource usage and query tuning settings from postgresql.conf: shared_buffers = 512MB work_mem = 6MB maintenance_work_mem = 256MB random_page_cost = 3.0 effective_cache_size = 1536MB from_collapse_limit = 12 join_collapse_limit = 12 The server has 4GB RAM, 2 X 2.4GHz Opteron dual core procs, 5 x 15k RPM disks in a RAID 5 array and runs Windows Server 2003 x64. Thanks, -Ryan
Ryan Holmes <ryan@hyperstep.com> writes: > So, yes, disabling seqscan does force an index scan for the IN > version. My question now is, how do I get PostgreSQL to make the > "right" decision without disabling seqscan? I pinged you before because in a trivial test case I got indexscans out of both text and varchar cases: regression=# create table foo (f1 text unique, f2 varchar(25) unique); NOTICE: CREATE TABLE / UNIQUE will create implicit index "foo_f1_key" for table "foo" NOTICE: CREATE TABLE / UNIQUE will create implicit index "foo_f2_key" for table "foo" CREATE TABLE regression=# explain select * from foo where f1 in ('foo', 'bar'); QUERY PLAN ------------------------------------------------------------------------- Bitmap Heap Scan on foo (cost=4.52..9.86 rows=2 width=61) Recheck Cond: (f1 = ANY ('{foo,bar}'::text[])) -> Bitmap Index Scan on foo_f1_key (cost=0.00..4.52 rows=2 width=0) Index Cond: (f1 = ANY ('{foo,bar}'::text[])) (4 rows) regression=# explain select * from foo where f2 in ('foo', 'bar'); QUERY PLAN ------------------------------------------------------------------------------------- Bitmap Heap Scan on foo (cost=6.59..17.27 rows=10 width=61) Recheck Cond: ((f2)::text = ANY (('{foo,bar}'::character varying[])::text[])) -> Bitmap Index Scan on foo_f2_key (cost=0.00..6.59 rows=10 width=0) Index Cond: ((f2)::text = ANY (('{foo,bar}'::character varying[])::text[])) (4 rows) But on closer inspection the second case is not doing the right thing: notice the rowcount estimate is 10, whereas it should be only 2 because of the unique index on f2. I poked into it and realized that in 8.2 scalararraysel() fails to deal with binary-compatible datatype cases, instead falling back to a not-very-bright generic estimate. I've committed a fix for 8.2.2, but in the meantime maybe you could change your varchar column to text? regards, tom lane
On Jan 27, 2007, at 5:56 PM, Tom Lane wrote: > I've committed a fix for 8.2.2, but in the meantime maybe you could > change your varchar column to text? > > regards, tom lane Thank you for the help and the fix. We're just performance testing right now so minor data model changes are no problem. Thanks, -Ryan