On Mon, Dec 02, 2002 at 05:10:00PM -0500, David F. Skoll wrote:
> Hi,
>
> I have a table with a column called "state". Each row can be in one
> of four states, let's call them 'new', 'pending', 'ok', and 'bad'.
> On average, about 95% of the rows will be 'bad', with the remaining
> 5% being in one of the other three states. If the table has 50K rows
> and I just want to pull out the 'ok' rows, I don't want to do a sequential
> scan. To pull out the 'bad' rows, obviously, sequential scan is fine.
>
> I've heard that a btree index performs badly in this situation. Is
> a hash index appropriate? I've heard bad things about hash indexes in
> PostgreSQL.
create table states (id serial primary key, state varchar(10), t text );
create function fill_states(varchar, int) returns bool as '
begin for i in 1 .. $2
loop
insert into states (state, t) values ($1, ''random'');
end loop;
return true;
end;
' language plpgsql;
select fill_states('ok',45000);
select fill_states('bad',5000);
select fill_states('warning',5000);
analyze states;
joel@joel=# explain select * from states where state='warning';
QUERY PLAN
-------------------------------------------------------------------------------
Index Scan using state_idx on states (cost=0.00..1013.00 rows=5533 width=20)
Index Cond: (state = 'warning'::character varying)
(2 rows)
joel@joel=# explain select * from states where state='ok';
QUERY PLAN
--------------------------------------------------------------
Seq Scan on states (cost=0.00..1171.51 rows=44627 width=20)
Filter: (state = 'ok'::character varying)
(2 rows)
Looks right to me: index scan for the less-common option, seqscan for
the most common. Why don't you think this, as a btree, will work for
you?
--
Joel BURTON | joel@joelburton.com | joelburton.com | aim: wjoelburton
Independent Knowledge Management Consultant