I have a table, "path", which is:
 pathid | integer | not null default nextval('path_pathid_seq'::regclass)
 path   | text    | not null
Indexes:
    "path_pkey" PRIMARY KEY, btree (pathid)
    "path_name_idx" btree (path)
The table contains approx 1.2 million rows, all of which are unique.
(Both the path field and the name field are unique, though the
path_name_idx index is not a unique index.)
On this table, I do a query like:
SELECT * FROM path WHERE path LIKE 'f:/userdirs/s/super_73/%'
The estimate for this query is completely off, which I believe is the
cause of a very bad choice of query plan when it's used in a big
join (creating nestloops that end up taking 15+ minutes to complete..).
Explain analyze gives:

                                QUERY PLAN
--------------------------------------------------------------------------
 Index Scan using path_name_idx on path  (cost=0.00..3.24 rows=1 width=74)
     (actual time=0.035..0.442 rows=214 loops=1)
   Index Cond: ((path >= 'f:/userdirs/s/super'::text) AND (path < 'f:/userdirs/s/supes'::text))
   Filter: (path ~~ 'f:/userdirs/s/super_73%'::text)
No matter what I search on (when it's very selective), the estimate is
always 1 row, whereas the actual value is at least a couple of hundred.
If I try with say "f:/us", the difference is 377,759 estimated vs
562,459 returned, which is percentage-wise a lot less, but...
I have tried upping the statistics target to 1000, with no change.
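(For reference, raising the per-column target was done along these lines; the column name comes from the table definition above, and 1000 is the target value mentioned:

ALTER TABLE path ALTER COLUMN path SET STATISTICS 1000;
ANALYZE path;  -- rebuild pg_statistic with the larger sample

)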
Any way to teach the planner about this?
FYI, if I push the result of the select on path into a temp table, and
then join against that in my main query, I get a hashjoin instead, and
the query time is < 30 seconds instead of 15 minutes...
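(A sketch of that workaround, in case it helps; "maintable" is a
placeholder here, since the real join target isn't shown in this post:

CREATE TEMP TABLE matched_paths AS
    SELECT pathid, path
    FROM path
    WHERE path LIKE 'f:/userdirs/s/super_73/%';

ANALYZE matched_paths;  -- give the planner real row counts

SELECT m.*
FROM maintable m        -- placeholder for the real main table
JOIN matched_paths p ON p.pathid = m.pathid;

With the ANALYZE in place the planner sees the actual ~214 rows rather
than the rows=1 estimate, which is presumably why it picks the hashjoin.)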
//Magnus