Обсуждение: Very long time to execute and Update, suggestions?
Hi all,
I have a table with a little over 200,000 columns in it that I need
to update using a regular expression. I admit that though I am not a
beginner and postgres, I am also far from an expert. :p
I tried to create an Index that would optimize the UPDATE but I may
have made an error in how I built it. Here is the table structure, the
index I tried to create and an 'EXPLAIN ANALYZE' of the UPDATE (though I
am still just now learning how to use 'EXPLAIN').
tle-bu=> \d file_info_3
Table "public.file_info_3"
Column | Type | Modifiers
-----------------+----------------------+-----------------------------------------
file_group_name | text | not null
file_group_uid | bigint | not null
file_mod_time | bigint | not null
file_name | text | not null
file_parent_dir | text | not null
file_perm | text | not null
file_size | bigint | not null
file_type | character varying(2) | not null default
'f'::character varying
file_user_name | text | not null
file_user_uid | bigint | not null
file_backup | boolean | not null default true
file_display | boolean | not null default false
file_restore | boolean | not null default false
Indexes:
"file_info_3_display_idx" btree (file_type, file_parent_dir, file_name)
Here is the EXPLAIN:
tle-bu=> EXPLAIN ANALYZE UPDATE file_info_3 SET file_backup='f' WHERE
file_parent_dir~'^/home' OR (file_parent_dir='/' AND file_name='home');
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Seq Scan on file_info_3 (cost=0.00..7770.00 rows=1006 width=206)
(actual time=1050.813..5648.462 rows=67835 loops=1)
Filter: ((file_parent_dir ~ '^/home'::text) OR ((file_parent_dir =
'/'::text) AND (file_name = 'home'::text)))
Total runtime: 68498.898 ms
(3 rows)
I thought that it would have used the index because 'file_parent_dir'
and 'file_name' are in the index but is I am reading the "EXPLAIN"
output right it isn't but is instead doing a sequencial scan. If that is
the case, how would I best built the index? Should I have just used the
'file_parent_dir' and 'file_name'?
Thanks all!!
Madison
--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Madison Kelly (Digimer)
TLE-BU, The Linux Experience; Back Up
http://tle-bu.thelinuxexperience.com
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Philip Hallstrom wrote: > I'm not sure about this which is why I'm replying off list, but your > index is on file_type, file_parent_dir, and file_name and you're query > is on file_parent_dir and file_name. > > I seem to remember reading that that the index will only get used if the > columns in the where clause "match up" "in order". > > That is um... if you have an index on columns a and b and a where clause > of "b = 1" it woin't use the index since the index "looks like" > > a, b > a, b > a, b > etc... > > Does that make any sense? Not sure if that's right or not, but easy > enough to remove the "file_type" from your index and try it. > > post back to the list if that's it. > > -philip Thanks for the reply! I have played around a little more and have created a few different test Indexes and it looks like it is the regex that is causing it to do the sequential scan. If I remove the regex and create a 'file_parent_dir', 'file_name' index it will use it. If I create an Index just for 'file_parent_dir' and change my UPDATE to just look for the regex '... WHERE file_parent_dir~'^/<dir>'...' it will still do the sequential scan anyway. So I need to either find an Index that will work with regexes or re-write my code to update each subdirectory separately and use simpler UPDATE statement for each. Thanks again! Madison PS - I cc'ed the list to follow up on what I found out so far. (Hi list!) -- -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- Madison Kelly (Digimer) TLE-BU, The Linux Experience; Back Up http://tle-bu.thelinuxexperience.com -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
> So I need to either find an Index that will work with regexes or
> re-write my code to update each subdirectory separately and use simpler
> UPDATE statement for each.
Why don't you use a LTREE type to model your directory tree ? It's been
designed specifically for this purpose and has indexed regular expression
search.
http://www.sai.msu.su/~megera/postgres/gist/ltree/
http://www.sai.msu.su/~megera/postgres/gist/