Re: index problem
От | CoL |
---|---|
Тема | Re: index problem |
Дата | |
Msg-id | 3BCD9582.1070804@mportal.hu обсуждение исходный текст |
Ответ на | Re: index problem (Stephan Szabo <sszabo@megazone23.bigpanda.com>) |
Список | pgsql-sql |
Hi, Stephan Szabo wrote: > On Tue, 16 Oct 2001, CoL wrote: > > >>--------------------------- >>The 2 table query, where prog_data has ~8800 rowsn and index on prog_id: >>bash-2.04$ time echo "explain select distinct >>prog_id,prog_ftype,prog_fcasthour,prog_date from prog_dgy_xy,prog_data >>where pxygy_pid=prog_id " | psql -Uuser db >>NOTICE: QUERY PLAN: >> >>Unique (cost=7432549.69..7680455.07 rows=2479054 width=32) >> -> Sort (cost=7432549.69..7432549.69 rows=24790538 width=32) >> -> Merge Join (cost=148864.65..161189.33 rows=24790538 width=32) >> -> Index Scan using prog_data_pkey on prog_data >>(cost=0.00..701.12 rows=8872 width=28) >> -> Sort (cost=148864.65..148864.65 rows=921013 width=4) >> -> Seq Scan on prog_dgy_xy (cost=0.00..30145.13 >>rows=921013 width=4) >> > > I'm guessing that the approximately 25 million row estimate on the join > has to be wrong as well given that prog_data.prog_id should be unique. > > Hmm, does the explain change if you vacuum analyze the other table > (prog_data)? If not, what does explain show if you do a > set enable_seqscan='off'; > before it? The result: db=>set enable_seqscan='off'; db=>explain select distinct prog_id,prog_ftype,prog_fcasthour,prog_date from prog_dgy_xy,prog_data where pxygy_pid=prog_id; NOTICE: QUERY PLAN: Unique (cost=7606982.10..7854887.48 rows=2479054 width=32) -> Sort (cost=7606982.10..7606982.10 rows=24790538 width=32) -> Merge Join (cost=0.00..335621.73 rows=24790538 width=32) -> Index Scan using progdgyxy_idx2on prog_dgy_xy (cost=0.00..323297.05 rows=921013 width=4) -> Index Scan using prog_data_pkey on prog_data (cost=0.00..701.12 rows=8872 width=28) It "seems" index is used, but the same result :(((, and bigger execution time: real 3m41.830s What is in tables? prog_data contains unique id and other info. prog_dgy_xy contains that id with x,y coordinates (so many ids from prog_data with unique x,y) #prog_data: #prog_id, prog_ftype, prog_fcasthour, prog_date #1 'type' 6 2001-10-14 12:00:00 #2 'type' 12 2001-10-14 12:00:00 #prog_dgy_xy: #pxygy_pid, pxygy_x, pxygy_y #1 0.1 0.1 #1 0.1 0.15 How can this query takes real 0m1.755s for mysql, [17 sec for oracle], and 2-3 minutes!! for postgres? And why: POSTGRES: set enable_seqscan ='off'; select count(*) from prog_dgy_xy where pxygy_pid<13161; count -------- 900029 real 2m34.340s explain: Aggregate (cost=327896.89..327896.89 rows=1 width=0) -> Index Scan using progdgyxy_idx2 on prog_dgy_xy (cost=0.00..325594.54 rows=920940 width=0) MYSQL: select count(pxygy_pid) from PROG_DGY_XY where pxygy_pid<13161 count(pxygy_pid) 900029 real 0m27.878s explain: table type possible_keys key key_len ref rows Extra PROG_DGY_XY range progdgyxy_idx1,progdgyxy_idx2 progdgyxy_idx2 4 NULL 906856 where used; Using index The same time difference in case of: = or >, however explain says, cause seq scan is off, the index is used. I did vacuum, and vacuum analyze too before. PS: I think i have to make a site for that, cause there are many questions :), and weird things. I love postgres but this makes me "hm?". Today i'll make these test under 7.1.2. thx CoL
В списке pgsql-sql по дате отправления: