Re: slow plan on join when adding where clause
От | Mladen Gogala |
---|---|
Тема | Re: slow plan on join when adding where clause |
Дата | |
Msg-id | 4BB23DB6.6080103@vmsinfo.com обсуждение исходный текст |
Ответ на | slow plan on join when adding where clause ("L. Loewe" <lloewe@hotmail.com>) |
Список | pgsql-novice |
There is an old trick which can help you here. Try doing this: SELECT * FROM events, main WHERE main.ti > events.start and main.ti < events.stop and sensor_id+0=1; That will stop planner from merging two indexes and using bitmap. L. Loewe wrote: > Hi all > > > I have a fairly large table ("main" ~50M rows) containing a timestamp > (indexed), > a sensor_id (indexed) and some sensor data ranging over a few years. > > The second table ("events" <500 rows) contains a set of interesting > events with > an events.id and two timestamps: events.start and events.stop plus some > additional data. > > Now I want to join these tables to get the sensor data for those events. > > The interval between start and stop is quite short for each event > (usually a > couple of minutes) so that there aren't too many rows from table "main" > matching this criteria (~1K) for each event: > > SELECT * FROM events, main WHERE main.ti > events.start and main.ti < > events.stop; > > EXPLAIN ANALYZE gives > > Nested Loop (cost=0.00..27621542.27 rows=524505120 width=68) > (actual time=0.038..42.314 rows=69209 loops=1) > -> Seq Scan on events (cost=0.00..2.24 rows=124 width=28) > (actual time=0.006..0.025 rows=124 loops=1) > -> Index Scan using idx_main_ti on main (cost=0.00..159306.16 > rows=4229880 width=40) > (actual time=0.016..0.178 rows=558 loops=124) > Index Cond: ((main.ti > events.start) AND (main.ti < > events.stop)) > Total runtime: 47.682 ms > > So far so good, however if I add a sensor_id constraint the planner > chooses a > different approach: > > > SELECT * FROM events, main WHERE main.ti > events.start and > main.ti < events.stop and > sensor_id=1; > > Nested Loop (cost=7309.32..1422246.30 rows=4795865 width=68) > (actual time=23427.599..23886.276 rows=772 loops=1) > Join Filter: ((main.ti > events.start) AND (main.ti < events.stop)) > -> Bitmap Heap Scan on main (cost=7306.96..343174.23 rows=348087 > width=40) > (actual time=3771.719..9508.728 rows=490984 loops=1) > Recheck Cond: (sensor_id= 1) > -> Bitmap Index Scan on idx_main_sens (cost=0.00..7219.94 > rows=348087 width=0) > (actual time=3769.075..3769.075 rows=491102 loops=1) > Index Cond: (sensor_id= 1) > -> Materialize (cost=2.36..3.60 rows=124 width=28) > (actual time=0.000..0.010 rows=124 loops=490984) > -> Seq Scan on events (cost=0.00..2.24 rows=124 width=28) > (actual time=0.005..0.021 rows=124 loops=1) > Total runtime: 23886.494 ms > > > Apparently the planner assumes that sensor_id=1 will return fewer rows > than > the time constraint while the opposite is true: > sensor_id=1 -> ~ 500K , time window -> ~ 1K. > > Is there a way to hint the planner to use plan 1 even with the > sensor_id=1 > clause or am I doing something fundamentally wrong here? > > Thanks > -- Mladen Gogala Sr. Oracle DBA 1500 Broadway New York, NY 10036 (212) 329-5251 http://www.vmsinfo.com The Leader in Integrated Media Intelligence Solutions
В списке pgsql-novice по дате отправления: