Slow query on join with Date >=

From	Jim Treinen
Subject	Slow query on join with Date >=
Date	January 30, 2014, 05:10:59
Msg-id	CAGtdQrkLhpk7hJ0P-i9m3A5hHVXgZdYSC4LeqSCxPJzhnPuUmg@mail.gmail.com
List	pgsql-performance

I have a performance problem using a dimensional model where the date is specified in a DATE dimension, specifically when using WHERE DATE >= 'Some Date'.

This query runs very fast when using an equality expression, e.g. WHERE DATE = '2014-01-01', and I'm wondering if there is a way to make it run fast when using the greater-than-or-equal expression.

The dimension table has about 5k rows, and the fact table has ~60M.

Thanks in advance for any advice.

JT.

The query :

select sid, count(*) from fact_data fact left outer join dim_date dim on dim.date_id = fact.date_id where dim.date >= '2014-1-25' group by sid order by count desc limit 10;
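For comparison, here is a minimal sketch of one alternative formulation that looks up the qualifying date_id values first and then filters the fact table on them directly. It assumes that date_id is unique in dim_date, which the table definitions below do not guarantee, since there is only a plain btree index on it. The alias cnt is introduced here for illustration. Whether the planner chooses a better plan for this form is not something the post establishes, so it would have to be checked with EXPLAIN (ANALYZE, BUFFERS).

```sql
-- Sketch only: assumes dim_date.date_id is unique, so the IN form
-- returns the same counts as the join in the query above.
-- The WHERE clause on dim.date already discards unmatched fact rows,
-- so the LEFT OUTER JOIN behaves like an inner join in the original.
SELECT fact.sid,
       count(*) AS cnt            -- "cnt" is an illustrative alias
FROM fact_data AS fact
WHERE fact.date_id IN (
        SELECT dim.date_id
        FROM dim_date AS dim
        WHERE dim.date >= DATE '2014-01-25'
      )
GROUP BY fact.sid
ORDER BY cnt DESC
LIMIT 10;
```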

FACT Table Definition:

                 Table "public.fact_data"
    Column     |            Type             | Modifiers
---------------+-----------------------------+-----------
 date_id       | integer                     |
 date          | timestamp without time zone |
 agent_id      | integer                     |
 instance_id   | integer                     |
 sid           | integer                     |
Indexes:
    "fact_agent_id" btree (agent_id)
    "fact_date_id" btree (date_id) CLUSTER
    "fact_alarms_sid" btree (sid)

                                  Table "public.dim_date"
       Column       |  Type   |                         Modifiers
--------------------+---------+------------------------------------------------------------
 date_id            | integer | not null default nextval('dim_date_date_id_seq'::regclass)
 date               | date    |
 year               | integer |
 month              | integer |
 month_name         | text    |
 day                | integer |
 day_of_year        | integer |
 weekday_name       | text    |
 calendar_week      | integer |
 quarter            | text    |
 year_quarter       | text    |
 year_month         | text    |
 year_calendar_week | text    |
 weekend            | text    |
 week_start_date    | date    |
 week_end_date      | date    |
 month_start_date   | date    |
 month_end_date     | date    |
Indexes:
    "dim_date_date" btree (date)
    "dim_date_date_id" btree (date_id)

EXPLAIN Output:

explain (analyze, buffers) select dim.date_id, fact.sid, count(1) from fact_data fact left outer join dim_date dim on dim.date_id = fact.date_id where dim.date_id >= 5139 group by 1,2 order by 3 desc limit 10;

QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Limit (cost=9772000.55..9772000.58 rows=10 width=8) (actual time=91064.421..91064.440 rows=10 loops=1)