"Jim C. Nasby" <decibel@decibel.org> writes:
> In English, each bucket defines a specific time period, and no two
> buckets can overlap (though there are no constraints defined to actually
> prevent that). So in reality each row in page_log.log will in fact
> match only one row in bucket (at least for each value of rrs_id).
> Given that, would the optimizer make a better choice if it knew that
> (since it means a much smaller result set)?

Given that the join condition is not an equality, there's no hope of
using hash or merge join; so the join itself is about as good as you're
gonna get. With a more accurate rows estimate for the join result, it
might have decided to use HashAggregate instead of Sort/GroupAggregate,
but AFAICS that would not have made a huge difference ... at best maybe
25% of the total query time.
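
To make the equality point concrete, here is a rough sketch in Python. The
data, column layout, and function names are all made up for illustration, and
this is not what the executor actually does. A hash join works by probing a
hash table with a key, which only answers "is there an equal key?". A range
condition like start <= ts < end has no such key to probe with, so every log
row has to be compared against the candidate buckets.

```python
from collections import defaultdict

# Hypothetical toy data: (bucket_id, start, end), half-open and non-overlapping.
buckets = [(1, 0, 10), (2, 10, 20), (3, 20, 30)]
# Hypothetical log rows: (log_id, timestamp).
log = [(100, 3), (101, 12), (102, 19), (103, 25)]


def nested_loop_range_join(log_rows, bucket_rows):
    """Range join: each log row is compared against each bucket in turn."""
    out = []
    for log_id, ts in log_rows:
        for bucket_id, start, end in bucket_rows:
            if start <= ts < end:  # inequality condition, nothing to hash on
                out.append((log_id, bucket_id))
    return out


def hash_equi_join(left, right):
    """Equality join on the first field: build a hash table, then probe it."""
    table = defaultdict(list)
    for key, payload in right:
        table[key].append(payload)
    # Probing only finds rows whose key is exactly equal to the probe key.
    return [(key, lp, rp) for key, lp in left for rp in table.get(key, [])]


range_result = nested_loop_range_join(log, buckets)
equi_result = hash_equi_join([(1, "a"), (2, "b")], [(1, "x"), (3, "y")])
```

Because the toy buckets don't overlap, each log row comes back paired with
exactly one bucket, but the loop still has to look at every pair to find that
out.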

> Is there any way to tell the
> optimizer this is the case?

Nope. This gets back to the old problem of not having any cross-column
(cross-table in this case) statistics.
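
As a toy model of why that matters (hypothetical data again, and emphatically
not PostgreSQL's actual selectivity code): if each half of the range condition
is estimated on its own and the two selectivities are multiplied as though
they were independent, the join estimate lands far above the true row count,
because nothing records that the lower and upper bounds of a bucket move
together.

```python
# Toy model only: hypothetical data, not PostgreSQL's selectivity code.
n_buckets = 100
width = 10
# Non-overlapping half-open buckets: [0, 10), [10, 20), ...
buckets = [(i * width, (i + 1) * width) for i in range(n_buckets)]
# One log row per time unit, so every row falls in exactly one bucket.
log = list(range(n_buckets * width))


def clause_selectivity(pred):
    """Fraction of all (log, bucket) pairs satisfying a single clause."""
    pairs = len(log) * len(buckets)
    hits = sum(1 for ts in log for start, end in buckets if pred(ts, start, end))
    return hits / pairs


# Each clause measured in isolation.
sel_lower = clause_selectivity(lambda ts, start, end: ts >= start)
sel_upper = clause_selectivity(lambda ts, start, end: ts < end)

# Independence assumption: multiply the per-clause selectivities.
estimated_rows = sel_lower * sel_upper * len(log) * len(buckets)

# What the join really returns when both clauses are applied together.
actual_rows = sum(1 for ts in log for start, end in buckets if start <= ts < end)
```

With these numbers each clause on its own passes about half of all pairs, so
the independent estimate comes out roughly 25 times larger than the actual
result, which is just one row per log entry.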

regards, tom lane