Darren King wrote:
>
[examples deleted]
> I see two ways to fix the above, one w/minimal code, second w/more work, but
> potentially better speed for large queries.
>
> 1. Put a sort node immediately before the group node, taking into account
> any user-given ordering. Also make sure the optimizer is aware of this sort
> when calculating query costs.
>
> 2. Instead of sorting the tuples before grouping, add a hashing system to
> the group node so that the pre-sorting is not necessary.
>
> Hmmm... is this a grouping problem or an aggregate problem? Or both? The first
> query above should have the data sorted before aggregating, shouldn't it, or
> am I still missing a piece of this puzzle?
>
> darrenk
The hash should work. If the hash key is built on the group-by items,
then any row with the same entries in these columns will get hashed to
the same result row. At this point, it should be fairly easy to
perform aggregation (test and substitute for min and max, add for
sum, keep a running sum and count for avg, etc).
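
Roughly, the idea could be sketched like this. This is only an illustration in Python, not backend code: the function name hash_aggregate, the dict-based rows, and the column names are all made up for the example, and NULL handling is ignored.

```python
def hash_aggregate(rows, group_cols, value_col):
    """Aggregate rows in a single pass, with no pre-sort.

    Hypothetical helper: rows are plain dicts, group_cols lists the
    group-by columns, value_col is the column being aggregated.
    """
    groups = {}  # hash table: group-by key -> running aggregate state
    for row in rows:
        # Rows with the same group-by values hash to the same entry.
        key = tuple(row[c] for c in group_cols)
        v = row[value_col]
        state = groups.get(key)
        if state is None:
            # First row of a new group initializes the state.
            groups[key] = {"min": v, "max": v, "sum": v, "count": 1}
        else:
            # Test and substitute for min and max.
            if v < state["min"]:
                state["min"] = v
            if v > state["max"]:
                state["max"] = v
            # Add for sum; count is kept so avg can be derived at the end.
            state["sum"] += v
            state["count"] += 1
    # Finalize: avg is computed from the running sum and count.
    return {k: {**s, "avg": s["sum"] / s["count"]} for k, s in groups.items()}
```

Note that the result rows come out in hash-table order, not sorted order, so any ordering the user asked for would still need a sort on the (usually much smaller) grouped output.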
Ocie