On Tue, 2005-04-26 at 15:00 -0700, Gurmeet Manku wrote:
> 2. In a single scan, it is possible to estimate n_distinct by using
> a very simple algorithm:
> "Distinct sampling for highly-accurate answers to distinct value
> queries and event reports" by Gibbons, VLDB 2001.
That looks like the one...
...though it looks like some more complex changes to the current
algorithm to use it, and we want the other stats as well...
> 3. In fact, Gibbon's basic idea has been extended to "sliding windows"
> (this extension is useful in streaming systems like Aurora / Stream):
> "Distributed streams algorithms for sliding windows"
> by Gibbons and Tirthapura, SPAA 2002.
...and this offers the possibility of calculating statistics at load
time, as part of the COPY command
Best Regards, Simon Riggs