On Sun, 2008-06-08 at 19:03 -0400, Tom Lane wrote:
> Your argument seems to consider only columns having a normal
> distribution.  How badly does it fall apart for non-normal
> distributions?  (For instance, Zipfian distributions seem to be pretty
> common in database work, from what I've seen.)
> 
With "Idea 1: Keep an array of stadistinct that correspond to each
bucket size," I would expect the estimate to still be better than the
current one, because it keeps a separate ndistinct for each histogram
bucket rather than a single ndistinct for the whole column.
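
To make that intuition concrete, here is a toy sketch in Python. It is
not planner code, and everything in it is an assumption made for
illustration: the deterministic Zipf-like column, the equi-depth
bucketing, the function names, and the rule "bucket fraction divided by
the bucket's ndistinct" for an equality estimate. It only compares a
single column-wide ndistinct against one ndistinct per bucket.

```python
def zipf_column(n_values=100, scale=1000):
    """Deterministic Zipf-like column: value k appears scale // k times.

    The result is already sorted, which keeps the bucketing trivial.
    """
    col = []
    for k in range(1, n_values + 1):
        col.extend([k] * (scale // k))
    return col


def equi_depth_buckets(sorted_col, n_buckets):
    """Split a sorted column into n_buckets of (nearly) equal row count."""
    size = len(sorted_col) // n_buckets
    buckets = [sorted_col[i * size:(i + 1) * size]
               for i in range(n_buckets - 1)]
    buckets.append(sorted_col[(n_buckets - 1) * size:])  # remainder rows
    return buckets


def true_selectivity(col, value):
    return col.count(value) / len(col)


def global_estimate(col):
    """Equality selectivity from one ndistinct for the whole column."""
    return 1.0 / len(set(col))


def per_bucket_estimate(buckets, total_rows, value):
    """Equality selectivity from a separate ndistinct per bucket.

    Assumed rule (hypothetical, for illustration only): every bucket
    whose [min, max] range covers the value contributes its share of
    the rows divided by its own ndistinct.
    """
    est = 0.0
    for b in buckets:
        if b[0] <= value <= b[-1]:
            est += (len(b) / total_rows) / len(set(b))
    return est


if __name__ == "__main__":
    col = zipf_column()
    buckets = equi_depth_buckets(col, 10)
    for v in (1, 100):  # the most common and the rarest value
        print(v,
              true_selectivity(col, v),
              global_estimate(col),
              per_bucket_estimate(buckets, len(col), v))
```

In this toy case the per-bucket figure lands closer to the true
selectivity than the column-wide one for both the most common and the
rarest value, because the common values end up in buckets with very few
distinct values and the rare ones in buckets with many. It says nothing
about how a real implementation would treat MCVs or sampling error.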

Regards,
	Jeff Davis