On Fri, Mar 18, 2016 at 06:12:12PM -0700, Jeff Janes wrote:
> On Tue, Mar 15, 2016 at 8:36 AM, David Fetter <david@fetter.org> wrote:
> >
> > Please find attached a patch that uses the float8 version to cover the
> > numeric types.
>
> Is there a well-defined meaning for having a negative weight? If no,
> should it be disallowed?

Opinions on this appear to vary.  A Wikipedia article defines weights
as non-negative, while a manual to which it refers only uses non-zero
weights.

https://en.wikipedia.org/wiki/Weighted_arithmetic_mean#Mathematical_definition
https://www.gnu.org/software/gsl/manual/html_node/Weighted-Samples.html

I'm not sure which, if either, would be authoritative, but I could
certainly write up variants for each assumption.

The assumption they have in common about weights is that a row with a
zero weight takes no part in the calculation, and the previously
submitted code implements that assumption.

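To make the zero-weight rule concrete, here is a minimal Python sketch.
It is not the patch's C code: the function name is made up for
illustration, and it assumes the aggregate is the plain ratio
sum(w*x)/sum(w).  It only shows that a zero-weight row leaves the
result unchanged.

```python
from fractions import Fraction


def weighted_avg_sketch(pairs):
    """Weighted mean sum(w*x)/sum(w) over (x, w) pairs.

    Hypothetical helper, not the patch's implementation.
    Rows whose weight is zero are skipped entirely.
    """
    num = Fraction(0)
    den = Fraction(0)
    for x, w in pairs:
        if w == 0:
            continue  # a zero weight takes no part in the calculation
        num += Fraction(w) * Fraction(x)
        den += Fraction(w)
    if den == 0:
        return None  # no usable weights, so no defined mean
    return num / den


with_zero = [(1, 2), (100, 0), (4, 1)]
without_zero = [(1, 2), (4, 1)]
print(weighted_avg_sketch(with_zero))     # 2
print(weighted_avg_sketch(without_zero))  # 2
```
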
> I don't know what I was expecting, but not this:
>
> select weighted_avg(x,10000000-2*x) from generate_series(1,10000000) f(x);
> weighted_avg
> ------------------
> 16666671666717.1

I'm guessing that negative weights can cause bizarre outcomes,
assuming it turns out we should allow them.

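For what it's worth, the number in that example looks consistent with
the plain definition sum(w*x)/sum(w).  The weights 10000000-2*x sum to
-10000000, so the quotient lands far outside the range of x.  The
Python sketch below checks this with exact integer arithmetic on small
inputs and compares against the closed form (n+1)(n+2)/6.  The function
names are made up for illustration, and that the aggregate computes
exactly this ratio is an assumption.

```python
from fractions import Fraction


def exact_weighted_avg(n):
    """Exact sum(w*x)/sum(w) for x in 1..n with weight w = n - 2*x."""
    sum_w = 0
    sum_wx = 0
    for x in range(1, n + 1):
        w = n - 2 * x  # negative for every x > n/2
        sum_w += w
        sum_wx += w * x
    return sum_w, Fraction(sum_wx, sum_w)


def closed_form(n):
    """Algebraic simplification of the same ratio: (n+1)(n+2)/6."""
    return Fraction((n + 1) * (n + 2), 6)


sum_w, avg = exact_weighted_avg(10)
print(sum_w, avg)            # -10 22
print(closed_form(10**7))    # 16666671666667
```

The exact value for n = 10000000 agrees with the reported
16666671666717.1 to roughly 11 significant digits, which would fit
float8 accumulation error rather than a logic bug.
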
> Also, I think it might not give the correct answer even without
> negative weights:
>
> create table foo as select floor(random()*10000)::int val from
> generate_series(1,10000000);
>
> create table foo2 as select val, count(*) from foo group by val;
>
> Shouldn't these then give the same result:
>
> select stddev_samp(val) from foo;
> stddev_samp
> -------------------
> 2887.054977297105
>
> select weighted_stddev_samp(val,count) from foo2;
> weighted_stddev_samp
> ----------------------
> 2887.19919651336
>
> The 5th digit seems too early to be seeing round-off error.

Please pardon me if I've misunderstood, but you appear to be assuming
that

    SELECT val, count(*) FROM foo GROUP BY val

will produce precisely identical count(*)s at each row, which it is
overwhelmingly likely not to do, producing the difference you see
above.

What have I misunderstood?

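One thing that may be worth pinning down is which definition of
"weighted sample standard deviation" is intended.  The Python sketch
below compares two conventions on grouped (val, count) pairs.  It is
not the patch's code, and both function names are made up.  The first
treats weights as repeat counts and divides by sum(w) - 1.  The second
uses the estimator on the GSL page linked above, as I read it, which
divides by sum(w) - sum(w^2)/sum(w).

```python
import math
import random
import statistics
from collections import Counter


def stddev_frequency_weights(pairs):
    """Weights are repeat counts; denominator is sum(w) - 1."""
    total_w = sum(w for _, w in pairs)
    mean = sum(w * x for x, w in pairs) / total_w
    ss = sum(w * (x - mean) ** 2 for x, w in pairs)
    return math.sqrt(ss / (total_w - 1))


def stddev_reliability_weights(pairs):
    """Denominator is sum(w) - sum(w^2)/sum(w) (GSL-style estimator)."""
    total_w = sum(w for _, w in pairs)
    total_w2 = sum(w * w for _, w in pairs)
    mean = sum(w * x for x, w in pairs) / total_w
    ss = sum(w * (x - mean) ** 2 for x, w in pairs)
    return math.sqrt(ss * total_w / (total_w ** 2 - total_w2))


rng = random.Random(0)  # fixed seed so the run is repeatable
values = [rng.randrange(100) for _ in range(10_000)]
pairs = sorted(Counter(values).items())  # like SELECT val, count(*) ... GROUP BY val

unweighted = statistics.stdev(values)
freq = stddev_frequency_weights(pairs)
rel = stddev_reliability_weights(pairs)
print(unweighted, freq, rel)
```

In this sketch the frequency-weight version matches the unweighted
result on the expanded rows whatever the individual counts are, while
the other convention comes out slightly larger.
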
Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/