Обсуждение: Standard Deviation function.
Hello,
I really need a Standard Deviation aggregate function so I will
try to write one.
I know about the man pages for "create aggregate" and "create
function". Is there something else I should look at?
Just a few pointers could save me a few hours of hunting around.
All advice accepted.
It seems kind of hard to do with only two state functions unless
I "cheat". I need to keep three values, Count, Sum, and Sum of
Squares. I could use three static variables and have the final
function ignore its input and use the static vars instead. This
will likely blow up if the new Standard Deviation aggregate is
used twice in the same select.
Any hints or advice??
If someone has this done already let me know.
I may want do a "median" aggregate function too as I'll need that
later. This would require private storage and a sort.
Could you cc me at both addresses below as I move around between
them
Thanks,
--
--Chris Albertson
chrisja@jps.net
chris@topdog.logicon.com Voice: 626-351-0089 X127
Fax: 626-351-0699
> I really need a Standard Deviation aggregate function... > > I know about the man pages for "create aggregate" and "create > function". Is there something else I should look at? > > It seems kind of hard to do with only two state functions unless > I "cheat". I need to keep three values, Count, Sum, and Sum of > Squares. > > Any hints or advice?? I thought about this a long time ago and had an idea but never got around to trying to implement it. I was going to have some functions that worked on a structure of two doubles to track the sum and square instead of using only one simple type. darrenk
> > > I really need a Standard Deviation aggregate function... > > > > I know about the man pages for "create aggregate" and "create > > function". Is there something else I should look at? > > > > It seems kind of hard to do with only two state functions unless > > I "cheat". I need to keep three values, Count, Sum, and Sum of > > Squares. > > > > Any hints or advice?? > > I thought about this a long time ago and had an idea but never > got around to trying to implement it. I was going to have some > functions that worked on a structure of two doubles to track > the sum and square instead of using only one simple type. I remember talking about this to someone, and the problem is that you needed the average WHILE scanning through the table, which required two passes, which the aggregate system is not designed to do. I may be wrong on this, though. -- Bruce Momjian | 830 Blythe Avenue maillist@candle.pha.pa.us | Drexel Hill, Pennsylvania 19026 + If your life is a hard drive, | (610) 353-9879(w) + Christ can be your backup. | (610) 853-3000(h)
> > > I really need a Standard Deviation aggregate function... > > > > I thought about this a long time ago and had an idea but never > > got around to trying to implement it. I was going to have some > > functions that worked on a structure of two doubles to track > > the sum and square instead of using only one simple type. > > I remember talking about this to someone, and the problem is that you > needed the average WHILE scanning through the table, which required two > passes, which the aggregate system is not designed to do. I may be > wrong on this, though. I had asked you how to calculate this and the variance early last year. One (I think the variance) was two-pass because of the need for the average, but I thought the StdDev would work with the struct. Been a while and I still haven't configured #(*&^ FreeBSD ppp yet. darrenk
> I had asked you how to calculate this and the variance early last > year. One (I think the variance) was two-pass because of the need > for the average, but I thought the StdDev would work with the struct. Variance is just square of std. dev, no? > > Been a while and I still haven't configured #(*&^ FreeBSD ppp yet. Bummer. -- Bruce Momjian | 830 Blythe Avenue maillist@candle.pha.pa.us | Drexel Hill, Pennsylvania 19026 + If your life is a hard drive, | (610) 353-9879(w) + Christ can be your backup. | (610) 853-3000(h)
>
> > > > I really need a Standard Deviation aggregate function...
> > >
> > > I thought about this a long time ago and had an idea but never
> > > got around to trying to implement it. I was going to have some
> > > functions that worked on a structure of two doubles to track
> > > the sum and square instead of using only one simple type.
> >
> > I remember talking about this to someone, and the problem is that you
> > needed the average WHILE scanning through the table, which required two
> > passes, which the aggregate system is not designed to do. I may be
> > wrong on this, though.
>
> I had asked you how to calculate this and the variance early last
> year. One (I think the variance) was two-pass because of the need
> for the average, but I thought the StdDev would work with the struct.
>
> Been a while and I still haven't configured #(*&^ FreeBSD ppp yet.
The Perl Module "Statistics/Descriptive" has on the fly variance calculation.
sub add_data {
my $self = shift; ##Myself
my $oldmean;
my ($min,$mindex,$max,$maxdex);
##Take care of appending to an existing data set
$min = (defined ($self->{min}) ? $self->{min} : $_[0]);
$max = (defined ($self->{max}) ? $self->{max} : $_[0]);
$maxdex = $self->{maxdex} || 0;
$mindex = $self->{mindex} || 0;
##Calculate new mean, pseudo-variance, min and max;
foreach (@_) {
$oldmean = $self->{mean};
$self->{sum} += $_;
$self->{count}++;
if ($_ >= $max) {
$max = $_;
$maxdex = $self->{count}-1;
}
if ($_ <= $min) {
$min = $_;
$mindex = $self->{count}-1;
}
$self->{mean} += ($_ - $oldmean) / $self->{count};
$self->{pseudo_variance} += ($_ - $oldmean) * ($_ - $self->{mean});
}
$self->{min} = $min;
$self->{mindex} = $mindex;
$self->{max} = $max;
$self->{maxdex} = $maxdex;
$self->{sample_range} = $self->{max} - $self->{min};
if ($self->{count} > 1) {
$self->{variance} = $self->{pseudo_variance} / ($self->{count} -1);
$self->{standard_deviation} = sqrt( $self->{variance});
}
return 1;
}
-dg
David Gould dg@illustra.com 510.628.3783 or 510.305.9468
Informix Software (No, really) 300 Lakeside Drive Oakland, CA 94612
"Don't worry about people stealing your ideas. If your ideas are any
good, you'll have to ram them down people's throats." -- Howard Aiken