Discussion: a math question
I have a math question and a benchmarking question, and I'm not sure how to go about the benchmarking.

I'm trying to use pgsql as a Bayes token store for a spam filter I'm writing. The data structure has index keys and two integer fields, 'h_msgs' and 's_msgs', for each token, plus another pair for each user (H_msgs, S_msgs), making four pieces of data for each user-token relationship. For Bayes these are run through an equation of the form:

(s_msgs/S_msgs)/(s_msgs/S_msgs + h_msgs/H_msgs)

which I currently compute in Perl. In pgsql I have to modify this a bit with 'cast(s_msgs as double precision)' or 'cast(s_msgs as real)' in order to get floating point math:

( cast(s_msgs as double precision)/S_msgs) and so on...

Question: Is there a better way to get floating point math out of a set of integers?

It occurred to me that if I let pgsql do this, it should be considerably faster, since Perl is slower than C. But I don't know of a good way to prove this. The data retrieval process tends to dwarf everything else, which may mean I shouldn't waste my time on this anyway. Still, I was wondering whether the thinking is valid, and how I might benchmark the difference.
tom wrote:
> In pgsql I have to modify this a bit with 'cast (s_msgs as double
> precision)' or 'cast(s_msgs as real)' in order to get floating point math.
> ( cast(s_msgs as double precision)/S_msgs) and so on...
>
> Question: Is there a better way to get floating point math out of a set
> of integers?

Nope. This is mentioned in the docs:

http://www.postgresql.org/docs/8.2/static/functions-math.html

division (integer division truncates results)

I'm sure it's because of the SQL spec, but someone else will throw in their 2c if that's wrong ;)

You only need one real or double precision operand in a division for the result not to be truncated; you don't need to cast everything.

--
Postgresql & php tutorials
http://www.designmagick.com/
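To make the truncation point concrete, here is a minimal sketch. It assumes an in-memory SQLite database (via Python's standard `sqlite3` module) as a stand-in for pgsql, since both truncate integer/integer division. The table name `token_counts`, the column names `user_s_msgs` and `user_h_msgs` (standing in for S_msgs and H_msgs, because unquoted SQL identifiers are case-insensitive), and the counts themselves are made up for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL here; both truncate int / int.
conn = sqlite3.connect(":memory:")

# Hypothetical layout: per-token counts plus the per-user totals in one row.
conn.execute(
    "CREATE TABLE token_counts ("
    " s_msgs INTEGER, h_msgs INTEGER,"
    " user_s_msgs INTEGER, user_h_msgs INTEGER)"
)
conn.execute("INSERT INTO token_counts VALUES (3, 1, 40, 60)")

# Integer divided by integer: the fractional part is dropped.
truncated = conn.execute(
    "SELECT s_msgs / user_s_msgs FROM token_counts"
).fetchone()[0]

# Casting one operand is enough to get a floating point division.
one_cast = conn.execute(
    "SELECT CAST(s_msgs AS REAL) / user_s_msgs FROM token_counts"
).fetchone()[0]

# The full expression from the original post, with one cast per division.
spam_prob = conn.execute(
    "SELECT (CAST(s_msgs AS REAL) / user_s_msgs)"
    " / (CAST(s_msgs AS REAL) / user_s_msgs"
    "    + CAST(h_msgs AS REAL) / user_h_msgs)"
    " FROM token_counts"
).fetchone()[0]

print(truncated, one_cast, spam_prob)
```

Each of the three divisions in the formula needs at least one floating point operand; the outer division already has them because its operands are the results of the inner ones.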
On 4/26/2007, "Chris" <dmagick@gmail.com> wrote:
>tom wrote:
>
>> In pgsql I have to modify this a bit with 'cast (s_msgs as double
>> precision)' or 'cast(s_msgs as real)' in order to get floating point math.
>> ( cast(s_msgs as double precision)/S_msgs) and so on...
>>
>> Question: Is there a better way to get floating point math out of a set
>> of integers?
>
>Nope.

The way they treat math isn't new. Cast one as real and the rest will follow.

Any idea if it's going to be "better", or even something that can realistically be benchmarked?
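One way the benchmark could be set up, as a rough sketch rather than a definitive method: time the same scan twice, once fetching the raw integers and doing the arithmetic in the client, once letting the database evaluate the expression. The code below assumes Python and in-memory SQLite as stand-ins for Perl and pgsql, so its timings will not carry over; only the method does. The table layout, row count, and helper names (`build_db`, `prob_in_client`, `prob_in_sql`, `best_time`) are hypothetical.

```python
import sqlite3
import time


def build_db(n_rows):
    """Create an in-memory table of made-up token counts (stand-in for pgsql)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE token_counts ("
        " s_msgs INTEGER, h_msgs INTEGER,"
        " user_s_msgs INTEGER, user_h_msgs INTEGER)"
    )
    rows = [(i % 7 + 1, i % 5 + 1, 1000, 2000) for i in range(n_rows)]
    conn.executemany("INSERT INTO token_counts VALUES (?, ?, ?, ?)", rows)
    return conn


def prob_in_client(conn):
    """Fetch raw integers and do the arithmetic in the client (the Perl role)."""
    sql = (
        "SELECT s_msgs, h_msgs, user_s_msgs, user_h_msgs"
        " FROM token_counts ORDER BY rowid"
    )
    out = []
    for s, h, total_s, total_h in conn.execute(sql):
        spam = s / total_s
        ham = h / total_h
        out.append(spam / (spam + ham))
    return out


def prob_in_sql(conn):
    """Let the database evaluate the expression and fetch only the result."""
    sql = (
        "SELECT (CAST(s_msgs AS REAL) / user_s_msgs)"
        " / (CAST(s_msgs AS REAL) / user_s_msgs"
        "    + CAST(h_msgs AS REAL) / user_h_msgs)"
        " FROM token_counts ORDER BY rowid"
    )
    return [row[0] for row in conn.execute(sql)]


def best_time(fn, conn, repeats=5):
    """Return the fastest of several runs to reduce noise from other activity."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(conn)
        best = min(best, time.perf_counter() - start)
    return best


conn = build_db(10_000)
client_seconds = best_time(prob_in_client, conn)
sql_seconds = best_time(prob_in_sql, conn)
print(f"client-side math: {client_seconds:.4f}s, in-database math: {sql_seconds:.4f}s")
```

Both timings include the retrieval itself, which is the point: if fetching the rows dwarfs the arithmetic, the two numbers will be close and moving the math into the database buys little. Against a real PostgreSQL server, `EXPLAIN ANALYZE` reports the server-side execution time of a query, which helps separate the cost of the expression from network and client overhead.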