Обсуждение: Division by zero

Поиск
Список
Период
Сортировка

Division by zero

От
Oliver Kohll - Mailing Lists
Дата:
Hello,

We have a system that allows users to create views containing
calculations but divisions by zero are commonly a problem.

An simple example calculation in SQL would be

SELECT cost / pack_size AS unit_cost from products;

Either variable could be null or zero.

I don't think there's a way of returning null or infinity for
divisions by zero, rather than causing an error but I'd just like to
check - and put in a vote for that functionality!

If not, I will have to get the system to check for any numeric fields
in user input calculations and rewrite them similar to

CASE WHEN cost IS NULL THEN null
WHEN pack_size IS NULL THEN null
WHEN cost = 0 THEN null
WHEN pack_size = 0 THEN null
ELSE cost / pack_size
AS unit_cost

I don't want to write new functions, I'd rather keep it in plain SQL.

Best regards

Oliver Kohll


oliver@gtwm.co.uk / 0845 456 1810 / 07814 828608
www.gtwm.co.uk - company
www.gtportalbase.com - product

Re: Division by zero

От
Gurjeet Singh
Дата:
On Wed, Jun 3, 2009 at 9:18 PM, Oliver Kohll - Mailing Lists <oliver.lists@gtwm.co.uk> wrote:
Hello,

We have a system that allows users to create views containing calculations but divisions by zero are commonly a problem.

An simple example calculation in SQL would be

SELECT cost / pack_size AS unit_cost from products;

Either variable could be null or zero.

I don't think there's a way of returning null or infinity for divisions by zero, rather than causing an error but I'd just like to check - and put in a vote for that functionality!

If not, I will have to get the system to check for any numeric fields in user input calculations and rewrite them similar to

CASE WHEN cost IS NULL THEN null
WHEN pack_size IS NULL THEN null
WHEN cost = 0 THEN null
WHEN pack_size = 0 THEN null
ELSE cost / pack_size
AS unit_cost

I don't want to write new functions, I'd rather keep it in plain SQL.


Putting that in a function is definitely going to be expensive..

You need to take care of only one case here: denominator == 0; rest of the cases will be handled sanely by the database.

CASE WHEN packet_size = 0 THEN null ELSE cost/packet_size END as unit_cost

Best regards,

--
Lets call it Postgres

EnterpriseDB      http://www.enterprisedb.com

gurjeet[.singh]@EnterpriseDB.com
singh.gurjeet@{ gmail | hotmail | indiatimes | yahoo }.com
Mail sent from my BlackLaptop device

Re: Division by zero

От
Sam Mason
Дата:
On Thu, Jun 04, 2009 at 12:12:09AM +0530, Gurjeet Singh wrote:
> On Wed, Jun 3, 2009 at 9:18 PM, Oliver Kohll wrote:
> > SELECT cost / pack_size AS unit_cost from products;
> >
> > Either variable could be null or zero.
> >
> > I don't want to write new functions, I'd rather keep it in plain SQL.
>
> Putting that in a function is definitely going to be expensive..

If it's an SQL function and marked as IMMUTABLE it should (I believe
anyway) get inlined somewhere along the line and take no more overhead
than writing it out in full.

> You need to take care of only one case here: denominator == 0; rest of the
> cases will be handled sanely by the database.
>
> CASE WHEN packet_size = 0 THEN null ELSE cost/packet_size END as unit_cost

Yes; or even shorter:

  cost/nullif(packet_size,0) AS unit_cost

--
  Sam  http://samason.me.uk/

Re: Division by zero

От
Tom Lane
Дата:
Sam Mason <sam@samason.me.uk> writes:
> If it's an SQL function and marked as IMMUTABLE it should (I believe
> anyway) get inlined somewhere along the line and take no more overhead
> than writing it out in full.

Actually, if you're intending that a SQL function be inline-able then
it's best *not* to mark it as IMMUTABLE (nor STRICT either).  If the
marking matches the behavior of the contained expression then it
doesn't matter, but if the marking is stricter than the expression
it will prevent inlining.  Rather than think hard, I usually just don't
annotate the SQL function at all.

            regards, tom lane

Re: Division by zero

От
Sam Mason
Дата:
On Thu, Jun 04, 2009 at 09:48:17AM -0400, Tom Lane wrote:
> Sam Mason <sam@samason.me.uk> writes:
> > If it's an SQL function and marked as IMMUTABLE it should (I believe
> > anyway) get inlined somewhere along the line and take no more overhead
> > than writing it out in full.
>
> Actually, if you're intending that a SQL function be inline-able then
> it's best *not* to mark it as IMMUTABLE (nor STRICT either).  If the
> marking matches the behavior of the contained expression then it
> doesn't matter, but if the marking is stricter than the expression
> it will prevent inlining.

Why would strictness alter planning?  I was under the impression that it
only affected evaluation, i.e. it doesn't need to call the code if any
parameter is NULL.

That said, I was just saying what I've normally done.  I've just checked
the docs[1] and they may need rewording:

  For best optimization results, you should label your functions with
  the strictest volatility category that is valid for them.

> Rather than think hard, I usually just don't
> annotate the SQL function at all.

Hum, I only tend to annotate small functions where I can easily verify
what they're doing.  I think I picked up this practice when PG wasn't
inlining things I thought it should be and saying it was immutable made
it work.  Not sure when I got this habit, if I could drop it now that
would be nice.

--
  Sam  http://samason.me.uk/

 [1] http://www.postgresql.org/docs/current/static/xfunc-volatility.html

Re: Division by zero

От
Tom Lane
Дата:
Sam Mason <sam@samason.me.uk> writes:
> On Thu, Jun 04, 2009 at 09:48:17AM -0400, Tom Lane wrote:
>> Actually, if you're intending that a SQL function be inline-able then
>> it's best *not* to mark it as IMMUTABLE (nor STRICT either).  If the
>> marking matches the behavior of the contained expression then it
>> doesn't matter, but if the marking is stricter than the expression
>> it will prevent inlining.

> Why would strictness alter planning?

Because it alters behavior.  For example, consider

create function foo(int) returns int as
$$ select coalesce($1, 42); $$ language sql strict;

Because this function is declared STRICT, it must return null on null
input.  However, the contained expression would not act that way ---
it would in fact return 42 for null input.  Therefore inlining would
change the behavior, and therefore we don't inline.

The converse case (function not marked strict, but expression really
is) doesn't pose a hazard.  So it's easier to just not mark the function
strict than to think carefully about whether (a) the contained
expression is strict and (b) the planner will be able to prove that.

The same sorts of considerations arise for marking the function as
less volatile than the contained expression really is.  In this case
the "behavioral change" has to do with what later phases of the planner
will think they can do with the function or expression.  The bottom line
is the same though: easier to leave off the marking.

            regards, tom lane

Re: Division by zero

От
Sam Mason
Дата:
On Thu, Jun 04, 2009 at 10:34:31AM -0400, Tom Lane wrote:
> create function foo(int) returns int as
> $$ select coalesce($1, 42); $$ language sql strict;
>
> Because this function is declared STRICT, it must return null on null
> input.  However, the contained expression would not act that way ---
> it would in fact return 42 for null input.  Therefore inlining would
> change the behavior, and therefore we don't inline.

Bah, not sure what I was thinking--that's kind of obvious isn't it!  I
think I was thinking about every language apart from SQL, but they can't
be inlined and hence it would never apply to them.

> The same sorts of considerations arise for marking the function as
> less volatile than the contained expression really is.  In this case
> the "behavioral change" has to do with what later phases of the planner
> will think they can do with the function or expression.  The bottom line
> is the same though: easier to leave off the marking.

Hum, I think that with 8.3 at least I'm going to carry on putting
IMMUTABLE on where I think it should be.  Consider:

  CREATE FUNCTION fn_i(INTEGER) RETURNS BOOLEAN LANGUAGE SQL IMMUTABLE AS $$
    SELECT $1 < 1000 $$;
  CREATE FUNCTION fn_v(INTEGER) RETURNS BOOLEAN LANGUAGE SQL AS $$
    SELECT $1 < 1000 $$;

and then doing:

  SELECT * FROM (
    SELECT fn_i(num), fn_v(num)
    FROM bigtable) x
  WHERE fn_i;

I get very different plans out if I replace "WHERE fn_i" with "WHERE
fn_v".  I read this as it not inlining where I'd expect it to be, or am
I missing something else?

--
  Sam  http://samason.me.uk/

Re: Division by zero

От
Tom Lane
Дата:
Sam Mason <sam@samason.me.uk> writes:
> Hum, I think that with 8.3 at least I'm going to carry on putting
> IMMUTABLE on where I think it should be.  Consider:

>   CREATE FUNCTION fn_i(INTEGER) RETURNS BOOLEAN LANGUAGE SQL IMMUTABLE AS $$
>     SELECT $1 < 1000 $$;
>   CREATE FUNCTION fn_v(INTEGER) RETURNS BOOLEAN LANGUAGE SQL AS $$
>     SELECT $1 < 1000 $$;

> and then doing:

>   SELECT * FROM (
>     SELECT fn_i(num), fn_v(num)
>     FROM bigtable) x
>   WHERE fn_i;

> I get very different plans out if I replace "WHERE fn_i" with "WHERE
> fn_v".  I read this as it not inlining where I'd expect it to be, or am
> I missing something else?

Huh, interesting example.  I don't have time to trace through it right
now, but I think what is happening is that the decision about whether or
not the sub-select can be flattened is being made before the inlining of
the SQL functions in the sub-select happens.  So at that point the
sub-select qualifier expression still looks volatile and the planner
chickens out of flattening it.  The functions do both get inline'd
eventually, as you can see in EXPLAIN VERBOSE output ... but it's too
late to make any real difference in the plan shape.

So yeah, there are corner cases where it's useful to have the function
marked correctly rather than sloppily.

            regards, tom lane

Re: Division by zero

От
Sam Mason
Дата:
On Thu, Jun 04, 2009 at 11:17:39AM -0400, Tom Lane wrote:
> Sam Mason <sam@samason.me.uk> writes:
> > I think that with 8.3 at least I'm going to carry on putting
> > IMMUTABLE on where I think it should be.  Consider:
>
> there are corner cases where it's useful to have the function
> marked correctly rather than sloppily.

I think I originally noticed it when I'd written an "implies" operator
(i.e. (NOT $1) OR $2, as I don't trust myself to remember the logic)
and queries were running slowly until I noticed that a view was using
it deep inside itself somewhere.  Seems like views would trigger this
behavior pretty reliably.

Doesn't seem like it needs fixing too urgently though.

--
  Sam  http://samason.me.uk/

Re: Division by zero

От
Oliver Kohll - Mailing Lists
Дата:
On 4 Jun 2009, at 13:11, Sam Mason <sam@samason.me.uk> wrote:


You need to take care of only one case here: denominator == 0; rest of the
cases will be handled sanely by the database.

CASE WHEN packet_size = 0 THEN null ELSE cost/packet_size END as unit_cost

Yes; or even shorter:

 cost/nullif(packet_size,0) AS unit_cost

Thanks Sam and others. nullif is a good one to remember. However my problem is I want to be able to deal with an arbitrary calculation a user may create. In the light of a new day I realise this is obviously not trivial and would entail reasonably complex parsing. You'd have to find elements that could cause an error (division, some aggregates) and insert in the correct place nullif or CASE, taking care of bracket matching for starters - a messy workaround to the problem.

I might look into functions if that's the only reasonable way of catching exceptions.

Oliver