Re: pgbench's expression parsing & negative numbers

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: pgbench's expression parsing & negative numbers
Дата
Msg-id 20180926194843.ra7rwvaezlbni4k7@alap3.anarazel.de
обсуждение исходный текст
Ответ на Re: pgbench's expression parsing & negative numbers  (Fabien COELHO <coelho@cri.ensmp.fr>)
Ответы Re: pgbench's expression parsing & negative numbers  (Fabien COELHO <coelho@cri.ensmp.fr>)
Список pgsql-hackers
Hi,

On 2018-08-10 10:24:29 +0200, Fabien COELHO wrote:
> diff --git a/src/bin/pgbench/exprscan.l b/src/bin/pgbench/exprscan.l
> index 5c1bd88128..e8faea3857 100644
> --- a/src/bin/pgbench/exprscan.l
> +++ b/src/bin/pgbench/exprscan.l
> @@ -191,16 +191,26 @@ notnull            [Nn][Oo][Tt][Nn][Uu][Ll][Ll]
>                      yylval->bval = false;
>                      return BOOLEAN_CONST;
>                  }
> +"9223372036854775808" {
> +                    /* yuk: special handling for minint */
> +                    return MAXINT_PLUS_ONE_CONST;
> +                }

Yikes, do we really need this?  If we dealt with - in the lexer, we
shouldn't need it, no?


>  /*
>   * strtoint64 -- convert a string to 64-bit integer
>   *
> - * This function is a modified version of scanint8() from
> + * This function is a slightly modified version of scanint8() from
>   * src/backend/utils/adt/int8.c.
> + *
> + * The function returns whether the conversion worked, and if so
> + * "*result" is set to the result.
> + *
> + * If not errorOK, an error message is also printed out on errors.
>   */
> -int64
> -strtoint64(const char *str)
> +bool
> +strtoint64(const char *str, bool errorOK, int64 *result)
>  {
>      const char *ptr = str;
> -    int64        result = 0;
> -    int            sign = 1;
> +    int64        tmp = 0;
> +    bool        neg = false;
>  
>      /*
>       * Do our own scan, rather than relying on sscanf which might be broken
>       * for long long.
> +     *
> +     * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
> +     * value as a negative number.
>       */
>  
>      /* skip leading spaces */
> @@ -650,46 +660,80 @@ strtoint64(const char *str)
>      if (*ptr == '-')
>      {
>          ptr++;
> -
> -        /*
> -         * Do an explicit check for INT64_MIN.  Ugly though this is, it's
> -         * cleaner than trying to get the loop below to handle it portably.
> -         */
> -        if (strncmp(ptr, "9223372036854775808", 19) == 0)
> -        {
> -            result = PG_INT64_MIN;
> -            ptr += 19;
> -            goto gotdigits;
> -        }
> -        sign = -1;
> +        neg = true;
>      }
>      else if (*ptr == '+')
>          ptr++;
>  
>      /* require at least one digit */
> -    if (!isdigit((unsigned char) *ptr))
> -        fprintf(stderr, "invalid input syntax for integer: \"%s\"\n", str);
> +    if (unlikely(!isdigit((unsigned char) *ptr)))
> +        goto invalid_syntax;
>  
>      /* process digits */
>      while (*ptr && isdigit((unsigned char) *ptr))
>      {
> -        int64        tmp = result * 10 + (*ptr++ - '0');
> +        int8        digit = (*ptr++ - '0');
>  
> -        if ((tmp / 10) != result)    /* overflow? */
> -            fprintf(stderr, "value \"%s\" is out of range for type bigint\n", str);
> -        result = tmp;
> +        if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
> +            unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
> +            goto out_of_range;
>      }
>  
> -gotdigits:
> -
>      /* allow trailing whitespace, but not other trailing chars */
>      while (*ptr != '\0' && isspace((unsigned char) *ptr))
>          ptr++;
>  
> -    if (*ptr != '\0')
> -        fprintf(stderr, "invalid input syntax for integer: \"%s\"\n", str);
> +    if (unlikely(*ptr != '\0'))
> +        goto invalid_syntax;
>  
> -    return ((sign < 0) ? -result : result);
> +    if (!neg)
> +    {
> +        if (unlikely(tmp == PG_INT64_MIN))
> +            goto out_of_range;
> +        tmp = -tmp;
> +    }
> +
> +    *result = tmp;
> +    return true;
> +
> +out_of_range:
> +    if (!errorOK)
> +        fprintf(stderr,
> +                "value \"%s\" is out of range for type bigint\n", str);
> +    return false;
> +
> +invalid_syntax:
> +    /* this can't happen here, function called on int-looking strings. */

This comment doesn't strike me as a good idea, it's almost guaranteed to
be outdated at some point.

> +    if (!errorOK)
> +        fprintf(stderr,
> +                "invalid input syntax for type bigint: \"%s\"\n",str);
> +    return false;
> +}


Duplicating these routines is pretty ugly.  I really wish we had more
infrastructure to avoid this (in particularly a portable way to report
an error and jump out).  But probably out of scope for this patches?

> +/* convert string to double, detecting overflows/underflows */
> +bool
> +strtodouble(const char *str, bool errorOK, double *dv)
> +{
> +    char *end;
> +
> +    *dv = strtod(str, &end);
> +
> +    if (unlikely(errno != 0))
> +    {
> +        if (!errorOK)
> +            fprintf(stderr,
> +                    "value \"%s\" is out of range for type double\n", str);
> +        return false;
> +    }
> +
> +    if (unlikely(end == str || *end != '\0'))
> +    {
> +        if (!errorOK)
> +            fprintf(stderr,
> +                    "invalid input syntax for type double: \"%s\"\n",str);
> +        return false;
> +    }
> +    return true;
>  }

Not sure I see much point in the unlikelys here, contrasting to the ones
in the backend (and the copy for the frontend) it's extremely unlikley
anything like this will ever be close to a bottleneck.  In general, I'd
strongly suggest not placing unlikelys in non-critical codepaths -
they're way too often wrongly estimated.

Greetings,

Andres Freund


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andres Freund
Дата:
Сообщение: Re: pgsql: Remove absolete function TupleDescGetSlot().
Следующее
От: Tomas Vondra
Дата:
Сообщение: Re: [PATCH] Improve geometric types