Re: Remove dependence on integer wrapping

Поиск
Список
Период
Сортировка
От Joseph Koshakow
Тема Re: Remove dependence on integer wrapping
Дата
Msg-id CAAvxfHfdPgtqFqsb=Nxa1oc6GsQ_U+k3rfneo-EjYG4bne_ksw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Remove dependence on integer wrapping  (Nathan Bossart <nathandbossart@gmail.com>)
Ответы Re: Remove dependence on integer wrapping
Список pgsql-hackers


On Fri, Jul 19, 2024 at 2:45 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> +            /* dim[i] = 1 + upperIndx[i] - lowerIndx[i]; */
> +            if (pg_add_s32_overflow(1, upperIndx[i], &dim[i]))
> +                ereport(ERROR,
> +                        (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
> +                         errmsg("array upper bound is too large: %d",
> +                                upperIndx[i])));
> +            if (pg_sub_s32_overflow(dim[i], lowerIndx[i], &dim[i]))
> +                ereport(ERROR,
> +                        (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
> +                         errmsg("array size exceeds the maximum allowed (%d)",
> +                                (int) MaxArraySize)));
>
> I think the problem with fixing it this way is that it prohibits more than
> is necessary.

My understanding is that 2147483647 (INT32_MAX) is not a valid upper
bound, which is what the first overflow check is checking. Any query of
the form
    `INSERT INTO arroverflowtest(i[<lb>:2147483647]) VALUES ('{...}');`
will fail with an error of
    `ERROR:  array lower bound is too large: <lb>`

The reason is the following bounds check found in arrayutils.c

    /*
     * Verify sanity of proposed lower-bound values for an array
     *
     * The lower-bound values must not be so large as to cause overflow when
     * calculating subscripts, e.g. lower bound 2147483640 with length 10
     * must be disallowed.  We actually insist that dims[i] + lb[i] be
     * computable without overflow, meaning that an array with last subscript
     * equal to INT_MAX will be disallowed.
     *
     * It is assumed that the caller already called ArrayGetNItems, so that
     * overflowed (negative) dims[] values have been eliminated.
     */
    void
    ArrayCheckBounds(int ndim, const int *dims, const int *lb)
    {
        (void) ArrayCheckBoundsSafe(ndim, dims, lb, NULL);
    }

    /*
     * This entry point can return the error into an ErrorSaveContext
     * instead of throwing an exception.
     */
    bool
    ArrayCheckBoundsSafe(int ndim, const int *dims, const int *lb,
    struct Node *escontext)
    {
        int i;

        for (i = 0; i < ndim; i++)
        {
        /* PG_USED_FOR_ASSERTS_ONLY prevents variable-isn't-read warnings */
        int32 sum PG_USED_FOR_ASSERTS_ONLY;
   
        if (pg_add_s32_overflow(dims[i], lb[i], &sum))
            ereturn(escontext, false,
                    (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                     errmsg("array lower bound is too large: %d",
                            lb[i])));
        }

         return true;
    }

Specifically "We actually insist that dims[i] + lb[i] be computable
without overflow, meaning that an array with last subscript equal to
INT32_MAX will be disallowed." If the upper bound is INT32_MAX,
then there's no lower bound where (lower_bound + size) won't overflow.

It might be possible to remove this restriction, but it's probably
easier to keep it.

> An easy way to deal with this problem is to first perform the calculation
> with everything cast to an int64.  Before setting dim[i], you'd check that
> the result is in [PG_INT32_MIN, PG_INT32_MAX] and fail if needed.
>
>         int64 newdim;
>
>         ...
>
>         newdim = (int64) 1 + (int64) upperIndx[i] - (int64) lowerIndx[i];
>         if (unlikely(newdim < PG_INT32_MIN || newdim > PG_INT32_MAX))
>                 ereport(ERROR,
>                                 ...
>         dim[i] = (int32) newdim;

I've rebased my patches and updated 0002 with this approach if this is
still the approach you want to go with. I went with the array size too
large error for similar reasons as the previous version of the patch.

Since the patches have been renumbered, here's an overview of their
status:

- 0001 is reviewed and waiting for v18.
- 0002 is under review and a bug fix.
- 0003 needs review.

Thanks,
Joseph Koshakow
Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: "Wen Yi"
Дата:
Сообщение: Re:How can udf c function return table, not the rows?
Следующее
От: Paul George
Дата:
Сообщение: Re: behavior of GROUP BY with VOLATILE expressions