Обсуждение: BUG #17858: ExecEvalArrayExpr() leaves uninitialised memory for multidim array with nulls

Поиск
Список
Период
Сортировка

BUG #17858: ExecEvalArrayExpr() leaves uninitialised memory for multidim array with nulls

От
PG Bug reporting form
Дата:
The following bug has been logged on the website:

Bug reference:      17858
Logged by:          Alexander Lakhin
Email address:      exclusion@gmail.com
PostgreSQL version: 15.2
Operating system:   Ubuntu 22.04
Description:

With a server compiled with flags -DUSE_VALGRIND
-DWRITE_READ_PARSE_PLAN_TREES the following query:
SELECT * FROM generate_series(1, 1) a WHERE a = ANY
(array[array[NULL::int]]);

provokes an incorrect memory access:
==00:00:00:06.362 433114== Conditional jump or move depends on uninitialised
value(s)
==00:00:00:06.362 433114==    at 0x76EF57: adjust_sign (snprintf.c:1483)
==00:00:00:06.362 433114==    by 0x76F404: fmtint (snprintf.c:1072)
==00:00:00:06.362 433114==    by 0x7700CD: dopr (snprintf.c:623)
==00:00:00:06.362 433114==    by 0x770509: pg_vsnprintf (snprintf.c:205)
==00:00:00:06.362 433114==    by 0x76612C: pvsnprintf (psprintf.c:110)
==00:00:00:06.362 433114==    by 0x7675FC: appendStringInfoVA
(stringinfo.c:149)
==00:00:00:06.362 433114==    by 0x767815: appendStringInfo
(stringinfo.c:103)
==00:00:00:06.362 433114==    by 0x4836BC: outDatum (outfuncs.c:298)
==00:00:00:06.362 433114==    by 0x4837F4: _outConst (outfuncs.c:1169)
==00:00:00:06.362 433114==    by 0x483C6E: outNode (outfuncs.c:4088)
==00:00:00:06.362 433114==    by 0x484787: _outList (outfuncs.c:234)
==00:00:00:06.362 433114==    by 0x483856: outNode (outfuncs.c:3896)
==00:00:00:06.362 433114==    by 0x488285: _outScalarArrayOpExpr
(outfuncs.c:1339)
==00:00:00:06.362 433114==    by 0x483D1E: outNode (outfuncs.c:4121)
==00:00:00:06.362 433114==    by 0x484787: _outList (outfuncs.c:234)
==00:00:00:06.362 433114==    by 0x483856: outNode (outfuncs.c:3896)
==00:00:00:06.362 433114==    by 0x484CCB: _outPlanInfo (outfuncs.c:353)
==00:00:00:06.362 433114==    by 0x485187: _outScanInfo (outfuncs.c:367)
==00:00:00:06.362 433114==    by 0x486235: _outFunctionScan
(outfuncs.c:656)
==00:00:00:06.362 433114==    by 0x483A2E: outNode (outfuncs.c:3980)
==00:00:00:06.362 433114==    by 0x484A19: _outPlannedStmt
(outfuncs.c:323)
==00:00:00:06.362 433114==    by 0x4838D9: outNode (outfuncs.c:3914)
==00:00:00:06.362 433114==    by 0x4908A5: nodeToString (outfuncs.c:4617)
==00:00:00:06.362 433114==    by 0x5C312E: pg_plan_query (postgres.c:913)
==00:00:00:06.362 433114==    by 0x5C31F5: pg_plan_queries
(postgres.c:975)
==00:00:00:06.362 433114==    by 0x5C36EA: exec_simple_query
(postgres.c:1169)
==00:00:00:06.362 433114==    by 0x5C566F: PostgresMain (postgres.c:4593)
==00:00:00:06.362 433114==    by 0x51F934: BackendRun (postmaster.c:4511)
==00:00:00:06.362 433114==    by 0x522A06: BackendStartup
(postmaster.c:4239)
==00:00:00:06.362 433114==    by 0x522C3F: ServerLoop (postmaster.c:1806)
==00:00:00:06.362 433114==    by 0x52420E: PostmasterMain
(postmaster.c:1478)
==00:00:00:06.362 433114==    by 0x467CB6: main (main.c:202)
==00:00:00:06.362 433114==  Uninitialised value was created by a heap
allocation
==00:00:00:06.362 433114==    at 0x74411F: palloc (mcxt.c:1093)
==00:00:00:06.362 433114==    by 0x3FECDB: ExecEvalArrayExpr
(execExprInterp.c:2856)
==00:00:00:06.362 433114==    by 0x402183: ExecInterpExpr
(execExprInterp.c:1335)
==00:00:00:06.362 433114==    by 0x3FE16A: ExecInterpExprStillValid
(execExprInterp.c:1826)
==00:00:00:06.362 433114==    by 0x4F00D4: ExecEvalExprSwitchContext
(executor.h:341)
==00:00:00:06.362 433114==    by 0x4F00D4: evaluate_expr (clauses.c:4823)
==00:00:00:06.362 433114==    by 0x4F1236: eval_const_expressions_mutator
(clauses.c:3098)
==00:00:00:06.362 433114==    by 0x480FFB: expression_tree_mutator
(nodeFuncs.c:3166)
==00:00:00:06.362 433114==    by 0x4F19AE: eval_const_expressions_mutator
(clauses.c:3527)
==00:00:00:06.362 433114==    by 0x4807B8: expression_tree_mutator
(nodeFuncs.c:2830)
==00:00:00:06.362 433114==    by 0x4F0971: eval_const_expressions_mutator
(clauses.c:2653)
==00:00:00:06.362 433114==    by 0x4F1B29: eval_const_expressions
(clauses.c:2107)
==00:00:00:06.362 433114==    by 0x4D0475: preprocess_expression
(planner.c:1127)
==00:00:00:06.362 433114==    by 0x4D058F: preprocess_qual_conditions
(planner.c:1199)
==00:00:00:06.362 433114==    by 0x4D918E: subquery_planner
(planner.c:815)
==00:00:00:06.362 433114==    by 0x4D9D14: standard_planner
(planner.c:406)
==00:00:00:06.362 433114==    by 0x4DA2C9: planner (planner.c:277)
==00:00:00:06.362 433114==    by 0x5C311A: pg_plan_query (postgres.c:883)
==00:00:00:06.362 433114==    by 0x5C31F5: pg_plan_queries
(postgres.c:975)
==00:00:00:06.362 433114==    by 0x5C36EA: exec_simple_query
(postgres.c:1169)
==00:00:00:06.362 433114==    by 0x5C566F: PostgresMain (postgres.c:4593)
==00:00:00:06.362 433114==    by 0x51F934: BackendRun (postmaster.c:4511)
==00:00:00:06.362 433114==    by 0x522A06: BackendStartup
(postmaster.c:4239)
==00:00:00:06.362 433114==    by 0x522C3F: ServerLoop (postmaster.c:1806)
==00:00:00:06.362 433114==    by 0x52420E: PostmasterMain
(postmaster.c:1478)
==00:00:00:06.362 433114==    by 0x467CB6: main (main.c:202)
==00:00:00:06.362 433114==

Here ExecEvalArrayExpr() performs:
        result = (ArrayType *) palloc(nbytes);
where nbytes includes ARR_OVERHEAD_WITHNULLS(ndims, nitems) for ndims = 2
and nitems = 1, so last 8 bytes in this memory area reserved for a null
bitmap, but only one bit of the bitmap initialised later by
array_bitmap_copy().

Something like:
@@ -2860,6 +2861,9 @@ ExecEvalArrayExpr(ExprState *state, ExprEvalStep
*op)
                result->elemtype = element_type;
                memcpy(ARR_DIMS(result), dims, ndims * sizeof(int));
                memcpy(ARR_LBOUND(result), lbs, ndims * sizeof(int));
+               nullbitmap = ARR_NULLBITMAP(result);
+               if (nullbitmap)
+                       memset(nullbitmap, 0, ARR_OVERHEAD_WITHNULLS(ndims,
nitems) - ARR_OVERHEAD_NONULLS(ndims));
 
                dat = ARR_DATA_PTR(result);
                iitem = 0;

fixes the issue, just as "result = (ArrayType *) palloc0(nbytes)" does, of
course.


Re: BUG #17858: ExecEvalArrayExpr() leaves uninitialised memory for multidim array with nulls

От
Richard Guo
Дата:

On Tue, Mar 21, 2023 at 5:32 PM PG Bug reporting form <noreply@postgresql.org> wrote:
With a server compiled with flags -DUSE_VALGRIND
-DWRITE_READ_PARSE_PLAN_TREES the following query:
SELECT * FROM generate_series(1, 1) a WHERE a = ANY
(array[array[NULL::int]]);

provokes an incorrect memory access:

Here ExecEvalArrayExpr() performs:
                result = (ArrayType *) palloc(nbytes);
where nbytes includes ARR_OVERHEAD_WITHNULLS(ndims, nitems) for ndims = 2
and nitems = 1, so last 8 bytes in this memory area reserved for a null
bitmap, but only one bit of the bitmap initialised later by
array_bitmap_copy().

Something like:
@@ -2860,6 +2861,9 @@ ExecEvalArrayExpr(ExprState *state, ExprEvalStep
*op)
                result->elemtype = element_type;
                memcpy(ARR_DIMS(result), dims, ndims * sizeof(int));
                memcpy(ARR_LBOUND(result), lbs, ndims * sizeof(int));
+               nullbitmap = ARR_NULLBITMAP(result);
+               if (nullbitmap)
+                       memset(nullbitmap, 0, ARR_OVERHEAD_WITHNULLS(ndims,
nitems) - ARR_OVERHEAD_NONULLS(ndims));

                dat = ARR_DATA_PTR(result);
                iitem = 0;

fixes the issue, just as "result = (ArrayType *) palloc0(nbytes)" does, of
course.

Nice catch.  This can also be seen on master.

I searched the codes a little bit and found that in array_set_slice()
and array_set_element() the 'newarray' is allocated with palloc0 and
then the nulls bitmap is zeroed with

 MemSet(nullbitmap, 0, (nitems + 7) / 8);

if havenulls is true.  I wonder if we can do the same here.

Thanks
Richard

Re: BUG #17858: ExecEvalArrayExpr() leaves uninitialised memory for multidim array with nulls

От
Alexander Lakhin
Дата:
Hi Richard,
21.03.2023 14:52, Richard Guo wrote:
Nice catch.  This can also be seen on master.

I searched the codes a little bit and found that in array_set_slice()
and array_set_element() the 'newarray' is allocated with palloc0 and
then the nulls bitmap is zeroed with

 MemSet(nullbitmap, 0, (nitems + 7) / 8);

if havenulls is true.  I wonder if we can do the same here.

I'm afraid that zeroing only bytes behind nitems bits is not enough, as outDatum() doesn't bother to calculate the exact size of nulls bitmap, it just outputs all bytes of a datum (40 bytes in that case):
   length = datumGetSize(value, typbyval, typlen);
...
            for (i = 0; i < length; i++)
               appendStringInfo(str, "%d ", (int) (s[i]));

Best regards,
Alexander
Alexander Lakhin <exclusion@gmail.com> writes:
> I'm afraid that zeroing only bytes behind nitems bits is not enough, as outDatum() doesn't bother to calculate the
exact 
> size of nulls bitmap, it just outputs all bytes of a datum (40 bytes in that case):

In that case, won't padding bytes between array elements also create
issues?  Seems like we have to just zero the whole array area, like
those other functions do.

            regards, tom lane



Re: BUG #17858: ExecEvalArrayExpr() leaves uninitialised memory for multidim array with nulls

От
Richard Guo
Дата:

On Tue, Mar 21, 2023 at 10:24 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Alexander Lakhin <exclusion@gmail.com> writes:
> I'm afraid that zeroing only bytes behind nitems bits is not enough, as outDatum() doesn't bother to calculate the exact
> size of nulls bitmap, it just outputs all bytes of a datum (40 bytes in that case):

In that case, won't padding bytes between array elements also create
issues?  Seems like we have to just zero the whole array area, like
those other functions do.

Yeah, this should be the right fix, to use palloc0 instead here.  FWIW
currently in the codes there are 14 places that explicitly allocate
ArrayType, 13 of them using palloc0, the only exception is the one
discussed here.

BTW, in array_set_slice() and array_set_element() we explicitly zero the
null bitmap although the whole array area is allocated with palloc0.  Is
this necessary?

Thanks
Richard

Re: BUG #17858: ExecEvalArrayExpr() leaves uninitialised memory for multidim array with nulls

От
Alexander Lakhin
Дата:
22.03.2023 06:07, Richard Guo wrote:

On Tue, Mar 21, 2023 at 10:24 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Alexander Lakhin <exclusion@gmail.com> writes:
> I'm afraid that zeroing only bytes behind nitems bits is not enough, as outDatum() doesn't bother to calculate the exact
> size of nulls bitmap, it just outputs all bytes of a datum (40 bytes in that case):

In that case, won't padding bytes between array elements also create
issues?  Seems like we have to just zero the whole array area, like
those other functions do.

Yeah, this should be the right fix, to use palloc0 instead here.  FWIW
currently in the codes there are 14 places that explicitly allocate
ArrayType, 13 of them using palloc0, the only exception is the one
discussed here.

BTW, in array_set_slice() and array_set_element() we explicitly zero the
null bitmap although the whole array area is allocated with palloc0.  Is
this necessary?


I had also wondered, how come this call of array_bitmap_copy() shown as covered here:
https://coverage.postgresql.org/src/backend/executor/execExprInterp.c.gcov.html
but valgrind doesn't complain during `make check` (with WRITE_READ_PARSE_PLAN_TREES).
I've found the query, which is reaching that call:
select percentile_disc(array[[null,1,0.5],[0.75,0.25,null]]) within group (order by thousand)
from tenk1;
and the absence of the valgrind complaint here is explained by the array coercion performed implicitly.
The query modification:
select percentile_disc(array[[null,1,0.5],[0.75,0.25,null]]::float8[]) within group (order by thousand)
from tenk1;
can confirm the fix (if needed).

With the following check:
if (result) VALGRIND_CHECK_MEM_IS_DEFINED(result, VARSIZE(result));
added at the end of ExecEvalArrayExpr() I see no failures during `make check` (except for select percentile_disc(...)) so it seems that uninitialised padding bytes between array elements are not produced with the existing regression tests. Though a comment in array.h says:
 * The actual data starts on a MAXALIGN boundary.  Individual items in the
 * array are aligned as specified by the array element type. ...

I'll try to find cases where the alignment padding is possible...

Though if palloc0() used in 13 places, maybe the minor optimization (with zeroing only the null bitmap) doesn't justify the inconsistency.

Best regards,
Alexander

Re: BUG #17858: ExecEvalArrayExpr() leaves uninitialised memory for multidim array with nulls

От
Alexander Lakhin
Дата:
22.03.2023 06:07, Richard Guo wrote:

On Tue, Mar 21, 2023 at 10:24 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Alexander Lakhin <exclusion@gmail.com> writes:
> I'm afraid that zeroing only bytes behind nitems bits is not enough, as outDatum() doesn't bother to calculate the exact
> size of nulls bitmap, it just outputs all bytes of a datum (40 bytes in that case):

In that case, won't padding bytes between array elements also create
issues?  Seems like we have to just zero the whole array area, like
those other functions do.

I came to a conclusion that the padding is not affected by
palloc/palloc0 there.
There we have:
            subdata[outer_nelems] = ARR_DATA_PTR(array);
            subbytes[outer_nelems] = ARR_SIZE(array) - ARR_DATA_OFFSET(array);
...
            memcpy(dat, subdata[i], subbytes[i]);
            dat += subbytes[i];

So the result array elements come packed one after another from the input
array and are not padded here.


Yeah, this should be the right fix, to use palloc0 instead here.  FWIW
currently in the codes there are 14 places that explicitly allocate
ArrayType, 13 of them using palloc0, the only exception is the one
discussed here.

BTW, in array_set_slice() and array_set_element() we explicitly zero the
null bitmap although the whole array area is allocated with palloc0.  Is
this necessary?

Yeah, I see two instances of MemSet(newnullbitmap, 0, ...):
1) In array_set_element():
        /* Zero the bitmap to take care of marking inserted positions null */
        MemSet(newnullbitmap, 0, (newnitems + 7) / 8);

It came with cecb60755 (2005-11-17).

2) In array_set_slice():
            /* Zero the bitmap to handle marking inserted positions null */
            MemSet(newnullbitmap, 0, (nitems + 7) / 8);

It came with 352a56ba6 (2006-09-29).

These MemSets were meaningful before 18c0b4ecc (2011-04-27).
The commit 18c0b4ecc changed palloc() to palloc0() almost everywhere, so
that the MemSets became redundant, but at the same time the discussed
palloc() (residing in execQual.c:ExecEvalArray() back then) somehow evaded
the change.

So maybe the fix for the bug can be seen as a supplement for 18c0b4ecc.

Best regards,
Alexander
Alexander Lakhin <exclusion@gmail.com> writes:
> So maybe the fix for the bug can be seen as a supplement for 18c0b4ecc.

Yeah, I agree.  Given that precedent, there's little excuse for
ExecEvalArrayExpr not to use palloc0; the fact that it doesn't
already was just an oversight in 18c0b4ecc.

I realized that there's a completely separate way in which
ExecEvalArrayExpr is a few bricks shy of a load: it fails to check
for overflow of the output array size.

Pushed the discussed changes along with a fix for that.

            regards, tom lane