Обсуждение: intarray: fix an edge case int32 overflow bug
Hi Hackers,
I noticed an int32 overflow problem in intarray’s compare_val_int4():
```
/*
* Comparison function for binary search in mcelem array.
*/
static int
compare_val_int4(const void *a, const void *b)
{
int32 key = *(int32 *) a;
const Datum *t = (const Datum *) b;
return key - DatumGetInt32(*t);
}
```
As this function is a bsearch comparator, it is supposed to return >0, =0 or <0. However this function uses subtraction with two int32 and returns an int, which may result in an overflow. Say, key is INT32_MAX and *t is -1, the return value will be negative due to overflow.
A simple C program shows the overflow:
```
#include <stdio.h>
#include <stdint.h>
#include <limits.h>
int main(void)
{
int32_t a = INT32_MAX;
int32_t b = -1;
int32_t diff = a - b; /* overflow */
printf("a = %d\n", a);
printf("b = %d\n", b);
printf("a - b = %d\n", diff);
return 0;
}
```
On my MacBook, the output is:
```
a = 2147483647
b = -1
a - b = -2147483648 <== INT32_MIN, but as a>b, we expect a positive value for bsearch comparator
```
The fix is straightforward: compare the key and *t without doing a potentially overflowing subtraction. Because no test covered this edge case, I added one. The test loads arrays [-2], [-1], and [INT32_MAX], ensuring -1 is the middle MCE value so the buggy comparator overflows on the first bsearch comparison and fail to find INT32_MAX. It then checks the EXPLAIN row estimation for a query on INT32_MAX. Before the fix, the lookup falls back to DEFAULT_EQ_SEL and EXPLAIN
reports rows=1; after the fix, it finds the MCE and reports the correct row estimation (the count of INT32_MAX rows).
Since it’s an edge case and doesn’t affect query results, I guess that’s why it went unnoticed for years.
Best regards,
Вложения
On Sun, 4 Jan 2026 at 16:20, Chao Li <li.evan.chao@gmail.com> wrote:
> I noticed an int32 overflow problem in intarray’s compare_val_int4():
> ```
> /*
> * Comparison function for binary search in mcelem array.
> */
> static int
> compare_val_int4(const void *a, const void *b)
> {
> int32 key = *(int32 *) a;
> const Datum *t = (const Datum *) b;
>
> return key - DatumGetInt32(*t);
> }
> ```
>
> As this function is a bsearch comparator, it is supposed to return >0, =0 or <0. However this function uses
subtractionwith two int32 and returns an int, which may result in an overflow. Say, key is INT32_MAX and *t is -1, the
returnvalue will be negative due to overflow.
Nice find. Was that found by a static analyser or by eye?
I can take care of the overflow issue. I feel the test is a step too
far as it seems unlikely ever to be rebroken, but thanks for the
SQL-based test case to demonstrate the issue.
David
On Sun, 4 Jan 2026 at 19:28, David Rowley <dgrowleyml@gmail.com> wrote: > I can take care of the overflow issue. I feel the test is a step too > far as it seems unlikely ever to be rebroken, but thanks for the > SQL-based test case to demonstrate the issue. Pushed. David
> On Jan 4, 2026, at 14:28, David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Sun, 4 Jan 2026 at 16:20, Chao Li <li.evan.chao@gmail.com> wrote:
>> I noticed an int32 overflow problem in intarray’s compare_val_int4():
>> ```
>> /*
>> * Comparison function for binary search in mcelem array.
>> */
>> static int
>> compare_val_int4(const void *a, const void *b)
>> {
>> int32 key = *(int32 *) a;
>> const Datum *t = (const Datum *) b;
>>
>> return key - DatumGetInt32(*t);
>> }
>> ```
>>
>> As this function is a bsearch comparator, it is supposed to return >0, =0 or <0. However this function uses
subtractionwith two int32 and returns an int, which may result in an overflow. Say, key is INT32_MAX and *t is -1, the
returnvalue will be negative due to overflow.
>
> Nice find. Was that found by a static analyser or by eye?
>
> I can take care of the overflow issue. I feel the test is a step too
> far as it seems unlikely ever to be rebroken, but thanks for the
> SQL-based test case to demonstrate the issue.
>
> David
Hi David,
It was spotted by eye. As a newcomer, I’m trying to get more familiar with the codebase, so while reviewing other
patchesI’ve been in the habit of poking around related files. In this case, the comparison function looked error-prone,
soI verified the overflow scenario with the small program. I didn’t post this one too quickly because I spent time
creatingthe test. :)
I added the test to demonstrate the issue and to prove the fix. If you think including the test is unnecessary and
preferto just take the fix, that’s absolutely fine with me.
Thanks again for taking care of this.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
> On Jan 4, 2026, at 15:35, David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Sun, 4 Jan 2026 at 19:28, David Rowley <dgrowleyml@gmail.com> wrote:
>> I can take care of the overflow issue. I feel the test is a step too
>> far as it seems unlikely ever to be rebroken, but thanks for the
>> SQL-based test case to demonstrate the issue.
>
> Pushed.
>
> David
I searched over the source tree to see if there is a similar buggy bsearch comparator and didn’t find any. However, I
founda set of comparators in common/int.h, so I think it’s better to use them here:
```
diff --git a/contrib/intarray/_int_selfuncs.c b/contrib/intarray/_int_selfuncs.c
index c3e19cdf27f..80faac8ae71 100644
--- a/contrib/intarray/_int_selfuncs.c
+++ b/contrib/intarray/_int_selfuncs.c
@@ -19,6 +19,7 @@
#include "catalog/pg_operator.h"
#include "catalog/pg_statistic.h"
#include "catalog/pg_type.h"
+#include "common/int.h"
#include "miscadmin.h"
#include "utils/fmgrprotos.h"
#include "utils/lsyscache.h"
@@ -331,10 +332,5 @@ compare_val_int4(const void *a, const void *b)
int32 key = *(int32 *) a;
int32 value = DatumGetInt32(*(const Datum *) b);
- if (key < value)
- return -1;
- else if (key > value)
- return 1;
- else
- return 0;
+ return pg_cmp_s32(key, value);
}
```
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/