Re: BUG #16122: segfault pg_detoast_datum (datum=0x0) at fmgr.c:1833 numrange query

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: BUG #16122: segfault pg_detoast_datum (datum=0x0) at fmgr.c:1833 numrange query
Дата
Msg-id 5794.1578342514@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: BUG #16122: segfault pg_detoast_datum (datum=0x0) at fmgr.c:1833numrange query  (Andrey Borodin <x4mmm@yandex-team.ru>)
Ответы Re: BUG #16122: segfault pg_detoast_datum (datum=0x0) at fmgr.c:1833numrange query  (Michael Paquier <michael@paquier.xyz>)
Список pgsql-bugs
Andrey Borodin <x4mmm@yandex-team.ru> writes:
> PFA possible fix for this.

I looked into this a bit too.  I don't have a lot of faith in Michael's
example; it took me four tries to find a machine it would fail on.
Still, I did eventually get a crash, and I concur that the problem is
that calc_hist_selectivity_contained tries to access the entry after
the histogram end.

I believe that Michael's draft fix is actually about right, because
we don't want to treat the key space above the histogram upper bound
as being an actual bin: we should assume that there are no values there.
So just starting the loop at the last real bin is sufficient.  However,
I'd prefer to handle the two edge cases (i.e., probe value below histogram
lower bound or above histogram upper bound) explicitly, because that makes
it more apparent what's going on.

There are a couple of other ways in which this code seems insufficiently
paranoid to me:

* It seems worth spending a couple lines of code at the beginning to
verify that the histogram has at least one bin, as it already did for
the range-length histogram.  Otherwise we've got various divide-by-zero
hazards here, along with a need for extra tests in the looping functions.

* I think we'd better guard against NaNs coming back from the range
subdiff function, as well as the possibility of getting a NaN from
the division in get_position (compare [1]).

* I am not quite totally sure about this part, but it seems to me
that calc_hist_selectivity_contains probably ought to handle the
range-bound cases the same as calc_hist_selectivity_contained,
even though only the latter is trying to access an unsafe array
index.

That all leads me to the attached draft patch.  It lacks a test
case --- I wonder if we can find one that crashes more portably
than Michael's?  And more eyeballs on calc_hist_selectivity_contains
would be good too.

            regards, tom lane

[1] https://www.postgresql.org/message-id/12957.1577974305@sss.pgh.pa.us

diff --git a/src/backend/utils/adt/rangetypes_selfuncs.c b/src/backend/utils/adt/rangetypes_selfuncs.c
index a4d7a7a..0a41016 100644
--- a/src/backend/utils/adt/rangetypes_selfuncs.c
+++ b/src/backend/utils/adt/rangetypes_selfuncs.c
@@ -400,6 +400,13 @@ calc_hist_selectivity(TypeCacheEntry *typcache, VariableStatData *vardata,
                            ATTSTATSSLOT_VALUES)))
         return -1.0;

+    /* check that it's a histogram, not just a dummy entry */
+    if (hslot.nvalues < 2)
+    {
+        free_attstatsslot(&hslot);
+        return -1.0;
+    }
+
     /*
      * Convert histogram of ranges into histograms of its lower and upper
      * bounds.
@@ -696,21 +703,22 @@ get_position(TypeCacheEntry *typcache, const RangeBound *value, const RangeBound
             return 0.5;

         /* Calculate relative position using subdiff function. */
-        bin_width = DatumGetFloat8(FunctionCall2Coll(
-                                                     &typcache->rng_subdiff_finfo,
+        bin_width = DatumGetFloat8(FunctionCall2Coll(&typcache->rng_subdiff_finfo,
                                                      typcache->rng_collation,
                                                      hist2->val,
                                                      hist1->val));
-        if (bin_width <= 0.0)
+        if (isnan(bin_width) || bin_width <= 0.0)
             return 0.5;            /* zero width bin */

-        position = DatumGetFloat8(FunctionCall2Coll(
-                                                    &typcache->rng_subdiff_finfo,
+        position = DatumGetFloat8(FunctionCall2Coll(&typcache->rng_subdiff_finfo,
                                                     typcache->rng_collation,
                                                     value->val,
                                                     hist1->val))
             / bin_width;

+        if (isnan(position))
+            return 0.5;            /* punt if we have any NaNs or Infs */
+
         /* Relative position must be in [0,1] range */
         position = Max(position, 0.0);
         position = Min(position, 1.0);
@@ -806,11 +814,19 @@ get_distance(TypeCacheEntry *typcache, const RangeBound *bound1, const RangeBoun
          * value of 1.0 if no subdiff is available.
          */
         if (has_subdiff)
-            return
-                DatumGetFloat8(FunctionCall2Coll(&typcache->rng_subdiff_finfo,
-                                                 typcache->rng_collation,
-                                                 bound2->val,
-                                                 bound1->val));
+        {
+            float8        res;
+
+            res = DatumGetFloat8(FunctionCall2Coll(&typcache->rng_subdiff_finfo,
+                                                   typcache->rng_collation,
+                                                   bound2->val,
+                                                   bound1->val));
+            /* Protect against possible NaN result */
+            if (isnan(res))
+                return 1.0;
+            else
+                return res;
+        }
         else
             return 1.0;
     }
@@ -1021,16 +1037,28 @@ calc_hist_selectivity_contained(TypeCacheEntry *typcache,
                                  false);

     /*
+     * If the upper bound value is below the histogram's lower limit, there
+     * are no matches.
+     */
+    if (upper_index < 0)
+        return 0.0;
+
+    /*
+     * If the upper bound value is at or beyond the histogram's upper limit,
+     * start our loop at the last actual bin.  (This corresponds to assuming
+     * that the data population above the histogram's upper limit is empty,
+     * exactly like what we just assumed for the lower limit.)
+     */
+    upper_index = Min(upper_index, hist_nvalues - 2);
+
+    /*
      * Calculate upper_bin_width, ie. the fraction of the (upper_index,
      * upper_index + 1) bin which is greater than upper bound of query range
      * using linear interpolation of subdiff function.
      */
-    if (upper_index >= 0 && upper_index < hist_nvalues - 1)
-        upper_bin_width = get_position(typcache, upper,
-                                       &hist_lower[upper_index],
-                                       &hist_lower[upper_index + 1]);
-    else
-        upper_bin_width = 0.0;
+    upper_bin_width = get_position(typcache, upper,
+                                   &hist_lower[upper_index],
+                                   &hist_lower[upper_index + 1]);

     /*
      * In the loop, dist and prev_dist are the distance of the "current" bin's
@@ -1125,15 +1153,27 @@ calc_hist_selectivity_contains(TypeCacheEntry *typcache,
                                  true);

     /*
+     * If the lower bound value is below the histogram's lower limit, there
+     * are no matches.
+     */
+    if (lower_index < 0)
+        return 0.0;
+
+    /*
+     * If the lower bound value is at or beyond the histogram's upper limit,
+     * start our loop at the last actual bin.  (This corresponds to assuming
+     * that the data population above the histogram's upper limit is empty,
+     * exactly like what we just assumed for the lower limit.)
+     */
+    lower_index = Min(lower_index, hist_nvalues - 2);
+
+    /*
      * Calculate lower_bin_width, ie. the fraction of the of (lower_index,
      * lower_index + 1) bin which is greater than lower bound of query range
      * using linear interpolation of subdiff function.
      */
-    if (lower_index >= 0 && lower_index < hist_nvalues - 1)
-        lower_bin_width = get_position(typcache, lower, &hist_lower[lower_index],
-                                       &hist_lower[lower_index + 1]);
-    else
-        lower_bin_width = 0.0;
+    lower_bin_width = get_position(typcache, lower, &hist_lower[lower_index],
+                                   &hist_lower[lower_index + 1]);

     /*
      * Loop through all the lower bound bins, smaller than the query lower

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Jeff Janes
Дата:
Сообщение: Re: BUG #16183: PREPARED STATEMENT slowed down by jit
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: BUG #16190: The usage of NULL pointer in refint.c