I've been trying to troubleshoot the cause of this PostGIS recheck bug we
have reported by two people so far. The last test was a nice simple
repeatable one that triggered the issue:
https://trac.osgeo.org/postgis/ticket/3418
from what I have seen this only affects cases where we are doing a distance
check between two points, which we actually don't need to enable recheck for
anyway, but trying to disable that seems like just shoving the real problem
under the covers.
Where it errors is this line 272 in src/backend/executor/nodeIndexscan
https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/exe
cutor/nodeIndexscan.c;h=3143bd94ec4499fba94b41693538b785c4b32e6c;hb=HEAD#l27
2
/*259 * Was the ORDER BY value returned by the index accurate?
The260 * recheck flag means that the index can return inaccurate
values,261 * but then again, the value returned for any particular
tuple262 * could also be exactly correct. Compare the value
returned by263 * the index with the recalculated value. (If the value
returned264 * by the index happened to be exact right, we can often
avoid265 * pushing the tuple to the queue, just to pop it back out
again.)266 */267 cmp = cmp_orderbyvals(node->iss_OrderByValues,268
node->iss_OrderByNulls,269 scandesc->xs_orderbyvals,270
scandesc->xs_orderbynulls,271 node);272 if (cmp < 0)273
elog(ERROR, "index returned tuples in wrong order");274 else if (cmp == 0)275
was_exact= true;276 else277 was_exact = false;
If things are out of order, why isn't just going to was_exact = false good
enough?
I'm not sure if the mistake is in our PostGIS code or something in
PostgreSQL recheck logic.
If I change the elog(ERROR ...) to a elog(NOTICE, the answers are correct
and sort order is right.
Under what conditions would cmp return less than 0? I tried following the
code in cmp_orderbyvals, but got lost
and trying to put elog notices in to see what the distance is returning (I
probably did it wrong), just ended up crashing by backend.
Thanks for any thoughts,
Regina