Discussion: Comments for lossy ORDER BY are lacking

Comments for lossy ORDER BY are lacking

From
Andres Freund
Date:
Hi,

For not the first time I was trying to remember why and when the whole
nodeIndexscan.c:IndexNextWithReorder() business is needed. The comment
about reordering

 *        IndexNextWithReorder
 *
 *        Like IndexNext, but this version can also re-check ORDER BY
 *        expressions, and reorder the tuples as necessary.

or
+   /* Initialize sort support, if we need to re-check ORDER BY exprs */

or

+   /*
+    * If there are ORDER BY expressions, look up the sort operators for
+    * their datatypes.
+    */


nor any other easy-to-spot ones really explain that. It's not even
obvious that this isn't talking about an ordering by a plain column
("expression" could maybe be taken as a hint, but that's fairly thin).

By reading enough code one can stitch together that it's really only
needed for KNN-like ORDER BYs with lossy distance functions. It'd be
good if one had to dig less for that.
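
To make the "lossy distance" case concrete, here is a minimal standalone
sketch of the reordering idea. It is not the nodeIndexscan.c code: the
names Item, ReorderScan, scan_next and collect_ids are hypothetical, the
"index" is just an array sorted by a lower-bound distance, and the queue
is a small sorted array rather than a real heap. The assumption is that
the index hands back tuples ordered by a lower bound, the exact distance
is only known after a recheck, and a queued tuple may be emitted once its
exact distance is no larger than the last lower bound the index returned.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_CAP 16

/* One "tuple": the index only knows lower_bound; exact comes from a recheck. */
typedef struct {
    int    id;
    double lower_bound;   /* what the (lossy) index reports */
    double exact;         /* recomputed distance, always >= lower_bound */
} Item;

typedef struct {
    const Item *items;    /* simulated index output, sorted by lower_bound */
    size_t      nitems;
    size_t      next;
    Item        queue[QUEUE_CAP]; /* reorder queue, sorted by exact, ascending */
    size_t      nqueued;
    double      last_bound;       /* lower bound of the last tuple fetched */
    int         done;             /* index exhausted? */
} ReorderScan;

static void scan_init(ReorderScan *s, const Item *items, size_t n)
{
    s->items = items;
    s->nitems = n;
    s->next = 0;
    s->nqueued = 0;
    s->last_bound = 0.0;
    s->done = 0;
}

/* Insert into the sorted queue (insertion sort step). */
static void queue_push(ReorderScan *s, Item it)
{
    assert(s->nqueued < QUEUE_CAP);
    size_t i = s->nqueued++;
    while (i > 0 && s->queue[i - 1].exact > it.exact) {
        s->queue[i] = s->queue[i - 1];
        i--;
    }
    s->queue[i] = it;
}

/* Remove and return the queued item with the smallest exact distance. */
static Item queue_pop(ReorderScan *s)
{
    Item top = s->queue[0];
    s->nqueued--;
    for (size_t i = 0; i < s->nqueued; i++)
        s->queue[i] = s->queue[i + 1];
    return top;
}

/* Return 1 and fill *out with the next item in exact order, or 0 at the end. */
static int scan_next(ReorderScan *s, Item *out)
{
    for (;;) {
        /* Safe to emit from the queue: nothing still in the index can beat it. */
        if (s->nqueued > 0 &&
            (s->done || s->queue[0].exact <= s->last_bound)) {
            *out = queue_pop(s);
            return 1;
        }
        if (s->done)
            return 0;
        if (s->next == s->nitems) {
            s->done = 1;
            continue;
        }
        Item it = s->items[s->next++];
        s->last_bound = it.lower_bound;
        /* "Recheck": was the index's distance inaccurate, or is the item out of order? */
        if (it.exact > it.lower_bound ||
            (s->nqueued > 0 && it.exact > s->queue[0].exact)) {
            queue_push(s, it);
        } else {
            *out = it;
            return 1;
        }
    }
}

/* Drain the scan and record the ids in the order they are emitted. */
static size_t collect_ids(const Item *items, size_t n, int *ids)
{
    ReorderScan s;
    Item it;
    size_t count = 0;

    scan_init(&s, items, n);
    while (scan_next(&s, &it))
        ids[count++] = it.id;
    return count;
}
```

With an exact distance function the recheck never finds a mismatch and
the queue stays empty, which is why none of this matters for non-lossy
distance operators or for a plain ORDER BY on an indexed column.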


That logic was (originally) added in:

commit 35fcb1b3d038a501f3f4c87c05630095abaaadab
Author: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date:   2015-05-15 14:26:51 +0300

    Allow GiST distance function to return merely a lower-bound.


but I think some of the documentation & naming for related
data structures was a bit hard to grasp before then too - it's e.g. IMO
certainly not obvious that IndexPath.indexorderbys isn't about plain
ORDER BYs.

Greetings,

Andres Freund



Re: Comments for lossy ORDER BY are lacking

From
Andres Freund
Date:
Hi,

On 2019-04-18 17:30:20 -0700, Andres Freund wrote:
> For not the first time I was trying to remember why and when the whole
> nodeIndexscan.c:IndexNextWithReorder() business is needed. The comment
> about reordering
> 
>  *        IndexNextWithReorder
>  *
>  *        Like IndexNext, but this version can also re-check ORDER BY
>  *        expressions, and reorder the tuples as necessary.
> 
> or
> +   /* Initialize sort support, if we need to re-check ORDER BY exprs */
> 
> or
> 
> +   /*
> +    * If there are ORDER BY expressions, look up the sort operators for
> +    * their datatypes.
> +    */

Secondary point: has anybody actually checked whether the extra
reordering infrastructure is a measurable overhead? It's obviously fine
for index scans that need reordering (i.e. lossy ones), but currently
it's at least initialized for any distance-based ORDER BY.  I guess that's
largely because opclasses currently don't signal that they
might return lossy amcanorderby results, but that seems like it could
have been fixed back then?

Greetings,

Andres Freund