> Did you read by chance my previous message intitled
> "Generalized Top Queries on PostgreSQL"?
I vaguely recall it, but forget the details...
> - You cannot select the top N rows according to criterion A ordering
> the results with a different criterion B.
True, but I don't see how to do that with one indexscan (for that
matter, I don't even see how to express it in the SQL subset that
we support...)
> - If you ask for the best 10 rows, from a relation including
> 100000 rows, you have to do a traditional sort on 100000 rows and
> then retain only the first 10, doing more comparisons than requested.
Not if there's an index that implements the ordering --- and if there
is not, I don't see how to avoid the sort anyway.
> - You can choose a "fast-start" plan (i.e., basically,
> a pipelined plan), but you cannot performe an "early-stop" of
> the stream when you have a "slow-start" plan (e.g. involving sorts
> or hash tables).
Why not? The executor *will* stop when it has as many output rows as
the LIMIT demands.
> We noticed that this kind of plan often outperforms the first one.
I'd be the first to admit that the cost model needs some fine-tuning
still. It's just a conceptual structure at this point.
> Actually, we should say we can't figure out the reason for
> managing the LIMIT clause in a so complicated way, not providing
> a node in the plan as any other operator.
We will probably end up doing it like that sooner or later, in order to
allow attaching LIMIT to sub-selects. I don't take any credit or blame
for the execution-time implementation of LIMIT; I just worked with what
I found...
regards, tom lane