On Wed, 12 Oct 2022 at 13:06, Klint Gore <kgore4@une.edu.au> wrote:
> Limit (cost=0.56..28349.28 rows=1 width=4) (actual time=0.039..0.040 rows=1 loops=1)
> -> Unique (cost=0.56..28349.28 rows=1 width=4) (actual time=0.039..0.039 rows=1 loops=1)
> -> Index Only Scan using idx on tbl (cost=0.56..28349.28 rows=995241 width=4) (actual time=0.038..0.038
rows=1loops=1)
> Index Cond: (fld = 230)
> Heap Fetches: 0
> Planning Time: 0.066 ms
> Execution Time: 0.047 ms
>
> With the distinct and the limit, the planner somehow knows to push the either the distinct or the limit into the
indexonly scan so the unique for distinct only had 1 row and the outer limit only had 1 row. Without the limit, the
distinctstill does the index only scan but has to do the unique on the million rows and execution time goes to about
100ms.
I think that would be very simple to fix. I believe I've done that
locally but just detecting if needed_pathkeys == NULL in
create_final_distinct_paths().
i.e.
- if (pathkeys_contained_in(needed_pathkeys,
path->pathkeys))
+ if (needed_pathkeys == NIL)
+ {
+ Node *limitCount = makeConst(INT8OID,
-1, InvalidOid,
+
sizeof(int64),
+
Int64GetDatum(1), false,
+
FLOAT8PASSBYVAL);
+ add_path(distinct_rel, (Path *)
+
create_limit_path(root, distinct_rel, path, NULL,
+
limitCount, LIMIT_OPTION_COUNT, 0,
+
1));
+ }
+ else if
(pathkeys_contained_in(needed_pathkeys, path->pathkeys))
That just adds a Limit Path instead of the Unique Path. i.e:
postgres=# explain (analyze, costs off) select distinct a from t1 where a = 0;
QUERY PLAN
--------------------------------------------------------------------------------------
Limit (actual time=0.074..0.075 rows=1 loops=1)
-> Index Only Scan using t1_a_idx on t1 (actual time=0.072..0.073
rows=1 loops=1)
Index Cond: (a = 0)
Heap Fetches: 1
Planning Time: 0.146 ms
Execution Time: 0.100 ms
(6 rows)
However, I might be wrong about that. I've not given it too much thought.
David