On Mon, 4 Feb 2019 at 18:37, Edmund Horner <ejrh00@gmail.com> wrote:
> 1. v6-0001-Add-selectivity-estimate-for-CTID-system-variables.patch
I think 0001 is good to go. It's a clear improvement over what we do today.
(t1 is a 1-million-row table with a single int column.)
Patched:
# explain (analyze, timing off) select * from t1 where ctid < '(1, 90)';
Seq Scan on t1 (cost=0.00..16925.00 rows=315 width=4) (actual rows=315 loops=1)
# explain (analyze, timing off) select * from t1 where ctid <= '(1, 90)';
Seq Scan on t1 (cost=0.00..16925.00 rows=316 width=4) (actual rows=316 loops=1)
Master:
# explain (analyze, timing off) select * from t1 where ctid < '(1, 90)';
Seq Scan on t1 (cost=0.00..16925.00 rows=333333 width=4) (actual rows=315 loops=1)
# explain (analyze, timing off) select * from t1 where ctid <= '(1, 90)';
Seq Scan on t1 (cost=0.00..16925.00 rows=333333 width=4) (actual rows=316 loops=1)
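For reference, the patched estimates are consistent with a simple linear
interpolation over the heap: with reltuples/relpages giving the average
tuples per block, a ctid bound maps to roughly block * tuples_per_block +
offset rows below it. A rough sketch of that arithmetic (illustrative only,
not the patch's actual code; the table size of ~4425 pages for t1 is inferred
from the plans above):

def estimate_ctid_rows(block, offset, reltuples, relpages, inclusive):
    # Assume tuples are packed uniformly across the heap.
    tuples_per_block = reltuples / relpages
    # Tuples in all blocks before `block`, plus the tuples at or before
    # `offset` within that block (offsets are 1-based).
    rows = block * tuples_per_block + (offset if inclusive else offset - 1)
    return round(rows)

# t1: 1 million rows in ~4425 pages (~226 tuples/page)
estimate_ctid_rows(1, 90, 1_000_000, 4425, inclusive=False)  # ctid <  '(1,90)' -> 315
estimate_ctid_rows(1, 90, 1_000_000, 4425, inclusive=True)   # ctid <= '(1,90)' -> 316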
The only risk I can foresee is that we may be more likely to underestimate
the selectivity, and a very low estimate, say 1 row, could cause the planner
to choose something like a nested loop join.
It could happen in a case like:
SELECT * FROM bloated_table WHERE ctid >= <last ctid that would exist
without bloat>
but I don't think we should keep using DEFAULT_INEQ_SEL just in case
this happens. We could probably fix 90% of those cases by returning 2
rows instead of 1.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services