Hello everyone,
> Apparently, the planner isn't reusing the data boundaries across alternative
> plans. It would be nicer if the planner remembered each column boundaries
> for later reuse (within the same planner execution).
I've written a very naive (and crappy) patch to show how adding
memorization to get_actual_variable_range() could help the planner on
scenarios with a big number of joins.
For the previous example,
> explain (analyze, buffers)
> select * from a
> join b b1 on (b1.a = a.a)
> join b b2 on (b2.a = a.a)
> where b1.a in (1,100,10000,1000000,1000001);
each time you add a join clause the planner has to read an extra ~5[K]
buffers and gets about 200[ms] slower.
1 join
Planning:
Buffers: shared hit=9 read=27329
Planning Time: 101.745 ms
Execution Time: 0.082 ms
2 joins
Planning:
Buffers: shared hit=42 read=81988
Planning Time: 303.237 ms
Execution Time: 0.102 ms
3 joins
Planning:
Buffers: shared hit=94 read=136660
Planning Time: 508.947 ms
Execution Time: 0.155 ms
4 joins
Planning:
Buffers: shared hit=188 read=191322
Planning Time: 710.981 ms
Execution Time: 0.168 ms
After adding memorization the cost in buffers remains constant and the
latency deteriorates only marginally (as expected) with each join.
1 join
Planning:
Buffers: shared hit=10 read=27328
Planning Time: 97.889 ms
Execution Time: 0.066 ms
2 joins
Planning:
Buffers: shared hit=7 read=27331
Planning Time: 100.589 ms
Execution Time: 0.111 ms
3 joins
Planning:
Buffers: shared hit=9 read=27329
Planning Time: 105.669 ms
Execution Time: 0.134 ms
4 joins
Planning:
Buffers: shared hit=132 read=27370
Planning Time: 155.716 ms
Execution Time: 0.219 ms
I'd be happy to improve this patch into something better. Though I'd
like suggestions on how to do it:
I have this idea of creating a local "memorization" struct instance within
standard_planner(). That would require passing on a pointer down until
it reaches get_actual_variable_range(), which I think would be quite
ugly, if done just to improve the planner for this scenario.
Is there any better mechanism I could reuse from other modules? (utils
or cache, for example).
Regards,
Manuel