I wrote:
> If I set both collapse_limit variables to very high values (I used 999),
> it takes ... um ... not sure; I gave up waiting after half an hour.
> I also tried with geqo_effort reduced to the minimum of 1, but that
> didn't produce a plan in reasonable time either (I gave up after ten
> minutes).
Having given up on keeping the machine idle to get a fair timing,
I turned on oprofile monitoring instead. The results are rather interesting:
samples   %        image name   symbol name
886498    53.8090  postgres     have_relevant_eclass_joinclause
460596    27.9574  postgres     bms_overlap
142764     8.6655  postgres     bms_is_subset
126274     7.6646  postgres     have_join_order_restriction
 14205     0.8622  postgres     list_nth_cell
  2721     0.1652  postgres     generate_join_implied_equalities
  2445     0.1484  libc-2.9.so  memset
  2202     0.1337  postgres     have_relevant_joinclause
  1678     0.1019  postgres     make_canonical_pathkey
  1648     0.1000  postgres     pfree
   884     0.0537  postgres     bms_union
   762     0.0463  postgres     gimme_tree
   660     0.0401  libc-2.9.so  memcpy
   571     0.0347  postgres     AllocSetFree
   475     0.0288  postgres     AllocSetAlloc
   431     0.0262  postgres     has_relevant_eclass_joinclause
   389     0.0236  postgres     check_list_invariants
   260     0.0158  postgres     join_is_legal
   238     0.0144  postgres     bms_copy
So maybe a redesign of the equivalence-class joinclause mechanism is in
order. Still, this is unlikely to fix the fundamental issue that the
time for large join problems grows nonlinearly.
regards, tom lane