On Thu, Dec 8, 2016 at 10:28 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
> Maybe it would help for Jeff to use elog_node_display() to the nodes
> that are causing the problem - e.g. outerpathkeys and innerpathkeys
> and best_path->path_mergeclauses, or just best_path - at the point
> where the error is thrown. That might give us enough information to
> see what's broken.
I'll be astonished if that's sufficient evidence. We already know that the problem is that the input path doesn't claim to be sorted in a way that would match the merge clauses, but that doesn't tell us how such a path came to be generated (or, if it wasn't intentionally done, where the data structure got clobbered later).
It's possible that setting a breakpoint at create_mergejoin_path and capturing stack traces for all calls would yield usable insight. But there are likely to be lots of calls if this is an 8-way join query, and probably only a few are wrong.
I'd much rather have a test case than try to debug this remotely. Bandwidth too low.
I didn't get any assertion failures on the assert-enabled build.
I have a test case in which I made the fdw connect back to itself and stripped out every object I could while still reproducing the problem. It is large (21MB compressed, 163MB uncompressed), so I am linking it here:
When running this against a default-configured 9.6.1 or 10devel-a73491e, I get the error.
psql -U postgres -f <(xzcat combined_anon.sql.xz)
...
ERROR: outer pathkeys do not match mergeclauses
STATEMENT: explain update local1 set local19 = jj_join.local19 from jj_join where local1.local12=jj_join.local12 and local1.local12 in ('ddd','bbb','aaa','abc');
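
For anyone who wants to try the breakpoint idea quoted above, here is a rough sketch of how the stack traces for every create_mergejoin_path call might be captured from a single backend. It assumes gdb is installed and the server was built with debug symbols. The function names write_gdb_commands and trace_backend and the log file name mergejoin_bt.log are made up for illustration; the backend PID would come from running "select pg_backend_pid();" in the session that will run the EXPLAIN.

```shell
#!/usr/bin/env bash
# Sketch only: log a backtrace for every create_mergejoin_path() call
# made by one backend. Assumes gdb and a build with debug symbols.

# Write a gdb command file to the path given as $1 (hypothetical helper).
write_gdb_commands() {
    cat > "$1" <<'EOF'
set pagination off
break create_mergejoin_path
commands
  silent
  bt
  continue
end
continue
EOF
}

# Attach to backend PID $1 and append all gdb output to log file $2
# (hypothetical helper).
trace_backend() {
    local pid="$1" log="$2" cmds
    cmds="$(mktemp)"
    write_gdb_commands "$cmds"
    gdb -p "$pid" -batch -x "$cmds" >> "$log" 2>&1
    rm -f "$cmds"
}

# Only attach when a backend PID is given on the command line.
if [ "$#" -ge 1 ]; then
    trace_backend "$1" "${2:-mergejoin_bt.log}"
fi
```

With gdb attached this way, running the failing EXPLAIN in that session should append one backtrace per call to the log. As noted above, an 8-way join is likely to produce a lot of them, and probably only a few are wrong.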