On Mon, Dec 8, 2008 at 10:24 AM, Gregory Stark <stark@enterprisedb.com> wrote:
> I tried a different query, trying to get quadratic growth and again failed. It
The profiling results I sent the other day show an exactly-linear
increase in the number of times eqjoinsel invokes FunctionCall2.
Reading through the the eqjoinsel_inner loop in selfuncs.c beginning
around line 2042, I think what is happening is this: since the two
tables are really the same table, nvalues1 and nvalues2 are the same
array, and therefore contain the same elements in the same order. As
a result, for each i, we skip over the first i - 1 entries in
nvalues2, which have already been matched, and then compare element i
of nvalues1 to element i of nvalues2 and mark them both as matched.
Although this is technically an O(n^2) algorithm, the constant is very
low, because this code is Really Fast:
if (hasmatch[j]) continue;
To get observable quadratic behavior, I think you might need to
construct two MCV arrays where all the values are the same, but the
arrays are not in the same order. I believe they are sorted by
frequency, so it might be sufficient to arrange things so that the
N'th most common value in one table is the (statistics_target + 1 -
N)'th most common value in the other. It might also help to use a
function with a real slow comparison function (perhaps one
intentionally constructed to be slow).
...Robert