On Wed, Nov 5, 2008 at 5:06 PM, Bryce Cutt <pandasuit@gmail.com> wrote:
> The error was caused by me asserting against the wrong variable. I
> never noticed this as I apparently did not have assertions turned on
> on my development machine. That is fixed now, and with the new patch
> version I have attached, all assertions pass with your query and
> my test queries. I added another assertion to that section of the
> code so that it is a bit more rigorous in confirming the hash table
> partition is correct. It does not change the operation of the code.
>
> There are two partition counts. One holds the maximum number of
> buckets in the hash table and the other counts the number of actual
> buckets created for hash values. I was incorrectly testing against
> the second one because that was valid before I started using a hash
> table to store the buckets.
>
> The enable_hashjoin_usestatmcvs flag was valuable for my own research
> and tests and likely useful for your review but Tom is correct that it
> can be removed in the final version.
>
> - Bryce Cutt

Well, that builds nicely, lets me import the data, and I've seen a
performance improvement with enable_hashjoin_usestatmcvs on vs. off. I
plan to test that more formally (though probably not fully to the
extent you did in your paper; just enough to feel comfortable that I'm
getting similar results). Then I'll spend some time poking in the
code, for the relatively little good I feel I can do in that capacity,
and I'll also investigate scenarios with particularly inaccurate
statistics. Stay tuned.
- Josh