On 30.6.2014 23:12, Tomas Vondra wrote:
> Hi,
>
> attached is v5 of the patch. The main change is that scaling the number
> of buckets is done only once, after the initial hash table is build. The
> main advantage of this is lower price. This also allowed some cleanup of
> unecessary code.
>
> However, this new patch causes warning like this:
>
> WARNING: temporary file leak: File 231 still referenced
>
> I've been eyeballing the code for a while now, but I still fail to see
> where this comes from :-( Any ideas?
Meh, the patches are wrong - I haven't realized the tight coupling
between buckets/batches in ExecHashGetBucketAndBatch:
*bucketno = hashvalue & (nbuckets - 1);
*batchno = (hashvalue >> hashtable->log2_nbuckets) & (nbatch - 1);
The previous patches worked mostly by pure luck (the nbuckets was set
only in the first batch), but once I moved the code at the end, it
started failing. And by "worked" I mean "didn't throw an error, but
probably returned bogus results with (nbatch>1)".
However, ISTM this nbuckets-nbatch coupling is not really necessary,
it's only constructed this way to assign independent batch/bucket. So I
went and changed the code like this:
*bucketno = hashvalue & (nbuckets - 1);
*batchno = (hashvalue >> (32 - hashtable->log2_nbatch));
I.e. using the top bits for batchno, low bits for bucketno (as before).
Hopefully I got it right this time. At least it seems to be working for
cases that failed before (no file leaks, proper rowcounts so far).
regards
Tomas