On Fri, Dec 18, 2015 at 9:35 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> Attached revision updates both the main commit (the optimization), and
>> the backpatch commit (updated the contract).
>
> - /* abbreviation is possible here only for by-reference types */
> + /*
> + * Abbreviation is possible here only for by-reference types.
> Note that a
> + * pass-by-value representation for abbreviated values is forbidden, but
> + * that's a distinct, generic restriction imposed by the SortSupport
> + * contract.
>
> I think that you have not written what you meant to write here. I
> think what you mean is "Note that a pass-by-REFERENCE representation
> for abbreviated values is forbidden...".
You're right. Sorry about that.
> + /*
> + * If we produced only one initial run (quite likely if the total data
> + * volume is between 1X and 2X workMem), we can just use that
> tape as the
> + * finished output, rather than doing a useless merge. (This obvious
> + * optimization is not in Knuth's algorithm.)
> + */
> + if (state->currentRun == 1)
> + {
> + state->result_tape = state->tp_tapenum[state->destTape];
> + /* must freeze and rewind the finished output tape */
> + LogicalTapeFreeze(state->tapeset, state->result_tape);
> + state->status = TSS_SORTEDONTAPE;
> + return;
> + }
>
> I don't understand the point of moving this code. If there's some
> reason to do this after rewinding the tapes rather than beforehand, I
> think we should articulate that reason in the comment block.
I thought that was made clear by the 0001 commit message. Think of
what happens when we don't disable abbreviated in the final
TSS_SORTEDONTAPE phase should the "if (state->currentRun == 1)" path
have been taken (but *not* the path that also ends in
TSS_SORTEDONTAPE, when caller requires randomAccess but we spill to
tape, or any other case). What happens is: The code in 0002 gets
confused, and attempts to pass back a pointer value as an "abbreviated
key". That's a bug.
> The last hunk in your 0001 patch properly belongs in 0002.
You could certainly argue that the last hunk of 0001 belongs in 0002.
I only moved it to 0001 when I realized that we might as well keep the
branches in sync, since the ordering is insignificant from a 9.5
perspective (although it might still be tidier), and there is a need
to backpatch anyway. I'm not insisting on doing it that way.
--
Peter Geoghegan