Re: Re: Reusing abbreviated keys during second pass of ordered [set] aggregates

Поиск
Список
Период
Сортировка
От Peter Geoghegan
Тема Re: Re: Reusing abbreviated keys during second pass of ordered [set] aggregates
Дата
Msg-id CAM3SWZT94eM-0BFTcaCLcWGCL3ehpAGXFDJqs8M1L8YuquKTfw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Re: Reusing abbreviated keys during second pass of ordered [set] aggregates  (Robert Haas <robertmhaas@gmail.com>)
Ответы Re: Re: Reusing abbreviated keys during second pass of ordered [set] aggregates  (Robert Haas <robertmhaas@gmail.com>)
Список pgsql-hackers
On Fri, Dec 18, 2015 at 9:35 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> Attached revision updates both the main commit (the optimization), and
>> the backpatch commit (updated the contract).
>
> -       /* abbreviation is possible here only for by-reference types */
> +       /*
> +        * Abbreviation is possible here only for by-reference types.
> Note that a
> +        * pass-by-value representation for abbreviated values is forbidden, but
> +        * that's a distinct, generic restriction imposed by the SortSupport
> +        * contract.
>
> I think that you have not written what you meant to write here.  I
> think what you mean is "Note that a pass-by-REFERENCE representation
> for abbreviated values is forbidden...".

You're right. Sorry about that.

> +       /*
> +        * If we produced only one initial run (quite likely if the total data
> +        * volume is between 1X and 2X workMem), we can just use that
> tape as the
> +        * finished output, rather than doing a useless merge.  (This obvious
> +        * optimization is not in Knuth's algorithm.)
> +        */
> +       if (state->currentRun == 1)
> +       {
> +               state->result_tape = state->tp_tapenum[state->destTape];
> +               /* must freeze and rewind the finished output tape */
> +               LogicalTapeFreeze(state->tapeset, state->result_tape);
> +               state->status = TSS_SORTEDONTAPE;
> +               return;
> +       }
>
> I don't understand the point of moving this code.  If there's some
> reason to do this after rewinding the tapes rather than beforehand, I
> think we should articulate that reason in the comment block.

I thought that was made clear by the 0001 commit message. Think of
what happens when we don't disable abbreviated in the final
TSS_SORTEDONTAPE phase should the "if (state->currentRun == 1)" path
have been taken (but *not* the path that also ends in
TSS_SORTEDONTAPE, when caller requires randomAccess but we spill to
tape, or any other case). What happens is: The code in 0002 gets
confused, and attempts to pass back a pointer value as an "abbreviated
key". That's a bug.

> The last hunk in your 0001 patch properly belongs in 0002.

You could certainly argue that the last hunk of 0001 belongs in 0002.
I only moved it to 0001 when I realized that we might as well keep the
branches in sync, since the ordering is insignificant from a 9.5
perspective (although it might still be tidier), and there is a need
to backpatch anyway. I'm not insisting on doing it that way.

-- 
Peter Geoghegan



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Peter Geoghegan
Дата:
Сообщение: Re: Refactoring speculative insertion with unique indexes a little
Следующее
От: Artur Zakirov
Дата:
Сообщение: Fuzzy substring searching with the pg_trgm extension