Re: enable_incremental_sort changes query behavior

Поиск
Список
Период
Сортировка
От Robert Haas
Тема Re: enable_incremental_sort changes query behavior
Дата
Msg-id CA+Tgmoa3oNvLBio62jSGXCRhOotsofiqn9guus3+NTQvz6qPWA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: enable_incremental_sort changes query behavior  (James Coleman <jtc331@gmail.com>)
Ответы Re: enable_incremental_sort changes query behavior  (James Coleman <jtc331@gmail.com>)
Re: enable_incremental_sort changes query behavior  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Список pgsql-hackers
On Thu, Oct 1, 2020 at 9:03 AM James Coleman <jtc331@gmail.com> wrote:
> The plan (obtained by replacing the volatile function with a stable one):
>
>  Unique
>    ->  Nested Loop
>          ->  Gather Merge
>                Workers Planned: 2
>                ->  Sort
>                      Sort Key: ref_0.i, (md5(ref_0.t))
>                      ->  Parallel Index Scan using ref_0_i_idx on ref_0
>          ->  Function Scan on generate_series ref_1
>
> Changing `md5(ref_0.t)` to `random()::text || ref_0.t` causes the error.

As far as I remember this stuff, target lists for any nodes below the
top of the join tree were previously always just Var nodes. Any
expressions that needed to be computed were computed at the level of
the topmost join, before adding upper planner steps like Agg,
WindowAgg, Limit, etc. But, here you've got a complex expression --
md5(ref_0.t) -- begin computed somewhere in the middle of the plan
tree so that the sort can be performed at that point. However, that
seems likely to break a bunch of assumptions all over the planner,
because, unless a lot of work and surgery has been done since I looked
at this last, that md5(ref_0.t) value is not going to "bubble up" to
higher levels of the plan through the tlists of the individual nodes.
That poses a risk that it will be recomputed multiple times, which is
particularly bad for volatile functions because you might not always
get the same answer, but it seems like it might be bad even for
non-volatile functions, because it could be slow if the value is used
at multiple places in the query. It feels like the sort of thing that
might have broken some other assumptions, too, although I can't say
exactly what.

I wonder if whatever part of the incremental sort patch allowed
computation of tlist expressions below the top level of the join tree
ought to just be reverted outright, but that might be overkill. I
can't say that I really understand exactly what's going on here, but
it seems like a pretty significant behavior change to make as part of
another project, and I wonder if we're going to keep finding other
problems with it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Robert Haas
Дата:
Сообщение: Re: Partitionwise join fails under GEQO
Следующее
От: James Coleman
Дата:
Сообщение: Re: enable_incremental_sort changes query behavior