Обсуждение: BUG #16121: 12 regression: Volatile function in target list subquery behave as stable

Поиск
Список
Период
Сортировка

BUG #16121: 12 regression: Volatile function in target list subquery behave as stable

От
PG Bug reporting form
Дата:
The following bug has been logged on the website:

Bug reference:      16121
Logged by:          Elvis Pranskevichus
Email address:      elprans@gmail.com
PostgreSQL version: 12.1
Operating system:   Gentoo Linux
Description:

There seems to be a regression in Postgres 12 related to how volatile
functions are handled. If put inside a subquery in the target list, the
volatile function seems to behave like a stable one:

volatility=# SELECT
   v.i,
   (SELECT random() from (values (v.i)) as q) AS v1
FROM
   generate_series(1, 2) as v(i);
 i │         v1
───┼────────────────────
 1 │ 0.8303615124282295
 2 │ 0.8303615124282295
(2 rows)

Postgres 11 and earlier worked correctly:

volatility=# SELECT
   v.i,
   (SELECT random() from (values (v.i)) as q) AS v1
FROM
   generate_series(1, 2) as v(i);
 i │        v1
───┼───────────────────
 1 │ 0.137472071684897
 2 │ 0.288798084016889
(2 rows)


Re: BUG #16121: 12 regression: Volatile function in target list subquery behave as stable

От
Tom Lane
Дата:
PG Bug reporting form <noreply@postgresql.org> writes:
> There seems to be a regression in Postgres 12 related to how volatile
> functions are handled. If put inside a subquery in the target list, the
> volatile function seems to behave like a stable one:

> volatility=# SELECT
>    v.i,
>    (SELECT random() from (values (v.i)) as q) AS v1
> FROM
>    generate_series(1, 2) as v(i);
>  i │         v1
> ───┼────────────────────
>  1 │ 0.8303615124282295
>  2 │ 0.8303615124282295
> (2 rows)

While there's certainly been a change in behavior here, it's hard
to call it a bug, because I do not think we promise anything much
about cases like this.  The reason why v12 is no longer evaluating
random() more than once is that it's successfully simplified the
sub-select to just "(SELECT random())", which is supposed to be
evaluated only once, because it's not correlated to the outer query.

Previous versions failed to do that, or at least failed to do it
early enough to remove the v.i reference before making the decision
as to whether the sub-select is correlated or not.  (Just to check,
I bisected to see where the behavior changed, and found that it was
at commit 4be058fe9, "In the planner, replace an empty FROM clause
with a dummy RTE".  Smarter handling of trivial VALUES items was an
explicit goal of that commit.)

So basically, you're depending on an assumption that the system
won't optimize away the unused reference to v.i, and now it's
started to do so.  I do not wish to regard that as a bug; for
most people affected by this change, it's surely an improvement.

In the given example, you'd be a lot better off skipping the
sub-select altogether and just writing "random() AS v1".
I suppose this is an oversimplified form of some production query,
but we'd need to see something closer to the real query to have
an intelligent discussion about what to do instead.

            regards, tom lane



Re: BUG #16121: 12 regression: Volatile function in target list subquery behave as stable

От
Elvis Pranskevichus
Дата:
On Monday, November 18, 2019 11:47:55 A.M. EST Tom Lane wrote:
> PG Bug reporting form <noreply@postgresql.org> writes:
> > There seems to be a regression in Postgres 12 related to how
> > volatile
> > functions are handled. If put inside a subquery in the target list,
> > the volatile function seems to behave like a stable one:
> >
> > volatility=# SELECT
> >
> >    v.i,
> >    (SELECT random() from (values (v.i)) as q) AS v1
> >
> > FROM
> >
> >    generate_series(1, 2) as v(i);
> >
> >  i │         v1
> >
> > ───┼────────────────────
> >
> >  1 │ 0.8303615124282295
> >  2 │ 0.8303615124282295
> >
> > (2 rows)
>
> While there's certainly been a change in behavior here, it's hard
> to call it a bug, because I do not think we promise anything much
> about cases like this.  The reason why v12 is no longer evaluating
> random() more than once is that it's successfully simplified the
> sub-select to just "(SELECT random())", which is supposed to be
> evaluated only once, because it's not correlated to the outer query.
>
> Previous versions failed to do that, or at least failed to do it
> early enough to remove the v.i reference before making the decision
> as to whether the sub-select is correlated or not.  (Just to check,
> I bisected to see where the behavior changed, and found that it was
> at commit 4be058fe9, "In the planner, replace an empty FROM clause
> with a dummy RTE".  Smarter handling of trivial VALUES items was an
> explicit goal of that commit.)
>
> So basically, you're depending on an assumption that the system
> won't optimize away the unused reference to v.i, and now it's
> started to do so.  I do not wish to regard that as a bug; for
> most people affected by this change, it's surely an improvement.
>
> In the given example, you'd be a lot better off skipping the
> sub-select altogether and just writing "random() AS v1".
> I suppose this is an oversimplified form of some production query,
> but we'd need to see something closer to the real query to have
> an intelligent discussion about what to do instead.

Hmm.  We're generating those queries, so hoisting to the outer query is
not always possible, especially when there's cardinality mismatch and
the subquery is aggregated somehow.  I guess we would need to find
another way to convince the optimizer to leave the evaluation of the
volatile call in place.

Thanks,

                               Elvis