Обсуждение: Obtaining a more consistent view definition when a UNION subquerycontains undecorated constants

Поиск
Список
Период
Сортировка

Obtaining a more consistent view definition when a UNION subquerycontains undecorated constants

От
Jimmy Yih
Дата:
Hello,

A colleague and I were playing around with dumping views and found an inconsistency for a view definition involving subqueries and undecorated constants in a UNION.  When we took that view definition and restored it, dumping the view gave different syntax again.  Although the slightly different view definitions were semantically the same, it was weird to see the syntax difference.

Our view SQL where 'bar' constant is not decorated:

CREATE VIEW fooview AS SELECT t.a FROM (SELECT 1 AS a, 'foo'::text AS baz UNION SELECT 0 AS a, 'bar')t;
// view definition from pg_get_viewdef()
 SELECT t.a
   FROM ( SELECT 1 AS a,
            'foo'::text AS baz
        UNION
         SELECT 0 AS a,
            'bar'::text) t;

Note that the type decorator is appended to 'bar' in the normal fashion.

Then taking the above view definition, and creating a view,

CREATE VIEW fooview AS SELECT t.a FROM (SELECT 1 AS a, 'foo'::text AS baz UNION SELECT 0 AS a, 'bar'::text)t;
// view definition from pg_get_viewdef()
 SELECT t.a
   FROM ( SELECT 1 AS a,
            'foo'::text AS baz
        UNION
         SELECT 0 AS a,
            'bar'::text AS text) t;

results in a view definition that has the alias 'AS text' appended to 'bar'::text.

Contrast this to creating a view without the subquery:

CREATE OR REPLACE VIEW fooview AS SELECT 1 AS a, 'foo'::text AS baz UNION SELECT 0 AS a, 'bar';
// view definition from pg_get_viewdef()
 SELECT 1 AS a,
    'foo'::text AS baz
UNION
 SELECT 0 AS a,
    'bar'::text AS baz;

We see that this view will use the view's tuple descriptor to name the columns.

Looking at the internal code (mostly get_from_clause_item() function), we saw that when a subquery is used, there is no tuple descriptor passed to get_query_def() function. Because of this, get_target_list() uses the resnames obtained from the pg_rewrite entry's ev_action field.  However, it seems to be fairly simple to construct a dummy tuple descriptor from the ev_action to pass down the call stack so that the column names will be consistent when deparsing both T_A_Const and T_TypeCast parse tree nodes involving a UNION.  Attached is a patch that demonstrates this.

Running with the attached patch, it seems to work pretty well:

postgres=# CREATE VIEW fooview AS SELECT t.a FROM (SELECT 1 AS a, 'foo'::text AS baz UNION SELECT 0 AS a, 'bar')t;
postgres=# select pg_get_viewdef('fooview');
           pg_get_viewdef
------------------------------------
  SELECT t.a
    FROM ( SELECT 1 AS a,
             'foo'::text AS baz
         UNION
          SELECT 0 AS a,
             'bar'::text AS baz) t;
(1 row)

postgres=# CREATE VIEW fooview AS SELECT t.a FROM (SELECT 1 AS a, 'foo'::text AS baz UNION SELECT 0 AS a, 'bar'::text)t;
postgres=# select pg_get_viewdef('fooview');
           pg_get_viewdef
------------------------------------
  SELECT t.a
    FROM ( SELECT 1 AS a,
             'foo'::text AS baz
         UNION
          SELECT 0 AS a,
             'bar'::text AS baz) t;
(1 row)

Nested subqueries also work with the patch.  We're not sure how this could break.

Is this an acceptable change that should be pursued?

Regards,
Jimmy Yih and Jim Doty
Вложения

Re: Obtaining a more consistent view definition when a UNION subquery contains undecorated constants

От
Tom Lane
Дата:
Jimmy Yih <jyih@pivotal.io> writes:
> A colleague and I were playing around with dumping views and found an
> inconsistency for a view definition involving subqueries and undecorated
> constants in a UNION.  When we took that view definition and restored it,
> dumping the view gave different syntax again.  Although the slightly
> different view definitions were semantically the same, it was weird to see
> the syntax difference.

Yeah, this is a consequence of deciding to emit the explicit cast even
though there was none there to begin with.  In many cases we have to do
that to ensure that the view will re-parse with the same semantics, but
I wonder if it would be possible to suppress it for top-level SELECT
list items.  That would get rid of both stages of the apparent view
mutation not just one.  The question that requires thought is whether
the view could ever re-parse differently, but offhand it seems like it
could not: there's only one rule for what an implicitly typed constant
could become in that situation.

> Looking at the internal code (mostly get_from_clause_item() function), we
> saw that when a subquery is used, there is no tuple descriptor passed to
> get_query_def() function. Because of this, get_target_list() uses the
> resnames obtained from the pg_rewrite entry's ev_action field.  However, it
> seems to be fairly simple to construct a dummy tuple descriptor from the
> ev_action to pass down the call stack so that the column names will be
> consistent when deparsing both T_A_Const and T_TypeCast parse tree nodes
> involving a UNION.  Attached is a patch that demonstrates this.

I'm afraid that this just moves the weird cases somewhere else, ie
you might see an AS clause where you had none before, or a different
AS clause from what you originally wrote.  The parser has emitted the
same parse tree as if there were an explicit AS there; I doubt that
we want ruleutils to second-guess that unless it really has to.

            regards, tom lane


Re: Obtaining a more consistent view definition when a UNION subquerycontains undecorated constants

От
Jacob Champion
Дата:
On Thu, Sep 27, 2018 at 3:38 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Jimmy Yih <jyih@pivotal.io> writes:
> > Looking at the internal code (mostly get_from_clause_item() function), we
> > saw that when a subquery is used, there is no tuple descriptor passed to
> > get_query_def() function. Because of this, get_target_list() uses the
> > resnames obtained from the pg_rewrite entry's ev_action field.  However, it
> > seems to be fairly simple to construct a dummy tuple descriptor from the
> > ev_action to pass down the call stack so that the column names will be
> > consistent when deparsing both T_A_Const and T_TypeCast parse tree nodes
> > involving a UNION.  Attached is a patch that demonstrates this.
>
> I'm afraid that this just moves the weird cases somewhere else, ie
> you might see an AS clause where you had none before, or a different
> AS clause from what you originally wrote.  The parser has emitted the
> same parse tree as if there were an explicit AS there; I doubt that
> we want ruleutils to second-guess that unless it really has to.

Can you give a quick example of something that "breaks" with this
approach? I think we're having trouble seeing it.

It might help to have some additional context on why we care: this
problem shows up in pg_upgrade testing, since we're basically
performing the following steps:
- create a view in the old database
- dump the old database schema
- load the schema into the new database
- dump the new database schema and compare to the old one

So from this perspective, we don't mind so much if the view definition
changes between creation and dump, but we do mind if it changes a
second time after the dump has been restored, since it shows up as a
false negative in the diff. In other words, we'd like the dumped view
definitions to be "stable" with respect to dumps and restores.

Thanks,
--Jacob