Обсуждение: Deparsing rewritten query

Поиск
Список
Период
Сортировка

Deparsing rewritten query

От
Julien Rouhaud
Дата:
Hi,

I sometimes have to deal with queries referencing multiple and/or complex
views.  In such cases, it's quite troublesome to figure out what is the query
really executed.  Debug_print_rewritten isn't really useful for non trivial
queries, and manually doing the view expansion isn't great either.

While not being ideal, I wouldn't mind using a custom extension for that but
this isn't an option as get_query_def() is private and isn't likely to change.

As an alternative, maybe we could expose a simple SRF that would take care of
rewriting the query and deparsing the resulting query tree(s)?

I'm attaching a POC patch for that, adding a new pg_get_query_def(text) SRF.

Usage example:

SELECT pg_get_query_def('SELECT * FROM shoe') as def;
                          def
--------------------------------------------------------
  SELECT shoename,                                     +
     sh_avail,                                         +
     slcolor,                                          +
     slminlen,                                         +
     slminlen_cm,                                      +
     slmaxlen,                                         +
     slmaxlen_cm,                                      +
     slunit                                            +
    FROM ( SELECT sh.shoename,                         +
             sh.sh_avail,                              +
             sh.slcolor,                               +
             sh.slminlen,                              +
             (sh.slminlen * un.un_fact) AS slminlen_cm,+
             sh.slmaxlen,                              +
             (sh.slmaxlen * un.un_fact) AS slmaxlen_cm,+
             sh.slunit                                 +
            FROM shoe_data sh,                         +
             unit un                                   +
           WHERE (sh.slunit = un.un_name)) shoe;       +

(1 row)

Вложения

Re: Deparsing rewritten query

От
Pavel Stehule
Дата:


ne 27. 6. 2021 v 6:11 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
Hi,

I sometimes have to deal with queries referencing multiple and/or complex
views.  In such cases, it's quite troublesome to figure out what is the query
really executed.  Debug_print_rewritten isn't really useful for non trivial
queries, and manually doing the view expansion isn't great either.

While not being ideal, I wouldn't mind using a custom extension for that but
this isn't an option as get_query_def() is private and isn't likely to change.

As an alternative, maybe we could expose a simple SRF that would take care of
rewriting the query and deparsing the resulting query tree(s)?

I'm attaching a POC patch for that, adding a new pg_get_query_def(text) SRF.

Usage example:

SELECT pg_get_query_def('SELECT * FROM shoe') as def;
                          def
--------------------------------------------------------
  SELECT shoename,                                     +
     sh_avail,                                         +
     slcolor,                                          +
     slminlen,                                         +
     slminlen_cm,                                      +
     slmaxlen,                                         +
     slmaxlen_cm,                                      +
     slunit                                            +
    FROM ( SELECT sh.shoename,                         +
             sh.sh_avail,                              +
             sh.slcolor,                               +
             sh.slminlen,                              +
             (sh.slminlen * un.un_fact) AS slminlen_cm,+
             sh.slmaxlen,                              +
             (sh.slmaxlen * un.un_fact) AS slmaxlen_cm,+
             sh.slunit                                 +
            FROM shoe_data sh,                         +
             unit un                                   +
           WHERE (sh.slunit = un.un_name)) shoe;       +

(1 row)

+1

Pavel

Re: Deparsing rewritten query

От
Ranier Vilela
Дата:
> Hi,
>
> I sometimes have to deal with queries referencing multiple and/or complex
> views. In such cases, it's quite troublesome to figure out what is the
> query
> really executed. Debug_print_rewritten isn't really useful for non trivial
> queries, and manually doing the view expansion isn't great either.
>
> While not being ideal, I wouldn't mind using a custom extension for that
> but
> this isn't an option as get_query_def() is private and isn't likely to
> change.
>
> As an alternative, maybe we could expose a simple SRF that would take care
> of
> rewriting the query and deparsing the resulting query tree(s)?
>
> I'm attaching a POC patch for that, adding a new pg_get_query_def(text)
> SRF.
+1

If you don't mind, I made small corrections to your patch.
1. strcmp(sql, "") could not be replaced by sql[0] == '\0'?
2. the error message would not be: "empty statement not allowed"?
3. initStringInfo(&buf) inside a loop, wouldn't it be exaggerated? each time call palloc0.

regards,
Ranier Vilela

Вложения

Re: Deparsing rewritten query

От
Julien Rouhaud
Дата:
On Sun, Jun 27, 2021 at 10:06:00AM -0300, Ranier Vilela wrote:
> 
> 1. strcmp(sql, "") could not be replaced by sql[0] == '\0'?

It's a debugging leftover that I forgot to remove.  There's no point trying
to catch an empty string as e.g. a single space would be exactly the same, and
both would be caught by the next (and correct) test.

> 3. initStringInfo(&buf) inside a loop, wouldn't it be exaggerated? each
> time call palloc0.

initStringInfo calls palloc, not palloc0.

It's unlikely to make any difference.  Rules have been strongly discouraged for
many years, and if you have enough to make a noticeable difference here, you
probably have bigger problems.  But no objection to reset the StringInfo
instead.



Re: Deparsing rewritten query

От
Tom Lane
Дата:
Julien Rouhaud <rjuju123@gmail.com> writes:
> As an alternative, maybe we could expose a simple SRF that would take care of
> rewriting the query and deparsing the resulting query tree(s)?

I'm not really excited by this, as it seems like it's exposing internal
decisions we could change someday; to wit, first that there is any such
thing as a separate rewriting pass, and second that its output is
interpretable as pure SQL.  (TBH, I'm not 100% sure that the second
assumption is true even today, although I know there are ancient comments
that claim that.)  It's not very hard to imagine someday moving view
expansion into the planner on efficiency grounds, leaving the rewriter
handling only the rare uses of INSERT/UPDATE/DELETE rules.

If there's functions in ruleutils.c that we'd need to make public to
let somebody write a debugging extension that does this kind of thing,
I'd be happier with that approach than with creating a core-server SQL
function for it.  There might be more than one use-case for the
exposed bits.

            regards, tom lane



Re: Deparsing rewritten query

От
Julien Rouhaud
Дата:
On Sun, Jun 27, 2021 at 10:34:52AM -0400, Tom Lane wrote:
> 
> I'm not really excited by this, as it seems like it's exposing internal
> decisions we could change someday; to wit, first that there is any such
> thing as a separate rewriting pass

Sure, but the fact that views will significantly impact the query being
executed from the one written is not an internal decision.  In my opinion
knowing what the final "real" query will be is still a valid concern, whether
we have a rewriting pass or not.

> and second that its output is
> interpretable as pure SQL.  (TBH, I'm not 100% sure that the second
> assumption is true even today, although I know there are ancient comments
> that claim that.)

I totally agree.  Note that there was at least one gotcha handled in this
patch: rewritten views didn't get an alias, which is mandatory for an SQL
query.

> It's not very hard to imagine someday moving view
> expansion into the planner on efficiency grounds, leaving the rewriter
> handling only the rare uses of INSERT/UPDATE/DELETE rules.

Agreed.  One the other hand having such a function in core may ensure that any
significant change in those area will keep an API to retrieve the final query
representation.

> If there's functions in ruleutils.c that we'd need to make public to
> let somebody write a debugging extension that does this kind of thing,
> I'd be happier with that approach than with creating a core-server SQL
> function for it.  There might be more than one use-case for the
> exposed bits.

It would mean exposing at least get_query_def().  I thought that exposing this
function was already suggested and refused, but I may be wrong.  Maybe other
people would like to have nearby functions exposed too.

Note that if we go this way, we would still need at least something like this
patch's chunk in rewriteHandler.c applied, as otherwise the vast majority of
rewritten and deparsed queries won't be valid.



Re: Deparsing rewritten query

От
Tom Lane
Дата:
Julien Rouhaud <rjuju123@gmail.com> writes:
> On Sun, Jun 27, 2021 at 10:34:52AM -0400, Tom Lane wrote:
>> It's not very hard to imagine someday moving view
>> expansion into the planner on efficiency grounds, leaving the rewriter
>> handling only the rare uses of INSERT/UPDATE/DELETE rules.

> Agreed.  One the other hand having such a function in core may ensure that any
> significant change in those area will keep an API to retrieve the final query
> representation.

My point is precisely that I'm unwilling to make such a promise.

I do not buy that this capability is worth very much, given that
we've gotten along fine without it for twenty-plus years.  If you
want to have it as an internal, might-change-at-any-time API,
that seems all right.  If you're trying to lock it down as something
that will be there forevermore, you're likely to end up with nothing.

            regards, tom lane



Re: Deparsing rewritten query

От
Julien Rouhaud
Дата:
On Sun, Jun 27, 2021 at 11:14:05AM -0400, Tom Lane wrote:
> Julien Rouhaud <rjuju123@gmail.com> writes:
> 
> > Agreed.  One the other hand having such a function in core may ensure that any
> > significant change in those area will keep an API to retrieve the final query
> > representation.
> 
> My point is precisely that I'm unwilling to make such a promise.
> 
> I do not buy that this capability is worth very much, given that
> we've gotten along fine without it for twenty-plus years.  If you
> want to have it as an internal, might-change-at-any-time API,
> that seems all right.

I'm totally fine with a might-change-at-any-time API, but not with a
might-disappear-at-anytime API.  If exposing get_query_def() can become
virtually useless in a few releases with no hope to get required in-core change
for retrieving the final query representation, I agree this we can stop the
discussion here.



Re: Deparsing rewritten query

От
Gilles Darold
Дата:
Le 27/06/2021 à 17:14, Tom Lane a écrit :
Julien Rouhaud <rjuju123@gmail.com> writes:
On Sun, Jun 27, 2021 at 10:34:52AM -0400, Tom Lane wrote:
It's not very hard to imagine someday moving view
expansion into the planner on efficiency grounds, leaving the rewriter
handling only the rare uses of INSERT/UPDATE/DELETE rules.
Agreed.  One the other hand having such a function in core may ensure that any
significant change in those area will keep an API to retrieve the final query
representation.
My point is precisely that I'm unwilling to make such a promise.

I do not buy that this capability is worth very much, given that
we've gotten along fine without it for twenty-plus years.  If you
want to have it as an internal, might-change-at-any-time API,
that seems all right.  If you're trying to lock it down as something
that will be there forevermore, you're likely to end up with nothing.
			regards, tom lane


I have to say that such feature would be very helpful for some DBA and especially migration work. The problem is when you have tons of views that call other views in the from or join clauses. These views also call other views, etc. I have had instances where there were up to 25 nested views calls. When you want to clean up this kind of code, using get_query_def () will help save a lot of manual rewrite time and headache to get the final code executed.


If we could at least call get_query_def() through an extension if we didn't have a function it would be ideal for DBAs. I agree this is unusual but when it does happen to you being able to call get_query_def () helps a lot.

-- 
Gilles Darold
http://www.darold.net/

Re: Deparsing rewritten query

От
Julien Rouhaud
Дата:
Thanks for the feedback Gilles!

On Mon, Jun 28, 2021 at 04:06:54PM +0200, Gilles Darold wrote:
> 
> If we could at least call get_query_def()through an extension if we didn't
> have a functionit would be ideal for DBAs.I agree this is unusual but when
> it does happen to you being able to call get_query_def () helps a lot.

Since at least 2 other persons seems to be interested in that feature, I can
take care of writing and maintaining such an extension, provided that the
required infrastructure is available in core.

PFA v2 of the patch which only adds the required alias and expose
get_query_def().

Вложения

Re: Deparsing rewritten query

От
Gilles Darold
Дата:
Le 28/06/2021 à 18:41, Julien Rouhaud a écrit :
> Thanks for the feedback Gilles!
>
> On Mon, Jun 28, 2021 at 04:06:54PM +0200, Gilles Darold wrote:
>> If we could at least call get_query_def()through an extension if we didn't
>> have a functionit would be ideal for DBAs.I agree this is unusual but when
>> it does happen to you being able to call get_query_def () helps a lot.
> Since at least 2 other persons seems to be interested in that feature, I can
> take care of writing and maintaining such an extension, provided that the
> required infrastructure is available in core.
>
> PFA v2 of the patch which only adds the required alias and expose
> get_query_def().


Thanks a lot Julien, such facilities are really helpful for DBAs and
make the difference with other RDBMS. I don't think that this feature
exists else where.


--
Gilles Darold
http://www.darold.net/





Re: Deparsing rewritten query

От
Pavel Stehule
Дата:
Hi

po 31. 1. 2022 v 15:37 odesílatel Gilles Darold <gilles@darold.net> napsal:
Le 28/06/2021 à 18:41, Julien Rouhaud a écrit :
> Thanks for the feedback Gilles!
>
> On Mon, Jun 28, 2021 at 04:06:54PM +0200, Gilles Darold wrote:
>> If we could at least call get_query_def()through an extension if we didn't
>> have a functionit would be ideal for DBAs.I agree this is unusual but when
>> it does happen to you being able to call get_query_def () helps a lot.
> Since at least 2 other persons seems to be interested in that feature, I can
> take care of writing and maintaining such an extension, provided that the
> required infrastructure is available in core.
>
> PFA v2 of the patch which only adds the required alias and expose
> get_query_def().

I checked the last patch.  I think it is almost trivial. I miss just comment, why this alias is necessary

+ if (!rte->alias)
+ rte->alias = makeAlias(get_rel_name(rte->relid), NULL);

Regards

Pavel




Thanks a lot Julien, such facilities are really helpful for DBAs and
make the difference with other RDBMS. I don't think that this feature
exists else where.


--
Gilles Darold
http://www.darold.net/






Re: Deparsing rewritten query

От
Julien Rouhaud
Дата:
Hi,

On Mon, Jan 31, 2022 at 06:46:37PM +0100, Pavel Stehule wrote:
> 
> I checked the last patch.  I think it is almost trivial. I miss just
> comment, why this alias is necessary
> 
> + if (!rte->alias)
> + rte->alias = makeAlias(get_rel_name(rte->relid), NULL);

Thanks for looking at it Pavel!

The alias is necessary because otherwise queries involving views won't produce
valid SQL, as aliases for subquery is mandatory.  This was part of the v1
regression tests:

+-- test pg_get_query_def()
+SELECT pg_get_query_def('SELECT * FROM shoe') as def;
+                          def
+--------------------------------------------------------
+  SELECT shoename,                                     +
+     sh_avail,                                         +
+     slcolor,                                          +
+     slminlen,                                         +
+     slminlen_cm,                                      +
+     slmaxlen,                                         +
+     slmaxlen_cm,                                      +
+     slunit                                            +
+    FROM ( SELECT sh.shoename,                         +
+             sh.sh_avail,                              +
+             sh.slcolor,                               +
+             sh.slminlen,                              +
+             (sh.slminlen * un.un_fact) AS slminlen_cm,+
+             sh.slmaxlen,                              +
+             (sh.slmaxlen * un.un_fact) AS slmaxlen_cm,+
+             sh.slunit                                 +
+            FROM shoe_data sh,                         +
+             unit un                                   +
+           WHERE (sh.slunit = un.un_name)) shoe;       +

the mandatory "shoe" alias is added with that change.

I looked for other similar problems and didn't find anything, but given the
complexity of the SQL standard it's quite possible that I missed some other
corner case.



Re: Deparsing rewritten query

От
Pavel Stehule
Дата:


po 31. 1. 2022 v 19:09 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
Hi,

On Mon, Jan 31, 2022 at 06:46:37PM +0100, Pavel Stehule wrote:
>
> I checked the last patch.  I think it is almost trivial. I miss just
> comment, why this alias is necessary
>
> + if (!rte->alias)
> + rte->alias = makeAlias(get_rel_name(rte->relid), NULL);

Thanks for looking at it Pavel!

The alias is necessary because otherwise queries involving views won't produce
valid SQL, as aliases for subquery is mandatory.  This was part of the v1
regression tests:

+-- test pg_get_query_def()
+SELECT pg_get_query_def('SELECT * FROM shoe') as def;
+                          def
+--------------------------------------------------------
+  SELECT shoename,                                     +
+     sh_avail,                                         +
+     slcolor,                                          +
+     slminlen,                                         +
+     slminlen_cm,                                      +
+     slmaxlen,                                         +
+     slmaxlen_cm,                                      +
+     slunit                                            +
+    FROM ( SELECT sh.shoename,                         +
+             sh.sh_avail,                              +
+             sh.slcolor,                               +
+             sh.slminlen,                              +
+             (sh.slminlen * un.un_fact) AS slminlen_cm,+
+             sh.slmaxlen,                              +
+             (sh.slmaxlen * un.un_fact) AS slmaxlen_cm,+
+             sh.slunit                                 +
+            FROM shoe_data sh,                         +
+             unit un                                   +
+           WHERE (sh.slunit = un.un_name)) shoe;       +

the mandatory "shoe" alias is added with that change.

I looked for other similar problems and didn't find anything, but given the
complexity of the SQL standard it's quite possible that I missed some other
corner case.

I don't feel good about forcing an alias. relname doesn't ensure uniqueness. You can have two views with the same name from different schemas. Moreover this field is necessary only when a deparsed query is printed, not always.

Isn't possible to compute the correct subquery alias in print time when it is missing?

Regards

Pavel
 

Re: Deparsing rewritten query

От
Julien Rouhaud
Дата:
Hi,

On Mon, Jan 31, 2022 at 10:05:44PM +0100, Pavel Stehule wrote:
> 
> I don't feel good about forcing an alias. relname doesn't ensure
> uniqueness. You can have two views with the same name from different
> schemas. Moreover this field is necessary only when a deparsed query is
> printed, not always.

Yes I agree.

> Isn't possible to compute the correct subquery alias in print time when it
> is missing?

Actually I think that the current code already does everything to generate
unique refnames, it's just that they don't get printed for a query after view
expansions.  I modified the patch to simply make sure that an alias is
displayed when it's a subquery and the output using a custom pg_get_query_def
is like that:

# select  pg_get_query_def('select * from nsp1.v1');
       pg_get_query_def
-------------------------------
  SELECT nb                   +
    FROM ( SELECT 1 AS nb) v1;+

(1 row)


# select  pg_get_query_def('select * from nsp1.v1, nsp2.v1');
       pg_get_query_def
-------------------------------
  SELECT v1.nb,               +
     v1_1.nb                  +
    FROM ( SELECT 1 AS nb) v1,+
     ( SELECT 1 AS nb) v1_1;  +

(1 row)

Вложения

Re: Deparsing rewritten query

От
Pavel Stehule
Дата:


út 1. 2. 2022 v 4:38 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
Hi,

On Mon, Jan 31, 2022 at 10:05:44PM +0100, Pavel Stehule wrote:
>
> I don't feel good about forcing an alias. relname doesn't ensure
> uniqueness. You can have two views with the same name from different
> schemas. Moreover this field is necessary only when a deparsed query is
> printed, not always.

Yes I agree.

> Isn't possible to compute the correct subquery alias in print time when it
> is missing?

Actually I think that the current code already does everything to generate
unique refnames, it's just that they don't get printed for a query after view
expansions.  I modified the patch to simply make sure that an alias is
displayed when it's a subquery and the output using a custom pg_get_query_def
is like that:

# select  pg_get_query_def('select * from nsp1.v1');
       pg_get_query_def
-------------------------------
  SELECT nb                   +
    FROM ( SELECT 1 AS nb) v1;+

(1 row)


# select  pg_get_query_def('select * from nsp1.v1, nsp2.v1');
       pg_get_query_def
-------------------------------
  SELECT v1.nb,               +
     v1_1.nb                  +
    FROM ( SELECT 1 AS nb) v1,+
     ( SELECT 1 AS nb) v1_1;  +

(1 row)

I tested your patch, and it looks so it is working without any problem. All tests passed.

There is just one question. If printalias = true will be active for all cases or just with some flag?

I didn't find any visible change of this modification without your function, so maybe it can be active for all cases without any condition.

Regards

Pavel



 

Re: Deparsing rewritten query

От
Bharath Rupireddy
Дата:
On Tue, Feb 1, 2022 at 9:08 AM Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> Hi,

Thanks. +1 for this work. Some comments on v3:

1) How about pg_get_rewritten_query()?
2) Docs missing?
3) How about allowing only the users who are explicitly granted to use
this function like pg_log_backend_memory_contexts,
pg_log_query_plan(in discussion), pg_log_backtrace(in discussion)?
4) initStringInfo in the for loop will palloc every time and will leak
the memory. you probably need to do resetStringInfo in the for loop
instead.
+ foreach(lc, querytree_list)
+ {
+ query = (Query *) lfirst(lc);
+ initStringInfo(&buf);
5) I would even suggest using a temp memory context for this function
alone, because it will ensure we dont' leak any memory which probably
parser, analyzer, rewriter would use.
6) Why can't query be for loop variable?
+ Query     *query;
7) Why can't the function check for empty query string and emit error
immedeiately (empty string isn't allowed or some other better error
message), rather than depending on the pg_parse_query?
+ parsetree_list = pg_parse_query(sql);
+
+ /* only support one statement at a time */
+ if (list_length(parsetree_list) != 1)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("a single statement should be provided")));
8) Show rewritten query given raw query string
+{ oid => '9246', descr => 'show a query as rewritten',
9) Doesn't the input need a ; at the end of query? not sure if the
parser assumes it as provided?
+SELECT pg_get_query_def('SELECT * FROM shoe') as def;
10) For pg_get_viewdef it makes sense to have the test case in
rules.sql, but shouldn't this function be in misc_functions.sql?
11) Missing bump cat version note in the commit message.
12) I'm just thinking adding an extra option to explain, which will
then print the rewritten query in the explain output, would be useful
than having a separate function to do this?
13) Somebody might also be interested to just get the completed
planned query i.e. output of pg_plan_query? or the other way, given
the query plan as input to a function, can we get the query back?
something like postgres_fdw/deparse.c does?

Regards,
Bharath Rupireddy.



Re: Deparsing rewritten query

От
Julien Rouhaud
Дата:
Hi,

On Wed, Feb 02, 2022 at 07:09:35PM +0530, Bharath Rupireddy wrote:
> On Tue, Feb 1, 2022 at 9:08 AM Julien Rouhaud <rjuju123@gmail.com> wrote:
> >
> > Hi,
> 
> Thanks. +1 for this work. Some comments on v3:
> 
> 1) How about pg_get_rewritten_query()?

Argh, I just realized that I sent the patch from the wrong branch.  Per
previous complaint from Tom, I'm not proposing that function anymore (I will
publish an extension for that if the patch gets commits) but only expose
get_query_def().

I'm attaching the correct patch this time, sorry about that.

Вложения

Re: Deparsing rewritten query

От
Julien Rouhaud
Дата:
Hi,

On Tue, Feb 01, 2022 at 08:35:00PM +0100, Pavel Stehule wrote:
> 
> I tested your patch, and it looks so it is working without any problem. All
> tests passed.
> 
> There is just one question. If printalias = true will be active for all
> cases or just with some flag?

Sorry, as I just replied to Bharath I sent the wrong patch.  The new patch has
the same modification with printalias = true though, so I can still answer that
question.  The change is active for all cases, however it's a no-op for any
in-core case, as a query sent by a client should be valid, and thus should have
an alias attached to all subqueries.  It's only different if you call
get_query_def() on the result of pg_analyze_and_rewrite(), since this code
doesn't add the subquery aliases as those aren't needed for the execution part.



Re: Deparsing rewritten query

От
Pavel Stehule
Дата:


st 2. 2. 2022 v 15:18 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
Hi,

On Tue, Feb 01, 2022 at 08:35:00PM +0100, Pavel Stehule wrote:
>
> I tested your patch, and it looks so it is working without any problem. All
> tests passed.
>
> There is just one question. If printalias = true will be active for all
> cases or just with some flag?

Sorry, as I just replied to Bharath I sent the wrong patch.  The new patch has
the same modification with printalias = true though, so I can still answer that
question.  The change is active for all cases, however it's a no-op for any
in-core case, as a query sent by a client should be valid, and thus should have
an alias attached to all subqueries.  It's only different if you call
get_query_def() on the result of pg_analyze_and_rewrite(), since this code
doesn't add the subquery aliases as those aren't needed for the execution part.

ok.

I checked this trivial patch, and I don't see any problem. Again I run check-world with success. The documentation for this feature is not necessary. But I am not sure about regress tests. Without any other code, enfosing printalias will be invisible. What do you think about the transformation of your extension to a new module in src/test/modules? Maybe it can be used for other checks in future.

Regards

Pavel

 

Re: Deparsing rewritten query

От
Julien Rouhaud
Дата:
Hi,

On Wed, Feb 02, 2022 at 07:49:41PM +0100, Pavel Stehule wrote:
>
> I checked this trivial patch, and I don't see any problem. Again I run
> check-world with success. The documentation for this feature is not
> necessary. But I am not sure about regress tests. Without any other code,
> enfosing printalias will be invisible. What do you think about the
> transformation of your extension to a new module in src/test/modules? Maybe
> it can be used for other checks in future.

I'm not opposed, but previously Tom explicitly said that he thinks this feature
is useless and is strongly opposed to making any kind of promise that the
current interface to make it possible (if get_query_def() is exposed) would be
maintained.  Adding such a test module would probably a reason to reject the
patch altogether.  I'm just hoping that this change, which is a no-op for
any legal query, is acceptable.  It can only break something if you feed wrong
data to get_query_def(), which would be my problem and not the project's
problem.



Re: Deparsing rewritten query

От
Pavel Stehule
Дата:


pá 4. 2. 2022 v 10:36 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
Hi,

On Wed, Feb 02, 2022 at 07:49:41PM +0100, Pavel Stehule wrote:
>
> I checked this trivial patch, and I don't see any problem. Again I run
> check-world with success. The documentation for this feature is not
> necessary. But I am not sure about regress tests. Without any other code,
> enfosing printalias will be invisible. What do you think about the
> transformation of your extension to a new module in src/test/modules? Maybe
> it can be used for other checks in future.

I'm not opposed, but previously Tom explicitly said that he thinks this feature
is useless and is strongly opposed to making any kind of promise that the
current interface to make it possible (if get_query_def() is exposed) would be
maintained.  Adding such a test module would probably a reason to reject the
patch altogether.  I'm just hoping that this change, which is a no-op for
any legal query, is acceptable.  It can only break something if you feed wrong
data to get_query_def(), which would be my problem and not the project's
problem.

ok, I don't have any problem with it. Then there is not necessarily any change, and I'll mark this patch as ready for committer.

Regards

Pavel
 

Re: Deparsing rewritten query

От
Julien Rouhaud
Дата:
On Fri, Feb 04, 2022 at 12:45:05PM +0100, Pavel Stehule wrote:
> 
> ok, I don't have any problem with it. Then there is not necessarily any
> change, and I'll mark this patch as ready for committer.

Thanks Pavel!

I also realized that the CF entry description wasn't accurate anymore, so I
also fixed that.



Re: Deparsing rewritten query

От
Tom Lane
Дата:
Julien Rouhaud <rjuju123@gmail.com> writes:
> I'm attaching the correct patch this time, sorry about that.

While I'm okay with this in principle, as it stands it fails
headerscheck/cpluspluscheck:

$ src/tools/pginclude/headerscheck
In file included from /tmp/headerscheck.Oi8jj3/test.c:2:
./src/include/utils/ruleutils.h:42:35: error: unknown type name 'StringInfo'; did you mean 'String'?
 void  get_query_def(Query *query, StringInfo buf, List *parentnamespace,
                                   ^~~~~~~~~~
                                   String
./src/include/utils/ruleutils.h:43:9: error: unknown type name 'TupleDesc'
         TupleDesc resultDesc,
         ^~~~~~~~~

We could of course add the necessary #include's to ruleutils.h,
but considering that we seem to have been at some pains to minimize
its #include footprint, I'm not really happy with that approach.
I'm inclined to think that maybe a wrapper function with a slightly
simplified interface would be a better way to go.  The deparsed string
could just be returned as a palloc'd "char *", unless you have some reason
to need it to be in a StringInfo.  I wonder which of the other parameters
really need to be exposed, too.  Several of them seem to be more internal
to ruleutils.c than something that outside callers would care to
manipulate.  For instance, since struct deparse_namespace isn't exposed,
there really isn't any way to pass anything except NIL for
parentnamespace.

The bigger picture here is that get_query_def's API has changed over time
internally to ruleutils.c, and I kind of expect that that might continue
in future, so having a wrapper function with a more stable API could be a
good thing.

            regards, tom lane



Add a pg_get_query_def function (was Re: Deparsing rewritten query)

От
Julien Rouhaud
Дата:
On Fri, Mar 25, 2022 at 05:49:04PM -0400, Tom Lane wrote:
> Julien Rouhaud <rjuju123@gmail.com> writes:
> > I'm attaching the correct patch this time, sorry about that.
> 
> While I'm okay with this in principle, as it stands it fails
> headerscheck/cpluspluscheck:
> 
> $ src/tools/pginclude/headerscheck 
> In file included from /tmp/headerscheck.Oi8jj3/test.c:2:
> ./src/include/utils/ruleutils.h:42:35: error: unknown type name 'StringInfo'; did you mean 'String'?
>  void  get_query_def(Query *query, StringInfo buf, List *parentnamespace,
>                                    ^~~~~~~~~~
>                                    String
> ./src/include/utils/ruleutils.h:43:9: error: unknown type name 'TupleDesc'
>          TupleDesc resultDesc,
>          ^~~~~~~~~

Ah thanks for the info.  I actually never tried headerscheck/cplupluscheck
before.

> We could of course add the necessary #include's to ruleutils.h,
> but considering that we seem to have been at some pains to minimize
> its #include footprint, I'm not really happy with that approach.
> I'm inclined to think that maybe a wrapper function with a slightly
> simplified interface would be a better way to go.  The deparsed string
> could just be returned as a palloc'd "char *", unless you have some reason
> to need it to be in a StringInfo.  I wonder which of the other parameters
> really need to be exposed, too.  Several of them seem to be more internal
> to ruleutils.c than something that outside callers would care to
> manipulate.  For instance, since struct deparse_namespace isn't exposed,
> there really isn't any way to pass anything except NIL for
> parentnamespace.
> 
> The bigger picture here is that get_query_def's API has changed over time
> internally to ruleutils.c, and I kind of expect that that might continue
> in future, so having a wrapper function with a more stable API could be a
> good thing.

Fair point.  That's a much better approach and goes well with the rest of the
exposed functions in that file.  I went with a pg_get_querydef, getting rid of
the StringInfo and the List and using the same "bool pretty" flag as used
elsewhere.  While doing so, I saw that there were a lot of copy/pasted code for
the pretty flags, so I added a GET_PRETTY_FLAGS(pretty) macro to avoid adding
yet another occurrence.  I also kept the wrapColument and startIdent as they
can be easily used by callers.

Вложения

Re: Add a pg_get_query_def function (was Re: Deparsing rewritten query)

От
Tom Lane
Дата:
Julien Rouhaud <rjuju123@gmail.com> writes:
> [ v4-0001-Add-a-pg_get_query_def-wrapper-around-get_query_d.patch ]

This seems about ready to go to me, except for

(1) as an exported API, pg_get_querydef needs a full API spec in its
header comment.  "Read the code to figure out what to do" is not OK
in my book.

(2) I don't think this has been thought out too well:

> I also kept the wrapColument and startIdent as they
> can be easily used by callers.

The appropriate values for these are determined by macros that are
local in ruleutils.c, so it's not that "easy" for outside callers
to conform to standard practice.  I suppose we could move
WRAP_COLUMN_DEFAULT etc into ruleutils.h, but is there actually a
use-case for messing with those?  I don't see any other exported
functions that go beyond offering a "bool pretty" option, so
I think it might be a mistake for this one to be different.
(The pattern that I see is that a ruleutils function could have
"bool pretty", or it could have "int prettyFlags, int startIndent"
which is an expansion of that; but mixing those levels of detail
doesn't seem very wise.)

            regards, tom lane



Re: Add a pg_get_query_def function (was Re: Deparsing rewritten query)

От
Julien Rouhaud
Дата:
On Sun, Mar 27, 2022 at 11:53:58AM -0400, Tom Lane wrote:
> Julien Rouhaud <rjuju123@gmail.com> writes:
> > [ v4-0001-Add-a-pg_get_query_def-wrapper-around-get_query_d.patch ]
>
> This seems about ready to go to me, except for
>
> (1) as an exported API, pg_get_querydef needs a full API spec in its
> header comment.  "Read the code to figure out what to do" is not OK
> in my book.

Fixed.

> (2) I don't think this has been thought out too well:
>
> > I also kept the wrapColument and startIdent as they
> > can be easily used by callers.
>
> The appropriate values for these are determined by macros that are
> local in ruleutils.c, so it's not that "easy" for outside callers
> to conform to standard practice.  I suppose we could move
> WRAP_COLUMN_DEFAULT etc into ruleutils.h, but is there actually a
> use-case for messing with those?

As far as I can see the wrapColumn and startIndent are independant of the
pretty flags, and don't really have magic numbers for those
(WRAP_COLUMN_DEFAULT sure exists, but the value isn't really surprising).

> I don't see any other exported
> functions that go beyond offering a "bool pretty" option, so
> I think it might be a mistake for this one to be different.

There's the SQL function pg_get_viewdef_wrap() that accept a custom wrapColumn.

That being said I'm totally ok with just exposing a "pretty" parameter and use
WRAP_COLUMN_DEFAULT.  In any case I agree that exposing startIndent doesn't
serve any purpose.

I'm attaching a v5 with hopefully a better comment for the function, and only
the "pretty" parameter.

Вложения

Re: Add a pg_get_query_def function (was Re: Deparsing rewritten query)

От
Tom Lane
Дата:
Julien Rouhaud <rjuju123@gmail.com> writes:
> I'm attaching a v5 with hopefully a better comment for the function, and only
> the "pretty" parameter.

Pushed with some minor cosmetic adjustments.

            regards, tom lane



Re: Add a pg_get_query_def function (was Re: Deparsing rewritten query)

От
Julien Rouhaud
Дата:
On Mon, Mar 28, 2022 at 11:20:42AM -0400, Tom Lane wrote:
> Julien Rouhaud <rjuju123@gmail.com> writes:
> > I'm attaching a v5 with hopefully a better comment for the function, and only
> > the "pretty" parameter.
> 
> Pushed with some minor cosmetic adjustments.

Thanks a lot!

I just published an extension using this for the use case I'm interested in.