Обсуждение: [PATCH] parser: optionally warn about missing AS for column and table aliases

Поиск
Список
Период
Сортировка

[PATCH] parser: optionally warn about missing AS for column and table aliases

От
Oskari Saarenmaa
Дата:
I wrote the attached patch to optionally emit warnings when column or table
aliases are used without the AS keyword after errors caused by typos in
statements turning unintended things into aliases came up twice this week.
First in a discussion with a colleague who was surprised by a 1 row result
for the query 'SELECT COUNT(*) files' and again in the "pl/pgsql 2" thread
as plpgsql currently doesn't throw an error if there are more result columns
than output columns (SELECT a b INTO f1, f2).

The patch is still missing documentation and it needs another patch to
modify all the statements in psql & co to use AS so you can use things like
\d and tab-completion without triggering the warnings.  I can implement
those changes if others think this patch makes sense.

/ Oskari

Вложения

Re: [PATCH] parser: optionally warn about missing AS for column and table aliases

От
Marko Tiikkaja
Дата:
On 2014-09-05 22:38, Oskari Saarenmaa wrote:
> I wrote the attached patch to optionally emit warnings when column or table
> aliases are used without the AS keyword after errors caused by typos in
> statements turning unintended things into aliases came up twice this week.
> First in a discussion with a colleague who was surprised by a 1 row result
> for the query 'SELECT COUNT(*) files' and again in the "pl/pgsql 2" thread
> as plpgsql currently doesn't throw an error if there are more result columns
> than output columns (SELECT a b INTO f1, f2).
>
> The patch is still missing documentation and it needs another patch to
> modify all the statements in psql & co to use AS so you can use things like
> \d and tab-completion without triggering the warnings.  I can implement
> those changes if others think this patch makes sense.

I think this is only problematic for column aliases.  I wouldn't want to 
put these two to be put into the same category, as I always omit the AS 
keyword for tables aliases (and will continue to do so), but never omit 
it for column aliases.


.marko



Re: [PATCH] parser: optionally warn about missing AS for column and table aliases

От
Oskari Saarenmaa
Дата:
05.09.2014 23:45, Marko Tiikkaja kirjoitti:
> On 2014-09-05 22:38, Oskari Saarenmaa wrote:
>> I wrote the attached patch to optionally emit warnings when column or
>> table
>> aliases are used without the AS keyword after errors caused by typos in
>> statements turning unintended things into aliases came up twice this
>> week.
>> First in a discussion with a colleague who was surprised by a 1 row
>> result
>> for the query 'SELECT COUNT(*) files' and again in the "pl/pgsql 2"
>> thread
>> as plpgsql currently doesn't throw an error if there are more result
>> columns
>> than output columns (SELECT a b INTO f1, f2).
>>
>> The patch is still missing documentation and it needs another patch to
>> modify all the statements in psql & co to use AS so you can use things
>> like
>> \d and tab-completion without triggering the warnings.  I can implement
>> those changes if others think this patch makes sense.
>
> I think this is only problematic for column aliases.  I wouldn't want to
> put these two to be put into the same category, as I always omit the AS
> keyword for tables aliases (and will continue to do so), but never omit
> it for column aliases.

I prefer using AS for both, but I can see the case for requiring AS in 
table aliases being a lot weaker.  Not emitting warnings for table 
aliases would also reduce the changes required in psql & co as they seem 
to be using aliases mostly (only?) for tables.

What'd be a good name for the GUC controlling this, 
missing_column_as_warning?

/ Oskari




Re: [PATCH] parser: optionally warn about missing AS for column and table aliases

От
Marko Tiikkaja
Дата:
On 2014-09-05 22:56, Oskari Saarenmaa wrote:
> What'd be a good name for the GUC controlling this,
> missing_column_as_warning?

I don't know about others, but I've previously had the idea of having 
more warnings like this (my go-to example is  DELETE FROM foo WHERE bar 
IN (SELECT bar FROM barlesstable);  where "barlesstable" doesn't have a 
column "bar").  Perhaps it would be better to have a guc with a list of 
warnings, similarly to what we did for plpgsql.extra_warnings?

Now that I start thinking about it, more ideas for such warnings start 
springing up..


.marko



Re: [PATCH] parser: optionally warn about missing AS for column and table aliases

От
Pavel Stehule
Дата:



2014-09-05 23:06 GMT+02:00 Marko Tiikkaja <marko@joh.to>:
On 2014-09-05 22:56, Oskari Saarenmaa wrote:
What'd be a good name for the GUC controlling this,
missing_column_as_warning?

I don't know about others, but I've previously had the idea of having more warnings like this (my go-to example is  DELETE FROM foo WHERE bar IN (SELECT bar FROM barlesstable);  where "barlesstable" doesn't have a column "bar").  Perhaps it would be better to have a guc with a list of warnings, similarly to what we did for plpgsql.extra_warnings?

Now that I start thinking about it, more ideas for such warnings start springing up..

I had same idea - maybe some general mode "prune human errors" mode

Pavel
 


.marko



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [PATCH] parser: optionally warn about missing AS for column and table aliases

От
David G Johnston
Дата:
Marko Tiikkaja-4 wrote
> On 2014-09-05 22:38, Oskari Saarenmaa wrote:
>> I wrote the attached patch to optionally emit warnings when column or
>> table
>> aliases are used without the AS keyword after errors caused by typos in
>> statements turning unintended things into aliases came up twice this
>> week.
>> First in a discussion with a colleague who was surprised by a 1 row
>> result
>> for the query 'SELECT COUNT(*) files' and again in the "pl/pgsql 2"
>> thread
>> as plpgsql currently doesn't throw an error if there are more result
>> columns
>> than output columns (SELECT a b INTO f1, f2).
>>
>> The patch is still missing documentation and it needs another patch to
>> modify all the statements in psql & co to use AS so you can use things
>> like
>> \d and tab-completion without triggering the warnings.  I can implement
>> those changes if others think this patch makes sense.
> 
> I think this is only problematic for column aliases.  I wouldn't want to 
> put these two to be put into the same category, as I always omit the AS 
> keyword for tables aliases (and will continue to do so), but never omit 
> it for column aliases.

I agree on being able to pick the behavior independently for select-list
items and the FROM clause items.

Using this to mitigate the pl/pgsql column mis-match issue seems like too
broad of a solution.

I probably couldn't mount a convincing defense of my opinion but at first
blush I'd say we should pass on this.  Not with prejudice - looking at the
issue periodically has merit - but right now it seems to be mostly adding
clutter to the code without significant benefit.

Tangential - I'd rather see something like "EXPLAIN (STATIC)" that would
allow a user to quickly invoke a built-in static SQL analyzer on their query
and have it point this kind of thing out.  Adding a catalog for STATIC
configurations in much the same way we have text search configurations would
allow extensions and users to define their own rulesets that could be
attached to their ROLE "GUC: default_static_analyzer_ruleset" and also
passed in as a modifier in the EXPLAIN invocation.

Stuff like correlated subqueries without alias use could be part of that (to
avoid typos that result in constant true) and also duplicate column names
floating to the top-most select-list, or failure to use an alias on a
function call result.  There are probably others though I'm stretching to
even find these...

Anyway, the idea of using a GUC and evaluating every query that is written
(the added if statements), is not that appealing even if the impact of one
more check is likely to be insignificant (is it?)

David J.




--
View this message in context:
http://postgresql.1045698.n5.nabble.com/PATCH-parser-optionally-warn-about-missing-AS-for-column-and-table-aliases-tp5817971p5817980.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.



Re: [PATCH] parser: optionally warn about missing AS for column and table aliases

От
David G Johnston
Дата:
David G Johnston wrote
> 
> Marko Tiikkaja-4 wrote
>> On 2014-09-05 22:38, Oskari Saarenmaa wrote:
>>> I wrote the attached patch to optionally emit warnings when column or
>>> table
>>> aliases are used without the AS keyword after errors caused by typos in
>>> statements turning unintended things into aliases came up twice this
>>> week.
>>> First in a discussion with a colleague who was surprised by a 1 row
>>> result
>>> for the query 'SELECT COUNT(*) files' and again in the "pl/pgsql 2"
>>> thread
>>> as plpgsql currently doesn't throw an error if there are more result
>>> columns
>>> than output columns (SELECT a b INTO f1, f2).
>>>
>>> The patch is still missing documentation and it needs another patch to
>>> modify all the statements in psql & co to use AS so you can use things
>>> like
>>> \d and tab-completion without triggering the warnings.  I can implement
>>> those changes if others think this patch makes sense.
>> 
>> I think this is only problematic for column aliases.  I wouldn't want to 
>> put these two to be put into the same category, as I always omit the AS 
>> keyword for tables aliases (and will continue to do so), but never omit 
>> it for column aliases.
> I agree on being able to pick the behavior independently for select-list
> items and the FROM clause items.
> 
> Using this to mitigate the pl/pgsql column mis-match issue seems like too
> broad of a solution.
> 
> I probably couldn't mount a convincing defense of my opinion but at first
> blush I'd say we should pass on this.  Not with prejudice - looking at the
> issue periodically has merit - but right now it seems to be mostly adding
> clutter to the code without significant benefit.
> 
> Tangential - I'd rather see something like "EXPLAIN (STATIC)" that would
> allow a user to quickly invoke a built-in static SQL analyzer on their
> query and have it point this kind of thing out.  Adding a catalog for
> STATIC configurations in much the same way we have text search
> configurations would allow extensions and users to define their own
> rulesets that could be attached to their ROLE "GUC:
> default_static_analyzer_ruleset" and also passed in as a modifier in the
> EXPLAIN invocation.
> 
> Stuff like correlated subqueries without alias use could be part of that
> (to avoid typos that result in constant true) and also duplicate column
> names floating to the top-most select-list, or failure to use an alias on
> a function call result.  There are probably others though I'm stretching
> to even find these...
> 
> Anyway, the idea of using a GUC and evaluating every query that is written
> (the added if statements), is not that appealing even if the impact of one
> more check is likely to be insignificant (is it?)

To protect users on every query they write there would need to be some kind
of "always explain first and only execute if no warnings are thrown"
mode...and ideally some way to then override that on a per-query basis if
you know you are correct and don't want to fix the SQL...

If the static check fails the query itself would error and the detail would
contain the status result of the static analysis; otherwise the query should
return as normal.

This at least avoids having to introduce 10 different GUC just to
accommodate this feature and neatly bundles them into named packages.  

A warn-and-execute mode would be possible as well.  I would make it
incompatible with "autocommit" mode so that a user doing this would have a
chance to evaluate the warnings and rollback even if the statement ran to
completion successfully.

David J.



--
View this message in context:
http://postgresql.1045698.n5.nabble.com/PATCH-parser-optionally-warn-about-missing-AS-for-column-and-table-aliases-tp5817971p5817982.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.



Re: Re: [PATCH] parser: optionally warn about missing AS for column and table aliases

От
Bruce Momjian
Дата:
On Fri, Sep  5, 2014 at 02:33:00PM -0700, David G Johnston wrote:
> This at least avoids having to introduce 10 different GUC just to
> accommodate this feature and neatly bundles them into named packages.  
> 
> A warn-and-execute mode would be possible as well.  I would make it
> incompatible with "autocommit" mode so that a user doing this would have a
> chance to evaluate the warnings and rollback even if the statement ran to
> completion successfully.

Our TODO list had the idea of a "novice" mode that would warn about such
things.


--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + Everyone has their own god. +



Re: Re: [PATCH] parser: optionally warn about missing AS for column and table aliases

От
Marko Tiikkaja
Дата:
On 2014-09-05 11:19 PM, David G Johnston wrote:
> Marko Tiikkaja-4 wrote
>> I think this is only problematic for column aliases.  I wouldn't want to
>> put these two to be put into the same category, as I always omit the AS
>> keyword for tables aliases (and will continue to do so), but never omit
>> it for column aliases.
>
> I agree on being able to pick the behavior independently for select-list
> items and the FROM clause items.
>
> Using this to mitigate the pl/pgsql column mis-match issue seems like too
> broad of a solution.

The PL/PgSQL column can be solved via other methods; see for example my 
suggestion of "PL/PgSQL warning - num_into_expressions" in the current 
commit fest (and the non-backwards-compatible solution of throwing a 
run-time error I suggested).  But some people run SQL queries from their 
apps, and I've seen people make this mistake and get confused.

> I probably couldn't mount a convincing defense of my opinion but at first
> blush I'd say we should pass on this.  Not with prejudice - looking at the
> issue periodically has merit - but right now it seems to be mostly adding
> clutter to the code without significant benefit.

Are you talking about this particular option, or the notion of parser 
warnings in general?  Because if we're going to add a handful of 
warnings, I don't see why this couldn't be on that list.

> Tangential - I'd rather see something like "EXPLAIN (STATIC)" that would
> allow a user to quickly invoke a built-in static SQL analyzer on their query
> and have it point this kind of thing out.  Adding a catalog for STATIC
> configurations in much the same way we have text search configurations would
> allow extensions and users to define their own rulesets that could be
> attached to their ROLE "GUC: default_static_analyzer_ruleset" and also
> passed in as a modifier in the EXPLAIN invocation.

If you knew there's a good chance you might make a mistake, you could 
probably avoid doing it in the first place.  I make mistakes all the 
time, would I then always execute two commands when I want to do 
something?  Having to run a special "check my query please" command 
seems silly.

> Anyway, the idea of using a GUC and evaluating every query that is written
> (the added if statements), is not that appealing even if the impact of one
> more check is likely to be insignificant (is it?)

I'm not sure how you would do this differently in the EXPLAIN (STATIC) 
case.  Those ifs have to be somewhere, and that's always a non-zero cost.

That being said, yes, performance costs of course have to be evaluated 
carefully.


.marko



Re: Re: [PATCH] parser: optionally warn about missing AS for column and table aliases

От
David Johnston
Дата:
On Fri, Sep 5, 2014 at 6:27 PM, Marko Tiikkaja <marko@joh.to> wrote:
On 2014-09-05 11:19 PM, David G Johnston wrote:
Marko Tiikkaja-4 wrote

I probably couldn't mount a convincing defense of my opinion but at first
blush I'd say we should pass on this.  Not with prejudice - looking at the
issue periodically has merit - but right now it seems to be mostly adding
clutter to the code without significant benefit.

Are you talking about this particular option, or the notion of parser warnings in general?  Because if we're going to add a handful of warnings, I don't see why this couldn't be on that list.

​This option implemented in this specific manner.​


Tangential - I'd rather see something like "EXPLAIN (STATIC)" that would
allow a user to quickly invoke a built-in static SQL analyzer on their query
and have it point this kind of thing out.  Adding a catalog for STATIC
configurations in much the same way we have text search configurations would
allow extensions and users to define their own rulesets that could be
attached to their ROLE "GUC: default_static_analyzer_ruleset" and also
passed in as a modifier in the EXPLAIN invocation.

If you knew there's a good chance you might make a mistake, you could probably avoid doing it in the first place.  I make mistakes all the time, would I then always execute two commands when I want to do something?  Having to run a special "check my query please" command seems silly.

​My follow-on post posited a solution that still uses the EXPLAIN mechanism but configured in a way so users can have it be always on if desired.​


Anyway, the idea of using a GUC and evaluating every query that is written
(the added if statements), is not that appealing even if the impact of one
more check is likely to be insignificant (is it?)

I'm not sure how you would do this differently in the EXPLAIN (STATIC) case.  Those ifs have to be somewhere, and that's always a non-zero cost.

That being said, yes, performance costs of course have to be evaluated carefully.

​If you add lots more checks that is lots more ifs compared to a single if to see if static analysis should be attempted in the first place.  For a single option its largely a wash.  Beyond that I don't know enough to say whether having the parser dispatch the query differently based upon it being preceded with "EXPLAIN (STATIC)" or a standard query would be any faster than a single IF for a single check.​

​The more options you can think of for a "novice" mode the more appealing a formal static analysis interface becomes - both for execution and also because the number of options being introduced and managed can be made into formal configurations instead of an independent but related set of GUC variables.

​David J.
 

Re: Re: [PATCH] parser: optionally warn about missing AS for column and table aliases

От
Marko Tiikkaja
Дата:
On 2014-09-05 11:33 PM, David G Johnston wrote:
> To protect users on every query they write there would need to be some kind
> of "always explain first and only execute if no warnings are thrown"
> mode...and ideally some way to then override that on a per-query basis if
> you know you are correct and don't want to fix the SQL...
>
> If the static check fails the query itself would error and the detail would
> contain the status result of the static analysis; otherwise the query should
> return as normal.

This feels even sillier.  Instinctively, if you can contain the 
functionality into the EXPLAIN path only, I don't see why you couldn't 
do it in a single  if (..)  for every query.  I doubt you could ever 
measure that difference.

> This at least avoids having to introduce 10 different GUC just to
> accommodate this feature and neatly bundles them into named packages.

I disagree.  Even if there was such a "static analysis" mode, I think 
there would have to be some way of filtering some of them out.

But "10 difference GUCs" can still be avoided; see 
plpgsql.extra_warnings, for example.


.marko



Re: Re: [PATCH] parser: optionally warn about missing AS for column and table aliases

От
Pavel Stehule
Дата:



2014-09-06 0:53 GMT+02:00 Marko Tiikkaja <marko@joh.to>:
On 2014-09-05 11:33 PM, David G Johnston wrote:
To protect users on every query they write there would need to be some kind
of "always explain first and only execute if no warnings are thrown"
mode...and ideally some way to then override that on a per-query basis if
you know you are correct and don't want to fix the SQL...

If the static check fails the query itself would error and the detail would
contain the status result of the static analysis; otherwise the query should
return as normal.

This feels even sillier.  Instinctively, if you can contain the functionality into the EXPLAIN path only, I don't see why you couldn't do it in a single  if (..)  for every query.  I doubt you could ever measure that difference.

This at least avoids having to introduce 10 different GUC just to
accommodate this feature and neatly bundles them into named packages.

I disagree.  Even if there was such a "static analysis" mode, I think there would have to be some way of filtering some of them out.

But "10 difference GUCs" can still be avoided; see plpgsql.extra_warnings, for example.

You used a good name for these mode in other thread "defensive". We can support some  "defensive" mode.  Personally I am sure, so if somebody would to use it, it would to use it as complex. Too less granularity increase a complexity of Postgres configuration and we don't would it.

Novice mode is not correct name and can has degrading character.

In this mode people maybe got more very specific warnings (DBA doesn't like it, because logs are massive), developers should to use explicit casting much more (developers doesn't like explicit casting in SQL). But when we use a good name, then they can accept it and use it. It is paradox, so defensive mode is mainly for professionals with experience - absolute  beginner probably will hate this mode due too restrictivity
 
Regards

Pavel



.marko



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers