Обсуждение: FILTER/WITHIN GROUP vs. expressions; is a HINT possible here?

Поиск
Список
Период
Сортировка

FILTER/WITHIN GROUP vs. expressions; is a HINT possible here?

От
Josh Berkus
Дата:
Folks:
   SELECT       device_id,       count(*)::INT as present,       count(*)::INT FILTER (WHERE valid) as valid_count,
 mode()::INT WITHIN GROUP (order by val) as mode,       percentile_disc(0.5)::INT WITHIN GROUP (order by val)
asmedian   FROM dataflow_0913   GROUP BY device_id   ORDER BY device_id;
 
   ERROR:  syntax error at or near "FILTER"   LINE 4:         count(*)::INT FILTER (WHERE valid)           as
valid_count,


The error is right, that's invalid syntax.  I can't insert a ::INT
between the aggregate() and FILTER.  However, the error message is also
rather confusing to the user; they're likely to look for their mistake
in the wrong place.  The same goes for WITHIN GROUP (and OVER, too, I
think).

Is there some kind of possible HINT we could add to make this easier to
debug?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: FILTER/WITHIN GROUP vs. expressions; is a HINT possible here?

От
"David G. Johnston"
Дата:
On Thu, Apr 16, 2015 at 12:22 PM, Josh Berkus <josh@agliodbs.com> wrote:
Folks:

    SELECT
        device_id,
        count(*)::INT as present,
        count(*)::INT FILTER (WHERE valid) as valid_count,
        mode()::INT WITHIN GROUP (order by val) as mode,
        percentile_disc(0.5)::INT WITHIN GROUP (order by val)
          as median
    FROM dataflow_0913
    GROUP BY device_id
    ORDER BY device_id;

    ERROR:  syntax error at or near "FILTER"
    LINE 4:         count(*)::INT FILTER (WHERE valid)
            as valid_count,


The error is right, that's invalid syntax.  I can't insert a ::INT
between the aggregate() and FILTER.  However, the error message is also
rather confusing to the user; they're likely to look for their mistake
in the wrong place.  The same goes for WITHIN GROUP (and OVER, too, I
think).


​​SELECT count(*)::int OVER ()
FROM ( VALUES (1),(2),(3) ) src;
 
Is there some kind of possible HINT we could add to make this easier to
debug?

​Do you have a suggested hint so that the effort to make it work can be compared to its usefulness?


​For kicks I ran the following - since "val::type" is simply another syntax for "type(val)"...

SELECT ceil(count(*)) OVER ()
FROM ( VALUES (1),(2),(3) ) src

SQL Error: ERROR:  OVER specified, but ceil is not a window function nor an aggregate function
LINE 1: SELECT ceil(count(*)) OVER ()

The non-hint has been around as long as window functions and hasn't really come up as an issue - not enough so to motivate a change at least.  Is the idiomatic usage of FILTER and WITHIN GROUP making this error more likely?

The foot-gun in your blog post is more problematic but also seemingly impossible to avoid except through education of the user.  It would not be unreasonable to accept that the current error is acting like a canary and forcing the user to go read the documentation on OVER/FILTER/WITHIN GROUP and learn to write the expression as a single unit.  

If this is not covered adequately enough in the documentation then that should be remedied.  Did you evaluate the documentation in that light while preparing your blog post?

David J.

Re: FILTER/WITHIN GROUP vs. expressions; is a HINT possible here?

От
Josh Berkus
Дата:
On 04/16/2015 01:01 PM, David G. Johnston wrote:
> If this is not covered adequately enough in the documentation then that
> should be remedied.  Did you evaluate the documentation in that light
> while preparing your blog post?

Your response seems very defensive.  I'm unclear on why discussing
improving an error message should get such a hostile response; it
certainly doesn't invite a serious reply.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: FILTER/WITHIN GROUP vs. expressions; is a HINT possible here?

От
"David G. Johnston"
Дата:
On Thu, Apr 16, 2015 at 1:06 PM, Josh Berkus <josh@agliodbs.com> wrote:
On 04/16/2015 01:01 PM, David G. Johnston wrote:
> If this is not covered adequately enough in the documentation then that
> should be remedied.  Did you evaluate the documentation in that light
> while preparing your blog post?

Your response seems very defensive.  I'm unclear on why discussing
improving an error message should get such a hostile response; it
certainly doesn't invite a serious reply.


Mostly because ​I have no clue how difficult it would be​...

​I'm all for improvement but obviously if it was that simple you would have just proposed the desired wording and asked someone to commit it.  I was trying to understand (and communicate my understanding) why it wasn't done better in the first place.

Beyond that there is still the need to teach the user the correct syntax of these functions - and not only point out when they get it wrong so they blindly fix their typo - in order to better avoid the situation where the syntax is valid but the outcome is unexpected.  Two separate problems of which the later seems more important than the former - and probably somewhat easier to implement.

David J.


Re: FILTER/WITHIN GROUP vs. expressions; is a HINT possible here?

От
Josh Berkus
Дата:
On 04/16/2015 01:18 PM, David G. Johnston wrote:
> On Thu, Apr 16, 2015 at 1:06 PM, Josh Berkus <josh@agliodbs.com
> <mailto:josh@agliodbs.com>>wrote:
>
>     On 04/16/2015 01:01 PM, David G. Johnston wrote:
>     > If this is not covered adequately enough in the documentation then that
>     > should be remedied.  Did you evaluate the documentation in that light
>     > while preparing your blog post?
>
>     Your response seems very defensive.  I'm unclear on why discussing
>     improving an error message should get such a hostile response; it
>     certainly doesn't invite a serious reply.
>
>
> Mostly because ​I have no clue how difficult it would be​...
>
> ​I'm all for improvement but obviously if it was that simple you would
> have just proposed the desired wording and asked someone to commit it.
> I was trying to understand (and communicate my understanding) why it
> wasn't done better in the first place.

Right.  First, I don't know that it's possible for the parser to figure
out that error as opposed to actual mistakes in the filter clause, or
just straight gibberish.  Second, I'm not sure how to write a succinct
HINT which gets the point across.

However, the error we're giving for windowing suggests that we can
improve things.  In the example you quote:

SQL Error: ERROR:  OVER specified, but ceil is not a window function nor
an aggregate function
LINE 1: SELECT ceil(count(*)) OVER ()

... that's actually a lot clearer error.   However, we still get the
same error with casting:

josh=# select count(*)::INT OVER (order by device_id, val) from
dataflow_0913 ;
ERROR:  syntax error at or near "OVER"
LINE 1: select count(*)::INT OVER (order by device_id, val) from dat...

... so maybe we can't.

> Beyond that there is still the need to teach the user the correct syntax
> of these functions - and not only point out when they get it wrong so
> they blindly fix their typo - in order to better avoid the situation
> where the syntax is valid but the outcome is unexpected.  Two separate
> problems of which the later seems more important than the former - and
> probably somewhat easier to implement.

Absolutely.  Hence my blog post; I'm hoping that people will at least
get a google hit on the error text.

The problem is, in the main docs, how do we reasonably document
*combining* features?  In Syntax?  My head hurts just thinking about it.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: FILTER/WITHIN GROUP vs. expressions; is a HINT possible here?

От
Tom Lane
Дата:
Josh Berkus <josh@agliodbs.com> writes:
>     ERROR:  syntax error at or near "FILTER"
>     LINE 4:         count(*)::INT FILTER (WHERE valid) as valid_count,

> The error is right, that's invalid syntax.  I can't insert a ::INT
> between the aggregate() and FILTER.  However, the error message is also
> rather confusing to the user; they're likely to look for their mistake
> in the wrong place.  The same goes for WITHIN GROUP (and OVER, too, I
> think).

> Is there some kind of possible HINT we could add to make this easier to
> debug?

Probably not; or at least, IMO it would be a fundamental misjudgment to
go in with the idea of improving this one single case.  It'd be cool if
we could improve reporting of syntax errors in general, though.

Unfortunately Bison doesn't provide a whole heckuva lot of support for
that.  The only simple thing it offers is %error-verbose, which I've
experimented with in the past and come away with the impression that
it'd be completely useless for most people :-(.  It seems to basically
add information about which tokens would be valid next at the point of
the error report.  That isn't any help for the sort of user conceptual
error you're talking about here.

You could imagine teaching yyerror() to have some SQL-specific knowledge
that it could apply by comparing the current lookahead token to the
current parse state stack ... but neither of those things are exposed
to it by Bison.  We could probably get the lookahead token indirectly
by inspecting the lexer's state, but it's not clear that's enough to
do much.  In this particular example, that would mean showing the exact
same hint whenever a syntax error is reported at an OVER token; and
I'm afraid that it'd be mighty hard to write a hint for that that
wouldn't frequently do more harm than good.  (Note in particular that
since OVER, FILTER, and WITHIN are unreserved keywords, they might be
used as simple column/table/function names.)
        regards, tom lane



Re: FILTER/WITHIN GROUP vs. expressions; is a HINT possible here?

От
Josh Berkus
Дата:
On 04/16/2015 04:36 PM, Tom Lane wrote:
> You could imagine teaching yyerror() to have some SQL-specific knowledge
> that it could apply by comparing the current lookahead token to the
> current parse state stack ... but neither of those things are exposed
> to it by Bison.  We could probably get the lookahead token indirectly
> by inspecting the lexer's state, but it's not clear that's enough to
> do much.  In this particular example, that would mean showing the exact
> same hint whenever a syntax error is reported at an OVER token; and

Yeah, and that's what we *don't* want to do.  That's why I was wondering
if it was even possible to figure out the "extra syntax" case.

> I'm afraid that it'd be mighty hard to write a hint for that that
> wouldn't frequently do more harm than good.  (Note in particular that
> since OVER, FILTER, and WITHIN are unreserved keywords, they might be
> used as simple column/table/function names.)

No kidding.  Heck, I just spent 1/2 hour figuring out a bug which was
actually caused by the fact that a user created a view called "mode".

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com