A plan to improve error messages with context, hint and details.

From: Fabien COELHO
Subject: A plan to improve error messages with context, hint and details.
Msg-id: Pine.LNX.4.58.0403041742590.28778@sablons.cri.ensmp.fr
Replies: Re: A plan to improve error messages with context, hint  (Dennis Bjorklund <db@zigo.dhs.org>)
         Re: A plan to improve error messages with context, hint and details.  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers

Dear Hackers,


Motivation
----------

As a basic user of postgresql, I've been quite disappointed by the lack of
help provided by postgresql error messages dealing with syntax and
semantic errors, especially in long sql statements:
      ERROR: syntax error at or near "(" at character 326

This makes students angry at the software, the computer or even
the teacher (say, me;-), for the hard time they have dealing with syntactic
issues. It makes them turn to mysql;-)

I think it is an important issue, as the interface is the first contact
between the user and the software. The internals and features may be
great, but it does not hurt to help users deal more easily with the
beast.

There are several points on which great improvements can be obtained
without much disruption in the current code structure.

Note that this plan is for "ERROR", where the processing is somehow
stopped, and the user must choose a course of action in order to solve the
problem. However, other report levels (such as NOTICE or WARNING) may
also benefit from it.

Also, things may not be as simple as described, but the purpose of the
mail is to describe the ultimate goal, and to outline the path that I
think should lead to it.

So here is my suggested plan:


(1) Lexical/syntax error source localisation
--------------------------------------------

An extract of the offending source must be shown, if possible, alongside
syntax error messages.

This can be achieved very simply and at low cost, since all the
information is already there, as are most of the needed fields in ErrorData.

However it may be necessary to handle multi-line details or an additional
sub-detail (I would choose that) in ErrorData in order to show a cursor:
ERROR: Syntax error at or near "(" at character 14
DETAIL:  CREATE TABLE (id SERIAL ...
DETAIL:               ^

The only actual issue seems to be multi-byte encodings in the buffer, but
I noticed that some support functions are already available.
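
As an illustration only, the cursor display could be sketched in C as
follows. The function name format_error_cursor and the EXTRACT_MAX limit are
hypothetical and not part of postgresql; the sketch assumes a single-byte
encoding, so the multi-byte issue mentioned above is deliberately left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define EXTRACT_MAX 40          /* hypothetical: max query bytes quoted */

/*
 * Hypothetical helper: writes two DETAIL lines into buf, the first quoting
 * a window of the query, the second putting a '^' under the 1-based
 * character position cursorpos. Assumes one byte per character.
 * Returns the snprintf result, or -1 if cursorpos is outside the query.
 */
int
format_error_cursor(char *buf, size_t buflen, const char *query, int cursorpos)
{
    size_t      qlen, pos, start, len, i;
    char        extract[EXTRACT_MAX + 1];

    assert(buf != NULL && query != NULL);
    qlen = strlen(query);
    if (cursorpos < 1 || (size_t) cursorpos > qlen)
        return -1;

    pos = (size_t) cursorpos - 1;   /* 0-based offset of the error */

    /* Start the window early enough to keep the error position visible. */
    start = (pos > EXTRACT_MAX / 2) ? pos - EXTRACT_MAX / 2 : 0;
    len = qlen - start;
    if (len > EXTRACT_MAX)
        len = EXTRACT_MAX;

    /* Flatten newlines and tabs so that the caret stays aligned. */
    for (i = 0; i < len; i++)
    {
        char        c = query[start + i];

        extract[i] = (c == '\n' || c == '\r' || c == '\t') ? ' ' : c;
    }
    extract[len] = '\0';

    /* "%*s" with an empty string pads the caret line with spaces. */
    return snprintf(buf, buflen, "DETAIL:  %s%s\nDETAIL:  %*s^\n",
                    extract, (start + len < qlen) ? " ..." : "",
                    (int) (pos - start), "");
}
```

Calling it with the query text and the character position already reported
in the error message yields the two DETAIL lines; a real implementation
would have to convert the character position to a byte offset first.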


(2) Hints about syntax errors
-----------------------------

All generated error messages, especially from the parser, should be
accompanied by a HINT to help the user, if possible. Something like:
HINT: table name expected

This requires more work as all syntax error sources need to be caught and
a relevant HINT must be provided. There is a little bit of an issue here
as yyerror's call to ereport is rather simplistic.

I would suggest having a "current_hint" (scalar or maybe stack)
maintained by the parser, which yyerror would use to fill the hint
field. The yacc code may look something like:

<code>
CreateUserStmt:
    Create USER { hint("user id"); }
    Userid { hint("user options or WITH"); }
    opt_with { hint("user options"); }
    OptUserList { ... };

Create: CREATE { hint("USER|DATABASE|SCHEMA|..."); };
</code>

The changes are pretty systematic and simple, and they do not modify the
actual grammar of the parser. However they should affect a lot of lines in
"gram.y", if not all.
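
On the C side, the scalar variant could be sketched as below. This is only
an assumption about how it might look: report_syntax_error is a
hypothetical stand-in for yyerror that formats into a buffer instead of
calling ereport, and hint() is the helper used in the grammar actions above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical: what the parser expects next, or NULL if unknown. */
static const char *current_hint = NULL;

/* Called from grammar actions to record what is expected next. */
void
hint(const char *expected)
{
    current_hint = expected;
}

/*
 * Hypothetical stand-in for yyerror: formats the error message and, when a
 * hint is recorded, a HINT line. Returns the number of characters that the
 * complete report needs.
 */
int
report_syntax_error(char *buf, size_t buflen, const char *msg,
                    const char *token, int charpos)
{
    int         n;

    assert(buf != NULL && buflen > 0);
    n = snprintf(buf, buflen, "ERROR: %s at or near \"%s\" at character %d\n",
                 msg, token, charpos);
    if (n < 0 || (size_t) n >= buflen)
        return n;               /* no room left for a hint */

    if (current_hint != NULL)
        n += snprintf(buf + n, buflen - (size_t) n,
                      "HINT: %s expected\n", current_hint);
    return n;
}
```

With hint("table name") recorded before the failure, the report ends with
"HINT: table name expected"; with no hint recorded, only the ERROR line is
produced, so grammar rules can be annotated one at a time.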


(3) About semantic errors, which may be detected later on ...
-------------------------------------------------------------

... in the processing of the command. The problem is different because it
occurs in functions that can be called from quite different contexts, and
the context is not really known to the function. Thus when the error
occurs in the function, it cannot provide a useful context. As none is
provided, the user must guess...

For this I have used in the past the following trick: a stack describes
the processing context and is updated by functions with pushes and pops.
If an error occurs, the stack provides the context information needed,
something like:
CONTEXT: parsing user query

Or
CONTEXT: in create table "foo", in constraint "bla", checking reference types ...

This is basically a user-oriented view of the call stack to help with
error messages. It's really incremental, as if nothing is done the context
will be vague, but if some key functions care to update it precisely, the
context reported will be much more helpful.

It should typically be maintained in the error logging part, and used on
errors to build a context if none is provided, or to be provided as a
separate content next to "message, detail, hint, context". Also care must
be taken to reset the stack on errors. Note that the current "context"
management in elog allows multiple pieces of context information to be
provided, but I haven't seen any "pop" facility, which functions would need
in order to change the current context simply: the strings are just
appended to one another.
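
A minimal sketch of such a stack in C might look like the following. All
names here (context_push, context_pop, context_reset, context_format,
CONTEXT_MAX_DEPTH) are hypothetical; this is not the existing elog context
mechanism, and the fixed depth is an arbitrary assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define CONTEXT_MAX_DEPTH 16    /* hypothetical fixed limit */

/* User-oriented view of the call stack, outermost entry first. */
static const char *context_stack[CONTEXT_MAX_DEPTH];
static int  context_depth = 0;

/* Enter a processing step; returns -1 if the stack is full. */
int
context_push(const char *what)
{
    if (context_depth >= CONTEXT_MAX_DEPTH)
        return -1;
    context_stack[context_depth++] = what;
    return 0;
}

/* Leave the innermost processing step. */
void
context_pop(void)
{
    if (context_depth > 0)
        context_depth--;
}

/* To be called once an error has been reported. */
void
context_reset(void)
{
    context_depth = 0;
}

/* Builds "CONTEXT: a, b, c" in buf, or an empty string if no context. */
int
context_format(char *buf, size_t buflen)
{
    size_t      used;
    int         i;

    assert(buf != NULL && buflen > 0);
    buf[0] = '\0';
    if (context_depth == 0)
        return 0;

    used = (size_t) snprintf(buf, buflen, "CONTEXT: ");
    for (i = 0; i < context_depth && used < buflen; i++)
        used += (size_t) snprintf(buf + used, buflen - used, "%s%s",
                                  (i > 0) ? ", " : "", context_stack[i]);
    return (int) used;
}
```

A function would push a description on entry and pop it on normal return;
the error path would call context_format to fill the context and then
context_reset. Functions that never push simply leave the context vaguer,
which is what makes the approach incremental.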


Two questions
-------------

My research background is in code optimisation within compilers, but now I
mostly teach computer science stuff in engineering schools. I would be
interested in giving some of my time on these issues, but:

(1) Do postgresql "Masters" think this issue is worth pursuing, or will
    any patch be rejected because it is considered intrinsically useless?

    "Our users do not need hint or context information, only hard-core
    engineers use postgresql, sissy guys will rather use mysql"  ;-)

    Indeed, I don't mind having a patch rejected because of my poor
    programming, or because the result is not fine enough, as I can
    improve it and re-submit later.

    However, if the issue is considered useless, it means that I will
    lose my time anyway, so I would prefer not to give it a try ;-)
 

(2) Does anyone have any comments about these problems or the way I
    intend to address them?
    Are they currently being addressed by someone else? It doesn't look
    so from the TODO list.
    If the plan makes sense, it may be added to the TODO list, and I wish
    to claim it or part of it.

Have a nice day,

-- 
Fabien Coelho - coelho@cri.ensmp.fr

