Thread: Upgrading the backend's error-message infrastructure


Upgrading the backend's error-message infrastructure

From: Tom Lane
Date:
(Or, protocol upgrade phase 1...)

After digging through our many past discussions of what to do with error
messages, I have put together the following first-cut proposal.  Fire at
will...


Objective
---------

The basic objective here is to divide error reports into multiple
fields, and in particular to include an "error code" field that gives
applications a stable value to test against when they're trying to find
out what went wrong.  (I am not spending much space in this proposal on
the question of exactly what set of error codes we ought to have, but
that comes soon.)  Peter Eisentraut argued cogently awhile back that the
error codes ought not be hard-wired to specific error message texts,
so this proposal treats them as separate entities.


Wire-protocol changes
---------------------

Error and Notice (maybe also Notify?) msgs will have this structure:

    E
    x string \0
    x string \0
    x string \0
    \0

where the x's are single-character field identifiers.  A frontend should
simply ignore any unrecognized fields.  Initially defined fields for Error
and Notice are:

S    Severity --- the string is "ERROR", "FATAL", or "PANIC" (if E msg)
     or "WARNING", "NOTICE", "DEBUG", "INFO", or "LOG" (if N msg).
     (Should this string be localizable?  Probably, assuming that the
     E/N distinction is all the client library really cares about.)

C    Code --- SQLSTATE code for error (a 5-character string per SQL
     spec).  Not localizable.
M    Message --- the string is the primary error message (localized).
D    Detail --- secondary error message, carrying more detail about
     the problem (localized).
H    Hint --- a suggestion what to do about the error (localized).
P    Position --- the string is a decimal ASCII integer, indicating
     an error cursor position as an index into the original query
     string.  First character is index 1.  Q: measure index in
     bytes, or characters?  Latter seems preferable considering that
     an encoding conversion may have occurred.

F    File --- file name of source-code location where error was
     reported (__FILE__)
L    Line # --- line number of source-code location (__LINE__)
R    Routine --- source code routine name reporting error (__func__ or
     __FUNCTION__)

S,C,M fields will always appear (at least in Error messages; perhaps
Notices might omit C?).  The rest are optional.
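
A frontend-side sketch of how such a message body might be split into
fields, assuming the layout described above (identifier/string pairs, each
string ended by a zero byte, then one final zero byte).  The names ErrField,
parse_error_fields and find_field are invented for this illustration; they
are not part of the proposal or of any client library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One parsed field: a single-character identifier plus its string value. */
typedef struct {
    char        id;     /* e.g. 'S', 'C', 'M' */
    const char *value;  /* points into the caller's buffer */
} ErrField;

/*
 * Split the body of an Error/Notice message (everything after the leading
 * 'E' or 'N' type byte) into fields.  Returns the number of fields stored,
 * or -1 if the body is malformed (a string or the list is unterminated).
 */
int
parse_error_fields(const char *body, size_t len, ErrField *out, int max_fields)
{
    size_t pos = 0;
    int    n = 0;

    assert(body != NULL && out != NULL);
    while (pos < len) {
        char        id = body[pos];
        const char *end;

        if (id == '\0')
            return n;               /* lone zero byte: end of field list */
        pos++;
        end = memchr(body + pos, '\0', len - pos);
        if (end == NULL)
            return -1;              /* string value is not terminated */
        if (n < max_fields) {
            out[n].id = id;
            out[n].value = body + pos;
            n++;
        }
        pos = (size_t) (end - body) + 1;
    }
    return -1;                      /* final terminator is missing */
}

/* Return the value of the first field with the given identifier, or NULL. */
const char *
find_field(const ErrField *fields, int n, char id)
{
    for (int i = 0; i < n; i++)
        if (fields[i].id == id)
            return fields[i].value;
    return NULL;
}
```

Because callers only ever ask for identifiers they know about, fields with
unrecognized identifiers are ignored without any special handling.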

Why three textual message fields?  'M' should always appear, 'D' and 'H'
are optional (and relatively rare).  The convention is that the primary
'M' message should be accurate but terse (normally one line); if more info
is needed than can reasonably fit on a line, use the detail message to
carry additional lines.  A "hint" is something that doesn't directly
describe the error, but is a suggestion what to do to get around it.
'M' and 'D' should be factual, whereas 'H' may contain some guesswork, or
advice that might not always apply.  Client interfaces are expected to
report 'M', but might suppress 'D' and/or 'H' depending on factors such as
screen space.  (Preferably they should have a verbose mode that shows all
available info, though.)
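
As a rough illustration of the client-side convention just described, the
sketch below always prints 'M' and prints 'D' and 'H' only in a verbose
mode.  The ClientError struct, the function name, and the "DETAIL:" and
"HINT:" prefixes are assumptions made for this example, not something the
proposal specifies.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical client-side view of the textual fields of one report. */
typedef struct {
    const char *severity;   /* 'S' */
    const char *message;    /* 'M', always present */
    const char *detail;     /* 'D', NULL if absent */
    const char *hint;       /* 'H', NULL if absent */
} ClientError;

/*
 * Format a report into buf.  Terse mode shows only the primary message;
 * verbose mode appends detail and hint when they are present.
 * Returns the number of characters written (as snprintf does).
 */
int
format_client_error(const ClientError *err, int verbose,
                    char *buf, size_t buflen)
{
    int n;

    assert(err != NULL && err->severity != NULL && err->message != NULL);
    n = snprintf(buf, buflen, "%s:  %s\n", err->severity, err->message);
    if (verbose && err->detail != NULL && n >= 0 && (size_t) n < buflen)
        n += snprintf(buf + n, buflen - n, "DETAIL:  %s\n", err->detail);
    if (verbose && err->hint != NULL && n >= 0 && (size_t) n < buflen)
        n += snprintf(buf + n, buflen - n, "HINT:  %s\n", err->hint);
    return n;
}
```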


Error codes
-----------

The SQL spec defines a set of 5-character status codes (called SQLSTATE
values).  We'll use these as the language-independent identifiers for
error conditions.  There is code space reserved by the spec for
implementation-defined error conditions, which we'll surely need.

Per spec, each of the five characters in a SQLSTATE code must be a digit
'0'-'9' or an upper-case Latin letter 'A'-'Z'.  So it's possible to fit a
SQLSTATE code into a 32-bit integer with some simple encoding conventions.
I propose that we use such a representation in the backend; that is,
instead of passing around strings like "1200D" we pass around integers
formed like ((('1' - '0') << 6) + '2' - '0') << 6 ...  This should save
a useful amount of space per elog call site, and it won't obscure the code
noticeably since all the common values will be represented as macro names
anyway, something like

#define ERRCODE_DIVISION_BY_ZERO   MAKE_SQLSTATE('2','2', '0','1','2')

We need to do some legwork to figure out what set of
implementation-defined error codes we want.  It might make sense to look
and see what other DBMSes are using.
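
One way the integer encoding could look, as a sketch only: each character
is reduced to a 6-bit value by subtracting '0', and the five values are
packed with the first character in the highest bits, following the formula
quoted above.  The helper names SIXBIT, UNSIXBIT and unpack_sqlstate are
assumptions for this example, and the code assumes an ASCII character set.

```c
#include <assert.h>
#include <string.h>

/* Map one SQLSTATE character ('0'-'9' or 'A'-'Z') to a 6-bit value. */
#define SIXBIT(ch)     (((ch) - '0') & 0x3F)
/* Inverse mapping: recover the character from its 6-bit value. */
#define UNSIXBIT(val)  ((char) (((val) & 0x3F) + '0'))

/* Pack five characters into 30 bits, first character in the high bits. */
#define MAKE_SQLSTATE(c1, c2, c3, c4, c5) \
    ((((((((SIXBIT(c1) << 6) + SIXBIT(c2)) << 6) + SIXBIT(c3)) << 6) \
       + SIXBIT(c4)) << 6) + SIXBIT(c5))

#define ERRCODE_DIVISION_BY_ZERO   MAKE_SQLSTATE('2','2', '0','1','2')

/* Convert a packed code back to its 5-character string form. */
void
unpack_sqlstate(int code, char buf[6])
{
    assert(code >= 0);
    for (int i = 4; i >= 0; i--) {
        buf[i] = UNSIXBIT(code);    /* lowest 6 bits hold the last char */
        code >>= 6;
    }
    buf[5] = '\0';
}
```

The largest per-character value is 'Z' - '0' = 42, which fits in 6 bits, so
five characters need 30 bits and the result stays a non-negative 32-bit int.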


Backend source-code representation for extended error messages
--------------------------------------------------------------

How do we generalize the elog() interface to cope with all this stuff?
I don't think I want a function with a fixed parameter list --- some sort
of open-ended API would be a lot more forward-looking.  After some fooling
around I've come up with the following proposal.

A typical elog() call might be replaced by

    ereport(ERROR, ERRCODE_INTERNAL,
            errmsg("Big trouble with table %s", name),
            errhint("Bail out now, boss"));

ERROR is the severity level, same as before, and ERRCODE_xxx is (a macro
for) the appropriate SQLSTATE code.  The rest is a variable-length list of
optional items, each expressed as a subsidiary function call.  This
representation preserves the single-function-call appearance of elog()
calls, which is convenient for coding purposes, but it gives us something
akin to labeled optional parameters instead of C's usual fixed parameter
list.

How does this work, exactly?  Well, errmsg() and errhint() are indeed
functions, but ereport is actually a macro:

#define ereport    errstart(__FILE__, __LINE__, __FUNCTION__), errfinish

(__FUNCTION__ is only used if we are compiling in gcc).  errstart() pushes
an empty entry onto an error-data-collection stack and fills in the
behind-the-scenes file/line entries.  errmsg() and friends stash values
into the top-level stack entry.  Finally errfinish() assembles and emits
the completed message, then pops the stack.  By using a stack, we can be
assured that things will work correctly if a message is logged by some
subroutine called in the parameters to ereport (not too unlikely when you
think about formatting functions like format_type_be()).

Behind the scenes we have

extern void errstart(const char *filename, int lineno, const char *funcname);
extern void errfinish(int elevel, int sqlerrorcode, ...);
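
A toy model of the mechanism, to show why the stack matters.  It is a
sketch under several assumptions: the ErrorData layout, the fixed-size
buffers, the numeric LOG/ERROR values, the last_report buffer (used here in
place of actually sending anything), and the describe_type() helper are all
invented for this example.  __func__ is used directly, which assumes a C99
compiler.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

enum { LOG = 15, ERROR = 20 };      /* toy severity values only */

#define ERRSTACK_SIZE 8

typedef struct {
    const char *filename;
    int         lineno;
    const char *funcname;
    char        message[128];
    char        hint[128];
} ErrorData;

ErrorData errstack[ERRSTACK_SIZE];
int       errdepth = 0;             /* number of stack entries in use */
char      last_report[400];         /* most recently emitted report */

void
errstart(const char *filename, int lineno, const char *funcname)
{
    ErrorData *e;

    assert(errdepth < ERRSTACK_SIZE);
    e = &errstack[errdepth++];      /* push an empty entry */
    e->filename = filename;
    e->lineno = lineno;
    e->funcname = funcname;
    e->message[0] = '\0';
    e->hint[0] = '\0';
}

int
errmsg(const char *fmt, ...)
{
    va_list ap;

    assert(errdepth > 0);
    va_start(ap, fmt);
    vsnprintf(errstack[errdepth - 1].message,
              sizeof errstack[0].message, fmt, ap);
    va_end(ap);
    return 0;                       /* dummy; only the side effect matters */
}

int
errhint(const char *fmt, ...)
{
    va_list ap;

    assert(errdepth > 0);
    va_start(ap, fmt);
    vsnprintf(errstack[errdepth - 1].hint,
              sizeof errstack[0].hint, fmt, ap);
    va_end(ap);
    return 0;
}

void
errfinish(int elevel, int sqlerrorcode, ...)
{
    ErrorData *e;

    assert(errdepth > 0);
    e = &errstack[errdepth - 1];
    if (strlen(e->hint) > 0)
        snprintf(last_report, sizeof last_report, "%d/%d %s (HINT: %s)",
                 elevel, sqlerrorcode, e->message, e->hint);
    else
        snprintf(last_report, sizeof last_report, "%d/%d %s",
                 elevel, sqlerrorcode, e->message);
    errdepth--;                     /* pop the finished entry */
}

/* The comma operator runs errstart() before errfinish's arguments. */
#define ereport  errstart(__FILE__, __LINE__, __func__), errfinish

/* A formatting helper that itself logs while being evaluated as an argument. */
const char *
describe_type(int type_oid)
{
    ereport(LOG, 0, errmsg("looking up type %d", type_oid));
    return "integer";
}

void
demo(void)
{
    ereport(ERROR, 42,
            errmsg("bad value for type %s", describe_type(23)),
            errhint("check the input"));
}
```

In demo(), the nested report made by describe_type() is pushed, emitted and
popped while the outer entry is still pending, so the outer errmsg() and
errhint() still write into the correct entry.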

The individual routines for adding optional items to the error report are:

extern int errmsg(const char *fmt, ...);

Primary error message, possibly with parameters interpolated per the
existing elog conventions (sprintf-like format string).  The first
parameter is gettext-ified.  Primary messages should be one line if at
all possible (make it complete but succinct).

extern int errdetail(const char *fmt, ...);

Adds an optional secondary error message, for use when not all the
description of an error condition can be fit into a reasonably terse
primary error message.  Functionality essentially the same as errmsg().
errdetail output can run to multiple lines, but bear in mind that some
client APIs may not show it.

extern int errhint(const char *fmt, ...);

Adds a "hint"; behavior otherwise similar to errdetail().

An example is that the existing

    elog(ERROR, "Adding columns with defaults is not implemented."
         "\n\tAdd the column, then use ALTER TABLE SET DEFAULT.");

becomes

    ereport(ERROR, ERRCODE_something,
            errmsg("Adding columns with defaults is not implemented"),
            errhint("Add the column, then use ALTER TABLE SET DEFAULT"));

Notice that we got rid of a hard-wired decision about presentation layout.

extern int errmsg_internal(const char *fmt, ...);

Like errmsg() except that the first parameter is not subject to
gettext-ification.  My thought is that this would be used for internal
can't-happen conditions; there's no need to make translators labor over
translating stuff like "eval_const_expressions: unexpected boolop %d",
nor even to make them think about whether they need to.  The only part
of such a message that needs internationalization is the hint "Please
report this problem to pgsql-bugs", which should be added automatically
by errmsg_internal().  The ERRCODE should almost always be "internal
error" if this is used.

extern int errfunction(const char *funcname);

Provides the name of the function reporting the error.  In gcc-compiled
backends, the function name will be provided automatically by errstart,
but there will be some places where we need the name to be available even
in a non-gcc build.  My thought is that

    elog(WARNING, "PerformPortalFetch: portal \"%s\" not found",
         stmt->portalname);

becomes

    ereport(WARNING, ERRCODE_INVALID_CURSOR_NAME,
            errmsg("portal \"%s\" not found", stmt->portalname),
            errfunction("PerformPortalFetch"));

This gets us out of the habit of including function name in the primary
error message, while still leaving enough info that we can construct
a backwards-compatible error report for old clients.  (I'm thinking that
if errfunction() is present, the function name and a colon would be
prepended to the primary error message, but only if sending to an
old-protocol client.)

extern int errposition(int cursorpos);

Provides error position info (an offset into the original query text).
For the moment this is probably only going to happen for scanner and
grammar errors.


NOTE: a variant scheme would treat the SQLSTATE code as an optional
parameter too, ie you'd write

    ereport(ERROR, errcode(ERRCODE_xxx), ...

This would just be excess verbiage if most or all ereport calls specify
error codes --- but for the errmsg_internal case we could leave out
errcode(), expecting it to default to "internal error".  Any thoughts on
which way is better?


Backwards compatibility
-----------------------

When talking to an old-protocol client, the ereport code will assemble the
appropriate elements of the available data to produce an approximately
backward-compatible message, that is, ye olde

    ERROR:  routine: primary message

(where routine: appears only if errfunction() was called).
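
A minimal sketch of that assembly step, assuming the severity, optional
routine name and primary message have already been collected.  The function
name build_legacy_message and its signature are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build the old-style one-line message "SEVERITY:  routine: primary".
 * 'routine' is NULL unless errfunction() was called for this report,
 * in which case the "routine: " part is left out.
 */
int
build_legacy_message(char *buf, size_t buflen, const char *severity,
                     const char *routine, const char *primary)
{
    assert(buf != NULL && severity != NULL && primary != NULL);
    if (routine != NULL)
        return snprintf(buf, buflen, "%s:  %s: %s",
                        severity, routine, primary);
    return snprintf(buf, buflen, "%s:  %s", severity, primary);
}
```

With severity "WARNING", routine "PerformPortalFetch" and a primary message
of `portal "foo" not found`, this reproduces the text an old client would
have seen from the elog() call shown earlier.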

elog() will remain available for at least a couple of releases, so as not
to force immediate updates of user-written extension functions.  It will
default to some implementation-defined SQLSTATE value for "unspecified
error".  We can change "elog" into a macro similar to "ereport" so that we
can get file/line number info.  (This means we're only giving source-code
not object-code compatibility with extension functions, but that's
generally the case anyway during PG major version updates.)


Comments?
        regards, tom lane


Re: Upgrading the backend's error-message infrastructure

From: Larry Rosenman
Date:

--On Thursday, March 13, 2003 15:51:00 -0500 Tom Lane <tgl@sss.pgh.pa.us> 
wrote:


> (__FUNCTION__ is only used if we are compiling in gcc).  errstart() pushes
> an empty entry onto an error-data-collection stack and fills in the
> behind-the-scenes file/line entries.  errmsg() and friends stash values
> into the top-level stack entry.  Finally errfinish() assembles and emits
> the completed message, then pops the stack.  By using a stack, we can be
> assured that things will work correctly if a message is logged by some
> subroutine called in the parameters to ereport (not too unlikely when you
> think about formatting functions like format_type_be()).
>
__FUNCTION__ or an equivalent is MANDATED by C99, and available on 
UnixWare's native cc.

You might want to make a configure test for it.

I believe __func__ is the C99 spelling (that's what's available on 
UnixWare):

$ cc -O -o testfunc testfunc.c
$ ./testfunc
function=main,file=testfunc.c,line=4
$ cat testfunc.c
#include <stdio.h>
int main(int argc,char **argv)
{ printf("function=%s,file=%s,line=%d\n",__func__,__FILE__,__LINE__);
}
$

-- 
Larry Rosenman                     http://www.lerctr.org/~ler
Phone: +1 972-414-9812                 E-Mail: ler@lerctr.org
US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749





Re: Upgrading the backend's error-message infrastructure

From: Tom Lane
Date:
Larry Rosenman <ler@lerctr.org> writes:
> __FUNCTION__ or an equivalent is MANDATED by C99, and available on 
> UnixWare's native cc.
> You might want to make a configure test for it.

Right, __func__ is the C99 spelling.  I did have a configure test in
mind here: __func__ or __FUNCTION__ or NULL is what would get compiled
in.  One nice thing about this approach is that we need change only one
place to adjust the set of behind-the-scenes error parameters.
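
A sketch of what the single place might look like.  HAVE_FUNCNAME__FUNC,
HAVE_FUNCNAME__FUNCTION and PG_FUNCNAME_MACRO are assumed names for the
symbols a configure test would define; they are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Pick whichever spelling the compiler supports, as determined by a
 * (hypothetical) configure test; fall back to NULL if neither exists.
 * To experiment by hand, compile with -DHAVE_FUNCNAME__FUNC.
 */
#if defined(HAVE_FUNCNAME__FUNC)
#define PG_FUNCNAME_MACRO  __func__
#elif defined(HAVE_FUNCNAME__FUNCTION)
#define PG_FUNCNAME_MACRO  __FUNCTION__
#else
#define PG_FUNCNAME_MACRO  NULL
#endif

/* Returns the caller-visible function name, or NULL when unavailable. */
const char *
current_function_name(void)
{
    const char *name = PG_FUNCNAME_MACRO;

    /* Either no name is available, or it is a non-empty string. */
    assert(name == NULL || name[0] != '\0');
    return name;
}
```

The ereport macro would then pass PG_FUNCNAME_MACRO to errstart(), so the
rest of the code never needs to know which spelling was chosen.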
        regards, tom lane


Re: Upgrading the backend's error-message infrastructure

From: Larry Rosenman
Date:

--On Thursday, March 13, 2003 16:20:21 -0500 Tom Lane <tgl@sss.pgh.pa.us> 
wrote:

> Larry Rosenman <ler@lerctr.org> writes:
>> __FUNCTION__ or an equivalent is MANDATED by C99, and available on
>> UnixWare's native cc.
>> You might want to make a configure test for it.
>
> Right, __func__ is the C99 spelling.  I did have a configure test in
> mind here: __func__ or __FUNCTION__ or NULL is what would get compiled
> in.  One nice thing about this approach is that we need change only one
> place to adjust the set of behind-the-scenes error parameters.
>
Ok, you had said GCC only.  Please do use the configure test, and __func__ 
if it's available.

Thanks,
LER







Re: Upgrading the backend's error-message infrastructure

From: Neil Conway
Date:
On Thu, 2003-03-13 at 15:51, Tom Lane wrote:
> After digging through our many past discussions of what to do with error
> messages, I have put together the following first-cut proposal.

Great work, Tom!

While we're effectively changing every elog call site in the backend,
would it also be a good idea to adopt a standard for the format of error
messages? (e.g. capitalization, grammar, etc.)

> extern int errmsg_internal(const char *fmt, ...);
> 
> Like errmsg() except that the first parameter is not subject to
> gettext-ification.  My thought is that this would be used for internal
> can't-happen conditions; there's no need to make translators labor over
> translating stuff like "eval_const_expressions: unexpected boolop %d",
> nor even to make them think about whether they need to.

If we wanted to get fancy, we could make use of the glibc ability to
generate a back trace programmatically:

http://www.gnu.org/manual/glibc-2.2.5/html_node/Backtraces.html#Backtraces

> In gcc-compiled
> backends, the function name will be provided automatically by errstart,
> but there will be some places where we need the name to be available even
> in a non-gcc build.

To be honest, I'd be sceptical whether there are enough platforms
without *either* gcc or a C99 compiler that it's worthwhile worrying
about them that much (all that is at stake is some backward
compatibility, anyway).

Cheers,

Neil

-- 
Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC





Re: Upgrading the backend's error-message infrastructure

From: Tom Lane
Date:
Neil Conway <neilc@samurai.com> writes:
> While we're effectively changing every elog call site in the backend,
> would it also be a good idea to adopt a standard for the format of error
> messages? (e.g. capitalization, grammar, etc.)

Yup.  I was planning to bring that up as a separate thread.  I think
Peter has already put some thought into it, but I couldn't find anything
in the archives...

> If we wanted to get fancy, we could make use of the glibc ability to
> generate a back trace programmatically:

Hmm ... maybe.  Certainly we all too often ask people to get this info
by hand ... too bad it only works in glibc though.

>> In gcc-compiled
>> backends, the function name will be provided automatically by errstart,
>> but there will be some places where we need the name to be available even
>> in a non-gcc build.

> To be honest, I'd be sceptical whether there are enough platforms
> without *either* gcc or a C99 compiler that it's worthwhile worrying
> about them that much (all that is at stake is some backward
> compatibility, anyway).

I'm only planning to bother with the errfunction hack for messages that
I know are being specifically tested for by existing frontends.  ecpg
looks for "PerformPortalFetch" messages, for example.  If we don't keep
that name in the (old version of the) error message then we have a
compatibility problem.  But I do want to move away from having function
names in the primary error message text.
        regards, tom lane


Re: [INTERFACES] Upgrading the backend's error-message infrastructure

From: Tom Lane
Date:
Jean-Luc Lachance <jllachan@nsd.ca> writes:
> Why trade 5 characters for a 4 byte integer -- a saving of 1 byte?

It's more than that: in one case you have something on the order of
a "load immediate" instruction, whereas in the other case the code
is like "load pointer to global string", plus you need a 6-byte string
literal (maybe costing you 8 bytes depending on alignment
considerations).  Also, depending on your machine's approach to
addressing of global data, that "load pointer" thingy could be multiple
instructions.  So we're talking about at least six, possibly 8-12 bytes
per elog call --- and there are thousands of 'em in the backend.

Admittedly, it's a micro-optimization, but it seems worth doing since it
won't have any direct impact on code legibility.
        regards, tom lane


Re: Upgrading the backend's error-message infrastructure

From: "Christopher Kings-Lynne"
Date:
> Comments?

All the error stuff sounds really neat.  I volunteer for doing lots of elog
changes when the time comes.

Would it be possible to do a command line app?

bash$ pg_error 1200D
Severity: ERROR
Message: Division by zero
Detail:
Hint: Modify statement to prevent zeros appearing in denominators.

So people can look up errors offline (oracle-style)

Chris



Re: Upgrading the backend's error-message infrastructure

From: "Christopher Kings-Lynne"
Date:
> Great work, Tom!
> 
> While we're effectively changing every elog call site in the backend,
> would it also be a good idea to adopt a standard for the format of error
> messages? (e.g. capitalization, grammar, etc.)

I 100% agree with this - a style guide!

Chris



Re: Upgrading the backend's error-message infrastructure

From: Neil Conway
Date:
On Thu, 2003-03-13 at 21:16, Christopher Kings-Lynne wrote:
> Would it be possible to do a command line app?
> 
> bash$ pg_error 1200D
> Severity: ERROR
> Message: Division by zero
> Detail:
> Hint: Modify statement to prevent zeros appearing in denominators.

Is there any benefit to having this over just including an index of
error codes in the documentation?

Cheers,

Neil






Re: Upgrading the backend's error-message infrastructure

From: Larry Rosenman
Date:

--On Thursday, March 13, 2003 21:44:29 -0500 Neil Conway 
<neilc@samurai.com> wrote:

> On Thu, 2003-03-13 at 21:16, Christopher Kings-Lynne wrote:
>> Would it be possible to do a command line app?
>>
>> bash$ pg_error 1200D
>> Severity: ERROR
>> Message: Division by zero
>> Detail:
>> Hint: Modify statement to prevent zeros appearing in denominators.
>
> Is there any benefit to having this over just including an index of
> error codes in the documentation?
Yes, it makes it script-able, and probably more up to date than the
documentation...

Especially if it's in the DB or from the source code.

LER

>
> Cheers,
>
> Neil
>
> --
> Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC
>
>
>
>
>








Re: Upgrading the backend's error-message infrastructure

From: "Christopher Kings-Lynne"
Date:
> On Thu, 2003-03-13 at 21:16, Christopher Kings-Lynne wrote:
> > Would it be possible to do a command line app?
> >
> > bash$ pg_error 1200D
> > Severity: ERROR
> > Message: Division by zero
> > Detail:
> > Hint: Modify statement to prevent zeros appearing in denominators.
>
> Is there any benefit to having this over just including an index of
> error codes in the documentation?

It's quick and easy, especially when there's thousands of error codes.
Ideally, the pg_error app and the error code documentation should be
automatically generated...

You could have a built-in function: pg_print_error(text) returns text, then
the pg_error command line program could just call that, plus the user could
check up errors from within postgresql as well...

Chris



Re: Upgrading the backend's error-message infrastructure

From: Neil Conway
Date:
On Thu, 2003-03-13 at 21:48, Larry Rosenman wrote:
> > Is there any benefit to having this over just including an index of
> > error codes in the documentation?

> yes, it makes it script-able

What need would you have for it to be script-able? The backend will
return the error text whenever it returns an error code -- in what
situation would a client app have the error code but not the error
message as well?

> and probably more up to date than documentation....

The way to fix that is to keep the documentation up to date, not invent
pseudo-documentation.

Cheers,

Neil






Re: Upgrading the backend's error-message infrastructure

From: Larry Rosenman
Date:

--On Thursday, March 13, 2003 21:58:21 -0500 Neil Conway 
<neilc@samurai.com> wrote:

> On Thu, 2003-03-13 at 21:48, Larry Rosenman wrote:
>> > Is there any benefit to having this over just including an index of
>> > error codes in the documentation?
>
>> yes, it makes it script-able
>
> What need would you have for it to be script-able? The backend will
> return the error text whenever it returns an error code -- in what
> situation would a client app have the error code but not the error
> message as well?
PHP, returning just the code to a user, and wanting to, later, return the
full text in a log-analysis, just as one example.
>
>> and probably more up to date than documentation....
>
> The way to fix that is to keep the documentation up to date, not invent
> pseudo-documentation.
Machine-readable error messages and codes are ALWAYS a good thing, IMNSHO.


>
> Cheers,
>
> Neil
>
> --
> Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC
>
>








Re: Upgrading the backend's error-message infrastructure

From: Tom Lane
Date:
"Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes:
> Would it be possible to do a command line app?
> 
> bash$ pg_error 1200D
> Severity: ERROR
> Message: Division by zero
> Detail:
> Hint: Modify statement to prevent zeros appearing in denominators.

You're assuming that there's a one-to-one mapping of error codes to
messages, which is not likely to be the case --- for example, all the
"can't happen" errors will probably get lumped together under a single
"internal error" error code.  You could provide a lookup of the
spec-defined meaning of each error code, maybe.
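
A lookup of that limited kind could be as small as a static table, sketched
below.  The struct and function names are invented for this example, and
the handful of rows shown are only illustrative; a real table would be
filled in from the SQL spec itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One row: a 5-character SQLSTATE and the condition name it stands for. */
typedef struct {
    const char *sqlstate;
    const char *meaning;
} SqlstateEntry;

/* Illustrative subset only; check every row against the spec before use. */
static const SqlstateEntry sqlstate_table[] = {
    {"00000", "successful completion"},
    {"02000", "no data"},
    {"22012", "division by zero"},
    {"40001", "serialization failure"},
};

/*
 * Return the condition name for a SQLSTATE, or NULL if it is not listed.
 * This describes the code, not the text of any particular error report.
 */
const char *
sqlstate_meaning(const char *sqlstate)
{
    size_t n = sizeof sqlstate_table / sizeof sqlstate_table[0];

    assert(sqlstate != NULL);
    for (size_t i = 0; i < n; i++)
        if (strcmp(sqlstate_table[i].sqlstate, sqlstate) == 0)
            return sqlstate_table[i].meaning;
    return NULL;
}
```

Since many different messages can share one code, the table can only say
what class of condition a code denotes, which is the limitation noted above.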

>> Is there any benefit to having this over just including an index of
>> error codes in the documentation?

> It's quick and easy, especially when there's thousands of error codes.

But there aren't.  I count about 130 SQLSTATEs defined by the spec.
Undoubtedly we'll make more for Postgres-specific errors, but not
hundreds more.  There's just not value to applications in distinguishing
errors at such a fine grain.
        regards, tom lane


Re: Upgrading the backend's error-message infrastructure

From: Darko Prenosil
Date:
On Thursday 13 March 2003 20:51, Tom Lane wrote:
> (Or, protocol upgrade phase 1...)
>
> After digging through our many past discussions of what to do with error
> messages, I have put together the following first-cut proposal.  Fire at
> will...
>
>
> Objective
> ---------
>
> The basic objective here is to divide error reports into multiple
> fields, and in particular to include an "error code" field that gives
> applications a stable value to test against when they're trying to find
> out what went wrong.  (I am not spending much space in this proposal on
> the question of exactly what set of error codes we ought to have, but
> that comes soon.)  Peter Eisentraut argued cogently awhile back that the
> error codes ought not be hard-wired to specific error message texts,
> so this proposal treats them as separate entities.
>
>
What about user messages?
If I remember correctly, MSSQL had a system catalog table with formatted error
messages, and it was possible to raise an error with an error number and its
parameters. It can be very useful when you must raise the same error from
different places in the code. It is also very useful when you need to translate
error messages to another language, for example. I think that there was a
range of error numbers reserved for user error messages.

Maybe even system messages can be stored in same way.
OK, there is a problem: how do you raise an error if you can't sp_connect to
get the error message (because the error is in sp_connect)?

Just an Idea (from M$) !


Re: Upgrading the backend's error-message infrastructure

From: Tom Lane
Date:
Darko Prenosil <darko.prenosil@finteh.hr> writes:
> What about user messages?
> If I remember correctly, MSSQL had a system catalog table with formatted error
> messages, and it was possible to raise an error with an error number and its
> parameters. It can be very useful when you must raise the same error from
> different places in the code.

But that's exactly the direction we are *not* going in.  We had that
discussion a long time ago when we first started internationalizing
our error messages.  Peter Eisentraut convinced everybody that we did
not want to tie error codes to unique error messages.  [digs in archives
...] See for example
http://fts.postgresql.org/db/mw/msg.html?mid=1279991
I have no desire to revisit that choice.

There is nothing to stop you from creating your own user-defined
messages, and even adding them to the .po files in your installation
if the need strikes.  We aren't going to store them in any system table,
however.
        regards, tom lane


Re: Upgrading the backend's error-message infrastructure

From: Þórhallur Hálfdánarson
Date:
-*- Tom Lane <tgl@sss.pgh.pa.us> [ 2003-03-14 15:33 ]:
> Darko Prenosil <darko.prenosil@finteh.hr> writes:
> > What about user messages?
> > If I remember correctly, MSSQL had a system catalog table with formatted error
> > messages, and it was possible to raise an error with an error number and its
> > parameters. It can be very useful when you must raise the same error from
> > different places in the code.
> 
> But that's exactly the direction we are *not* going in.  We had that
> discussion a long time ago when we first started internationalizing
> our error messages.  Peter Eisentraut convinced everybody that we did
> not want to tie error codes to unique error messages.  [digs in archives
> ...] See for example
> http://fts.postgresql.org/db/mw/msg.html?mid=1279991
> I have no desire to revisit that choice.
> 
> There is nothing to stop you from creating your own user-defined
> messages, and even adding them to the .po files in your installation
> if the need strikes.  We aren't going to store them in any system table,
> however.

What about the option of keeping error numbers unique, but linking error
numbers to error messages (unique in code, but sharing strings)?

Just my .02 ISK.


-- 
Regards,
Tolli
tolli@tol.li


Re: Upgrading the backend's error-message infrastructure

From: johnnnnnn
Date:
On Thu, Mar 13, 2003 at 03:51:00PM -0500, Tom Lane wrote:
> Wire-protocol changes
> ---------------------
> 
> Error and Notice (maybe also Notify?) msgs will have this structure:
> 
>     E
>     x string \0
>     x string \0
>     x string \0
>     \0
> 
> where the x's are single-character field identifiers.  A frontend should
> simply ignore any unrecognized fields.  Initially defined fields for Error
> and Notice are:

...

> S,C,M fields will always appear (at least in Error messages; perhaps
> Notices might omit C?).  The rest are optional.

It strikes me that this error response could be made slimmer by
removing the text fields.

It makes sense for P, F, L, and R to be returned when available, as
they're specific to the instance of the error. C is clearly necessary,
as well. S is questionable, though, depending on whether (for every C
there is one, and only one S).

But the others are going to be the same for every instance of a given
C. It would seem to make more sense to me to provide a different
function(s) which allows the lookup of Messages, Details, and Hints based
on the SQLSTATE.

The benefits that i see would be:

- Less clutter and wasted space on the wire. If we are concerned
enough about space to reduce the SQLSTATE to an integer mapping,
removing all the extra text should be a big win. Couple this with the
libraries' ability to now do things like cache messages, or not bother
to retrieve messages for certain SQLSTATEs, and the benefit gets
larger.

- Removal of localization from error/notice generation libraries. This
should make that section of code simpler and more fault-tolerant. It
also allows libraries to do potentially weird stuff like using
multiple different locales per connection, so long as they can specify
a locale for the lookup functions.

Does that make sense, or am i missing something?

-johnnnnnnnnnn



Re: Upgrading the backend's error-message infrastructure

From: Tom Lane
Date:
johnnnnnn <john@phaedrusdeinus.org> writes:
> It would seem to make more sense to me to provide a different
> function(s) which allows the lookup of Messages, Details, and Hints based
> on the SQLSTATE.

This would constrain us to have a different SQLSTATE for every error
message, which we aren't going to do.  See elsewhere in thread.  It's
also unclear how you insert parameters into error strings if you do this.

> - Less clutter and wasted space on the wire.

I am not really concerned about shaving bytes transmitted for an error
condition.  If that's a performance-critical path for your app, you need
to rewrite the app ;-)

> - Removal of localization from error/notice generation libraries. This
> should make that section of code simpler and more fault-tolerant.

And you put it where, instead?

The existing scheme for localization works fine AFAICT.  I don't have
any interest in reinventing it (nor any chance of getting this done for
7.4, if I were to try...)
        regards, tom lane


Re: Upgrading the backend's error-message infrastructure

From: johnnnnnn
Date:
On Fri, Mar 14, 2003 at 12:23:04PM -0500, Tom Lane wrote:
> > It would seem to make more sense to me to provide a different
> > function(s) which allows the lookup of Messages, Details, and Hints
> > based on the SQLSTATE.
> 
> This would constrain us to have a different SQLSTATE for every error
> message, which we aren't going to do.

That makes sense -- i was assuming a one-to-one mapping (or, at least,
many-to-one in the other direction: many SQLSTATEs for the same
"Unknown error" message).

I'm not sure i follow the reasoning behind allowing multiple messages
for a single SQLSTATE, though. I would think that having the
machine-readable portion of the error be the most granular would make
sense. I can't imagine the SQLSTATE space being too small for us at
this point.

If it's different enough to warrant a different message, then, in my
mind, it's different enough to warrant a different SQLSTATE.

> It's also unclear how you insert parameters into error strings if
> you do this.

That's valid, but there are other ways of dealing with it. The
position in the SQL statement has been moved out to another item in
the response, so why not move the table, column, index, or whatnot
into another item(s) as well?

> > - Removal of localization from error/notice generation
> > libraries. This should make that section of code simpler and more
> > fault-tolerant.
> 
> And you put it where, instead?

Sorry, i think i phrased that poorly. What i meant was that the
functions which provide lookups would need to be aware of locale
because they're referencing localized strings. The functions which are
specifically generating and transmitting the errors, on the other
hand, would be free of localized strings, so would not have to rely on
any of the locale infrastructure at all.

I'm not suggesting any change in the scheme for localization or
anything like that, just saying that limiting the internal access
points might make things cleaner.

The usual other benefits should result as well: simpler unit tests,
easier maintenance, etc.

-johnnnnnnnnnnnn


Re: Upgrading the backend's error-message infrastructure

From
Tom Lane
Date:
johnnnnnn <john@phaedrusdeinus.org> writes:
> If it's different enough to warrant a different message, then, in my
> mind, it's different enough to warrant a different SQLSTATE.

Unfortunately, you're at odds with the SQL spec authors, who have made
their intentions pretty clear by defining only about 130 standard
SQLSTATEs: the granularity is supposed to be coarse.  To take one
example, there's just a single SQLSTATE for "division by zero".  One
might or might not want different messages for float vs integer zero
divide, but they're going to have the same SQLSTATE.

My feeling is that the spec authors knew what they were doing, at least
for the purposes they intended SQLSTATE to be used for.  Applications
want to detect errors at a granularity corresponding to what their
recovery choices might be.  For example, apps want to distinguish
"unique key violation" from "zero divide" because they probably have
something they can do about a unique-key problem.  They *don't* want
"unique key violation" to be broken down into forty-seven subvariants
(especially not implementation-specific subvariants) because that just
makes it difficult to detect the condition reliably --- it's almost as
bad as having to look at an error message text.

We could possibly invent a unique code for each message that is separate
from SQLSTATE, but that idea was considered and rejected some time ago
for what seem to me good reasons: it adds a lot of
bookkeeping/maintenance effort for far too little return.  Ultimately,
the source code is the authoritative database for the set of possible
errors, and trying to put that authority someplace else is just not
worth the effort.  (Besides, we already have tools that can extract
information from the source code at need --- gettext does exactly this
to prepare the NLS files.)


>> It's also unclear how you insert parameters into error strings if
>> you do this.

> That's valid, but there are other ways of dealing with it. The
> position in the SQL statement has been moved out to another item in
> the response, so why not move the table, column, index, or whatnot
> into another item(s) as well?

Because then the reassembly becomes the front-end's problem.  This was
in fact an approach I proposed a year or two back, and it was
(correctly, in hindsight) shot down.  We have multiple frontend libraries
and only one backend, so it's better to do this sort of thing once in
the backend.  There is not enough payback from making each frontend have
to implement it.  There is a good reason for separating out position ---
different frontends are going to want to handle syntax-error marking
differently (consider psql vs some kind of windowed GUI).  But there's
no corresponding bang for the buck in making every frontend handle
localization issues.
        regards, tom lane
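
To make the granularity argument above concrete, here is a minimal sketch of how a client might pick a recovery action from a coarse SQLSTATE. The enum and the function choose_recovery() are hypothetical names invented for this illustration; the only facts assumed are that class 23 is the SQL-spec class for integrity constraint violations and that 22012 is the spec's code for division by zero.

```c
#include <assert.h>
#include <string.h>

/* Recovery actions a hypothetical application might choose between. */
typedef enum
{
    RECOVERY_RETRY_WITH_NEW_KEY,    /* e.g. pick another key and re-insert */
    RECOVERY_REPORT_BAD_INPUT,      /* e.g. tell the user the data was bad */
    RECOVERY_GIVE_UP                /* nothing sensible to do */
} RecoveryAction;

/*
 * Map a 5-character SQLSTATE to a recovery action.  The first two
 * characters are the class; testing the class alone is often enough,
 * which is the practical benefit of keeping the codes coarse.
 */
RecoveryAction
choose_recovery(const char *sqlstate)
{
    assert(sqlstate != NULL);

    if (strlen(sqlstate) != 5)
        return RECOVERY_GIVE_UP;

    /* Class 23: integrity constraint violation (covers unique-key). */
    if (strncmp(sqlstate, "23", 2) == 0)
        return RECOVERY_RETRY_WITH_NEW_KEY;

    /* 22012: division by zero, one code for float and integer alike. */
    if (strcmp(sqlstate, "22012") == 0)
        return RECOVERY_REPORT_BAD_INPUT;

    return RECOVERY_GIVE_UP;
}
```

If the server split "unique key violation" into many implementation-specific codes, the class test above would be the only reliable check left, and an exact-match test would silently miss new subvariants.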


Re: [INTERFACES] Upgrading the backend's error-message infrastructure

From
Peter Eisentraut
Date:
Tom Lane writes:

> Error and Notice (maybe also Notify?) msgs will have this structure:
>
>     E
>     x string \0
>     x string \0
>     x string \0
>     \0
>
> where the x's are single-character field identifiers.

I think we need more flexible field tags.  The SQL standard has
provisions for more fields accompanying error messages, such as the name
of the affected table.  (See <condition information item name> for the
list.)  I think it would be nice if applications could easily access, say,
the name of the constraint that was violated.

> NOTE: a variant scheme would treat the SQLSTATE code as an optional
> parameter too, ie you'd write
>     ereport(ERROR, errcode(ERRCODE_xxx), ...
> This would just be excess verbiage if most or all ereport calls specify
> error codes --- but for the errmsg_internal case we could leave out
> errcode(), expecting it to default to "internal error".  Any thoughts on
> which way is better?

I have a feeling that most errors are of the "internal" class, either
those that are really a kind of assertion check (perhaps we should
consider an enhanced API for those in the future) or failed system or
library calls.  We would need to analyze that feeling a little more, but
if it's true then we might save some effort if the default error code
were "internal".

Then again, it might seem better if the default error code were closer in
nature to "default", meaning an unspecified error if the programmer
couldn't think of one (consider loadable modules).
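
The "optional errcode() with a default" variant discussed above can be sketched roughly as follows. This is a self-contained mock, not the backend's implementation: the variadic-macro form of ereport(), the ErrorData struct, the last_reported variable, the severity value 20, and the "XX000" placeholder for "internal error" are all assumptions made for illustration. errmsg() here takes a plain string rather than a format, and nothing aborts the current operation as a real ERROR would.

```c
#include <assert.h>
#include <string.h>

#define ERROR 20                    /* placeholder severity value */
#define ERRCODE_INTERNAL "XX000"    /* placeholder for "internal error" */

typedef struct
{
    int         elevel;
    char        sqlstate[6];
    const char *message;
} ErrorData;

ErrorData pending;          /* report currently being assembled */
ErrorData last_reported;    /* stands in for "send to client and log" */

/* Begin a report; every optional field starts at its default. */
void
errstart(int elevel)
{
    pending.elevel = elevel;
    strcpy(pending.sqlstate, ERRCODE_INTERNAL);
    pending.message = "";
}

/* Optional: override the default SQLSTATE. */
void
errcode(const char *sqlstate)
{
    assert(strlen(sqlstate) == 5);
    strcpy(pending.sqlstate, sqlstate);
}

/* Set the primary message text. */
void
errmsg(const char *message)
{
    pending.message = message;
}

/* Finish the report; the mock just records it. */
void
errfinish(void)
{
    last_reported = pending;
}

/* The comma operator evaluates the auxiliary calls left to right. */
#define ereport(elevel, ...) \
    (errstart(elevel), __VA_ARGS__, errfinish())
```

With this shape, ereport(ERROR, errmsg("unexpected node type")) would be reported with the default code, while ereport(ERROR, errcode("22012"), errmsg("division by zero")) overrides it, so only calls that have a more specific code need to say so.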

Speaking of loadable modules, one feature that would be useful would be
the ability to select a different message catalog for translations.
Right now an elog(ERROR, "foo") call in a loaded module looks up "foo" in
the message catalog provided by the main server but it probably won't be
there.  This could look like

ereport(ERROR, errmsgdomain("plpgsql"), "...")

or maybe

ereport_domain("plpgsql", ERROR, ...);

-- 
Peter Eisentraut   peter_e@gmx.net



Re: [INTERFACES] Upgrading the backend's error-message infrastructure

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> Tom Lane writes:
>> where the x's are single-character field identifiers.

> I think we need more flexible field tags.  The SQL standard has
> provisions for more fields accompanying error messages, such as the name
> of the affected table.

Well, we can certainly add more tags, but do you foresee needing more
than 50 or so?  I'd prefer to stick to single-byte tags for space
reasons, even if they stop being very mnemonic at some point.  As long
as we don't run out of printable ASCII characters, it's easy on both the
sending and receiving sides to cope with single-byte tags.
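
As a rough illustration of why single-byte tags are easy on the receiving side, here is a sketch of a parser for the proposed field list (tag byte, NUL-terminated string, repeated, ending with a lone NUL). The names ErrorField, parse_error_fields() and find_error_field() are hypothetical, and the sketch assumes the caller has already consumed the leading 'E' or 'N' byte and knows the length of the remaining buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct
{
    char        tag;    /* single-byte field identifier, e.g. 'S', 'C', 'M' */
    const char *value;  /* points into the caller's buffer, NUL-terminated */
} ErrorField;

/*
 * Parse the body of an Error/Notice message.  Returns the number of
 * fields stored in out[], or -1 if the buffer is malformed (a string or
 * the list itself lacks its terminator) or holds more than max_out fields.
 */
int
parse_error_fields(const char *buf, size_t len, ErrorField *out, int max_out)
{
    size_t      pos = 0;
    int         n = 0;

    assert(buf != NULL && out != NULL);

    while (pos < len)
    {
        char        tag = buf[pos++];
        const char *end;

        if (tag == '\0')
            return n;               /* list terminator */

        end = memchr(buf + pos, '\0', len - pos);
        if (end == NULL || n >= max_out)
            return -1;

        out[n].tag = tag;
        out[n].value = buf + pos;
        n++;
        pos = (size_t) (end - buf) + 1;     /* skip the string's NUL */
    }
    return -1;                      /* ran off the end of the buffer */
}

/* Look up a field by tag; tags the caller never asks for are ignored. */
const char *
find_error_field(const ErrorField *fields, int nfields, char tag)
{
    for (int i = 0; i < nfields; i++)
        if (fields[i].tag == tag)
            return fields[i].value;
    return NULL;
}
```

A frontend written this way ignores unrecognized tags for free: it simply never looks them up, so new fields can be added on the server without breaking older clients.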

I had missed the relevance of <condition information item name>, will
go look at it.

>> NOTE: a variant scheme would treat the SQLSTATE code as an optional
>> parameter too, ie you'd write
>> ereport(ERROR, errcode(ERRCODE_xxx), ...

> I have a feeling that most errors are of the "internal" class, either
> those that are really a kind of assertion check (perhaps we should
> consider an enhanced API for those in the future) or failed system or
> library calls.  We would need to analyze that feeling a little more, but
> if it's true then we might save some effort if the default error code
> were "internal".

Yeah, that was in the back of my mind too, but I hadn't got round to
counting to see if it's right.

> Then again, it might seem better if the default error code were closer in
> nature to "default", meaning an unspecified error if the programmer
> couldn't think of one (consider loadable modules).

Unconverted elog's will produce some such SQLSTATE, but if we can't
think of a better SQLSTATE for ones we *have* converted then I think we
need to think harder ;-)

This brings up something I had wanted to start a separate thread for,
which is exactly what SQLSTATEs do we want to define beyond those given
in the spec.  Any thoughts?

> Speaking of loadable modules, one feature that would be useful would be
> the ability to select a different message catalog for translations.
> Right now an elog(ERROR, "foo") call in a loaded module looks up "foo" in
> the message catalog provided by the main server but it probably won't be
> there.  This could look like

> ereport(ERROR, errmsgdomain("plpgsql"), "...")

Rather than cluttering the individual ereport calls with such a thing,
can't a loadable module just do something when it is loaded to add its
message catalog file to the set that will be searched?  (But it is
interesting that we *could* do it as you suggest.  This mechanism is
more flexible than I realized...)
        regards, tom lane


Re: [INTERFACES] Upgrading the backend's error-message infrastructure

From
Tom Lane
Date:
I said:
> I had missed the relevance of <condition information item name>, will
> go look at it.

It looks to me like support of the SQL condition information items would
require adding about two dozen optional fields to my spec for the Error
protocol message, and the same number of optional errFOO(...)
subroutines in the ereport() interface (only two or three of which would
be likely to get invoked in any one ereport instance).  This is a bit
more than I'd been visualizing, but AFAICS the proposed mechanisms would
work perfectly well with it.  I won't bore the list with a detailed spec
for the individual items --- they seem pretty obvious.

Given that we now need order-of-thirty possible field types, do you feel
uncomfortable with a single-byte field identifier in the FE/BE protocol?
I'm still leaning that way on the grounds of compactness and programming
simplicity, but I can see where someone might want to argue it won't do
in the long run.
        regards, tom lane


Re: [INTERFACES] Upgrading the backend's error-message infrastructure

From
Peter Eisentraut
Date:
Tom Lane writes:

> Given that we now need order-of-thirty possible field types, do you feel
> uncomfortable with a single-byte field identifier in the FE/BE protocol?
> I'm still leaning that way on the grounds of compactness and programming
> simplicity, but I can see where someone might want to argue it won't do
> in the long run.

There's a possible solution:  SQL99 part 3 defines numerical codes for
each of these fields (table 12/section 5.14).  The codes are between
around 0 and 40.  (Don't be confused by the negative code numbers in the
table; those are only for use within ODBC.)

-- 
Peter Eisentraut   peter_e@gmx.net



Re: [INTERFACES] Upgrading the backend's error-message infrastructure

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> Tom Lane writes:
>> Given that we now need order-of-thirty possible field types, do you feel
>> uncomfortable with a single-byte field identifier in the FE/BE protocol?

> There's a possible solution:  SQL99 part 3 defines numerical codes for
> each of these fields (table 12/section 5.14).  The codes are between
> around 0 and 40.

Hmm.  I can't see any advantage to these over assigning our own codes;
ours would have at least *some* mnemonic value, rather than being chosen
completely at random ...
        regards, tom lane


Re: [INTERFACES] Upgrading the backend's error-message infrastructure

From
Peter Eisentraut
Date:
Tom Lane writes:

> Hmm.  I can't see any advantage to these over assigning our own codes;
> ours would have at least *some* mnemonic value, rather than being chosen
> completely at random ...

One advantage is that interfaces that are required to use these constants
would not need an internal translation table.

-- 
Peter Eisentraut   peter_e@gmx.net



Re: [INTERFACES] Upgrading the backend's error-message infrastructure

From
Peter Eisentraut
Date:
Tom Lane writes:

> M    Message --- the string is the primary error message (localized).
> D    Detail --- secondary error message, carrying more detail about
>     the problem (localized).
> H    Hint --- a suggestion what to do about the error (localized).

Client interfaces for the most part only have the notion of a single
"message text".  (And keep in mind that the definitions of most interfaces
are outside our control: JDBC, ODBC, ECPG, Perl DBI, PHP, etc.)  So what
kind of functionality is needed so that standardized interfaces can get at
any of the provided details and hints?

Maybe this doesn't need to be solved at the protocol layer.  Instead a
server-side switch regulates the detail of the provided messages.

Also, how do we control what amount of detail is written to the server
log?

-- 
Peter Eisentraut   peter_e@gmx.net



Re: [INTERFACES] Upgrading the backend's error-message infrastructure

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> Tom Lane writes:
>> M    Message --- the string is the primary error message (localized).
>> D    Detail --- secondary error message, carrying more detail about
>> the problem (localized).
>> H    Hint --- a suggestion what to do about the error (localized).

> Client interfaces for the most part only have the notion of a single
> "message text".  (And keep in mind that the definitions of most interfaces
> are outside our control: JDBC, ODBC, ECPG, Perl DBI, PHP, etc.)  So what
> kind of functionality is needed so that standardized interfaces can get at
> any of the provided details and hints?

I think this is a matter to be solved at the level of the API of each
client library.  For example, libpq's PQerrorMessage would presumably
construct some unified string out of these three fields and the error
severity; plus we'd add new calls to extract the individual fields.
I do not think it's appropriate to try to control this from the server
side of things.
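
One way a client library could build such a unified string is sketched below. This is not libpq code: the ErrorFields struct, the function format_error_message(), and the "SEVERITY:  text" / "DETAIL:  " / "HINT:  " layout are all assumptions chosen for illustration. Detail and hint are treated as optional and skipped when absent.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * The separate fields as a client library might hold them after parsing.
 * detail and hint may be NULL when the server did not send them.
 */
typedef struct
{
    const char *severity;
    const char *message;
    const char *detail;
    const char *hint;
} ErrorFields;

/*
 * Build one display string from the separate fields.  Returns what
 * snprintf returns: the length the full string needs, so a result
 * >= outlen tells the caller the output was truncated.
 */
int
format_error_message(const ErrorFields *f, char *out, size_t outlen)
{
    assert(f != NULL && out != NULL && outlen > 0);
    assert(f->severity != NULL && f->message != NULL);

    return snprintf(out, outlen, "%s:  %s\n%s%s%s%s%s%s",
                    f->severity, f->message,
                    f->detail ? "DETAIL:  " : "",
                    f->detail ? f->detail : "",
                    f->detail ? "\n" : "",
                    f->hint ? "HINT:  " : "",
                    f->hint ? f->hint : "",
                    f->hint ? "\n" : "");
}
```

Interfaces that only know a single "message text" would call something like this, while richer clients could read the individual fields directly and lay them out however suits them.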

> Also, how do we control what amount of detail is written to the server
> log?

Some GUC variables would do for that, probably, if we think it's a good
idea to be selective (a proposition I'm dubious about).
        regards, tom lane