Обсуждение: Identifying a message in emit_log_hook.

Поиск
Список
Период
Сортировка

Identifying a message in emit_log_hook.

От
Kyotaro HORIGUCHI
Дата:
Hello.

I'm planning to change the error level of a specific error
message.  We can use emit_log_hook for this purpose but we have
no reliable means to identify what a message is.

For messages without valid sqlerrcode, filename:lineno could be
used to identify but lineno is too unstable.  One possible and
most straightforward way to solve this is defining identifiers
covering all error messages but such identifiers are too hard to
manage.  ErrorData.message could also be used but NLS translation
and placeholders prevent it from being identified by simple means
like strcmp(3).

As a solution for this problem, I'd like to porpose to have an
additional member in the struct ErrorData to hold a message id,
that is, the format string for errmsg(). This is not best but
useful enough.

It is somewhat a crude way, but the attached small patch would
do.

Is there any opinions or suggestions?

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index 9005b26..2d13101 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -801,6 +801,7 @@ errmsg(const char *fmt,...)    CHECK_STACK_DEPTH();    oldcontext =
MemoryContextSwitchTo(edata->assoc_context);
+    edata->message_id = fmt;    EVALUATE_MESSAGE(edata->domain, message, false, true);
MemoryContextSwitchTo(oldcontext);
@@ -830,6 +831,7 @@ errmsg_internal(const char *fmt,...)    CHECK_STACK_DEPTH();    oldcontext =
MemoryContextSwitchTo(edata->assoc_context);
+    edata->message_id = fmt;    EVALUATE_MESSAGE(edata->domain, message, false, false);
MemoryContextSwitchTo(oldcontext);
@@ -853,6 +855,7 @@ errmsg_plural(const char *fmt_singular, const char *fmt_plural,    CHECK_STACK_DEPTH();
oldcontext= MemoryContextSwitchTo(edata->assoc_context);
 
+    edata->message_id = fmt_singular;    EVALUATE_MESSAGE_PLURAL(edata->domain, message, false);
MemoryContextSwitchTo(oldcontext);
@@ -1361,6 +1364,7 @@ elog_finish(int elevel, const char *fmt,...)    recursion_depth++;    oldcontext =
MemoryContextSwitchTo(edata->assoc_context);
+    edata->message_id = fmt;    EVALUATE_MESSAGE(edata->domain, message, false, false);
MemoryContextSwitchTo(oldcontext);
@@ -1420,6 +1424,7 @@ format_elog_string(const char *fmt,...)    oldcontext = MemoryContextSwitchTo(ErrorContext);
+    edata->message_id = fmt;    EVALUATE_MESSAGE(edata->domain, message, false, true);
MemoryContextSwitchTo(oldcontext);
diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index 326896f..4df76da 100644
--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -354,6 +354,7 @@ typedef struct ErrorData    char       *detail_log;        /* detail error message for server log
only*/    char       *hint;            /* hint message */    char       *context;        /* context message */
 
+    const char *message_id;        /* message id of .message */    char       *schema_name;    /* name of schema */
char      *table_name;        /* name of table */    char       *column_name;    /* name of column */ 

Re: Identifying a message in emit_log_hook.

От
Pavel Stehule
Дата:
Hi

2016-02-16 10:47 GMT+01:00 Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>:
Hello.

I'm planning to change the error level of a specific error
message.  We can use emit_log_hook for this purpose but we have
no reliable means to identify what a message is.

For messages without valid sqlerrcode, filename:lineno could be
used to identify but lineno is too unstable.  One possible and
most straightforward way to solve this is defining identifiers
covering all error messages but such identifiers are too hard to
manage.  ErrorData.message could also be used but NLS translation
and placeholders prevent it from being identified by simple means
like strcmp(3).

As a solution for this problem, I'd like to porpose to have an
additional member in the struct ErrorData to hold a message id,
that is, the format string for errmsg(). This is not best but
useful enough.

It is somewhat a crude way, but the attached small patch would
do.

Is there any opinions or suggestions?

It looks like workaround. The fixing missing sqlerrcode is much better.

Regards

Pavel

 

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index 9005b26..2d13101 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -801,6 +801,7 @@ errmsg(const char *fmt,...)
        CHECK_STACK_DEPTH();
        oldcontext = MemoryContextSwitchTo(edata->assoc_context);

+       edata->message_id = fmt;
        EVALUATE_MESSAGE(edata->domain, message, false, true);

        MemoryContextSwitchTo(oldcontext);
@@ -830,6 +831,7 @@ errmsg_internal(const char *fmt,...)
        CHECK_STACK_DEPTH();
        oldcontext = MemoryContextSwitchTo(edata->assoc_context);

+       edata->message_id = fmt;
        EVALUATE_MESSAGE(edata->domain, message, false, false);

        MemoryContextSwitchTo(oldcontext);
@@ -853,6 +855,7 @@ errmsg_plural(const char *fmt_singular, const char *fmt_plural,
        CHECK_STACK_DEPTH();
        oldcontext = MemoryContextSwitchTo(edata->assoc_context);

+       edata->message_id = fmt_singular;
        EVALUATE_MESSAGE_PLURAL(edata->domain, message, false);

        MemoryContextSwitchTo(oldcontext);
@@ -1361,6 +1364,7 @@ elog_finish(int elevel, const char *fmt,...)
        recursion_depth++;
        oldcontext = MemoryContextSwitchTo(edata->assoc_context);

+       edata->message_id = fmt;
        EVALUATE_MESSAGE(edata->domain, message, false, false);

        MemoryContextSwitchTo(oldcontext);
@@ -1420,6 +1424,7 @@ format_elog_string(const char *fmt,...)

        oldcontext = MemoryContextSwitchTo(ErrorContext);

+       edata->message_id = fmt;
        EVALUATE_MESSAGE(edata->domain, message, false, true);

        MemoryContextSwitchTo(oldcontext);
diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index 326896f..4df76da 100644
--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -354,6 +354,7 @@ typedef struct ErrorData
        char       *detail_log;         /* detail error message for server log only */
        char       *hint;                       /* hint message */
        char       *context;            /* context message */
+       const char *message_id;         /* message id of .message */
        char       *schema_name;        /* name of schema */
        char       *table_name;         /* name of table */
        char       *column_name;        /* name of column */


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Identifying a message in emit_log_hook.

От
Simon Riggs
Дата:
On 16 February 2016 at 09:47, Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
 
I'm planning to change the error level of a specific error
message.  We can use emit_log_hook for this purpose but we have
no reliable means to identify what a message is.

For messages without valid sqlerrcode, filename:lineno could be
used to identify but lineno is too unstable.  One possible and
most straightforward way to solve this is defining identifiers
covering all error messages but such identifiers are too hard to
manage.  ErrorData.message could also be used but NLS translation
and placeholders prevent it from being identified by simple means
like strcmp(3).

As a solution for this problem, I'd like to porpose to have an
additional member in the struct ErrorData to hold a message id,
that is, the format string for errmsg(). This is not best but
useful enough.

It is somewhat a crude way, but the attached small patch would
do.

Is there any opinions or suggestions?

First, I support the concept of message ids and any work you do in moving this forwards.

If you were to require message ids, how would this work for extensions? Many things write to the log, so this solution would rely upon 100% compliance with your new approach. I think that is unlikely to happen.

I suggest an approach that allows optionally accessing a message id by hashing the English message text. That would allow people to identify messages without relying on people labelling everything correctly, as well as writing filters that do not depend upon language.

I'm guessing this would require making the pre-translated error text available to plugins as well as translated form.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Identifying a message in emit_log_hook.

От
Kyotaro HORIGUCHI
Дата:
Hello, thank you for the opinions,

At Tue, 16 Feb 2016 10:57:33 +0100, Pavel Stehule <pavel.stehule@gmail.com> wrote in
<CAFj8pRCg0GpZwVB3ZdAw8YriWnZ+Ty6twQOCqpmXgcTvYXX-mQ@mail.gmail.com>
> Hi
> 
> 2016-02-16 10:47 GMT+01:00 Kyotaro HORIGUCHI <
> horiguchi.kyotaro@lab.ntt.co.jp>:
> 
> > Hello.
> >
> > I'm planning to change the error level of a specific error
> > message.  We can use emit_log_hook for this purpose but we have
> > no reliable means to identify what a message is.
> >
> > For messages without valid sqlerrcode, filename:lineno could be
> > used to identify but lineno is too unstable.  One possible and
> > most straightforward way to solve this is defining identifiers
> > covering all error messages but such identifiers are too hard to
> > manage.  ErrorData.message could also be used but NLS translation
> > and placeholders prevent it from being identified by simple means
> > like strcmp(3).
> >
> > As a solution for this problem, I'd like to porpose to have an
> > additional member in the struct ErrorData to hold a message id,
> > that is, the format string for errmsg(). This is not best but
> > useful enough.
> >
> > It is somewhat a crude way, but the attached small patch would
> > do.
> >
> > Is there any opinions or suggestions?
> >
> 
> It looks like workaround. The fixing missing sqlerrcode is much better.

Inventing a consistent system of sqlerrcode covering whole of the
code would be an unsolvable problem. Even allocating sequence
number or such to every error messages would be difficult to
maintain.

The "message id" called in this thread is a text finally passed
to gettext(3) as "msgid". Although it is under the context of
translation, it is usable when identifying an error by its
messages text is enough.

From such an aspect, this could be called both of a workaround or
a solution.


At Tue, 16 Feb 2016 10:21:30 +0000, Simon Riggs <simon@2ndQuadrant.com> wrote in
<CANP8+jLX3Z10T=dZtRBGUWRgCfZtLiPf0=QaNmFwciaFv=H1cA@mail.gmail.com>
> First, I support the concept of message ids and any work you do in moving
> this forwards.

Thank you for supporting this.

> If you were to require message ids, how would this work for extensions?
> Many things write to the log, so this solution would rely upon 100%
> compliance with your new approach. I think that is unlikely to happen.

I'm uncertain about the "message ids" you mention, but as written
above, it is exactly the text passed to elog() or ereport(), then
passed to gettext(3) as the parameter "msgid". Of couse this
cannot be reliable for certain usage but usable enough for some
usage.

> I suggest an approach that allows optionally accessing a message id by
> hashing the English message text. That would allow people to identify
> messages without relying on people labelling everything correctly, as well
> as writing filters that do not depend upon language.

The hash code could be used for screening if it can be calculated
at compile time, but we cannot detect synonyms because of lack of
a central storage so anyway the pre-translated error text still
should be available to exntensions.

> I'm guessing this would require making the pre-translated error text
> available to plugins as well as translated form.

If I understand you correctly, edata->messgage_id in my patch is
just what offers the pre-translated error text to plugins.


regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center





Re: Identifying a message in emit_log_hook.

От
Simon Riggs
Дата:
On 17 February 2016 at 08:34, Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote:

> I'm guessing this would require making the pre-translated error text
> available to plugins as well as translated form.

If I understand you correctly, edata->messgage_id in my patch is
just what offers the pre-translated error text to plugins.

OK, now I understand the patch, I am happy to apply it.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Identifying a message in emit_log_hook.

От
Kyotaro HORIGUCHI
Дата:
Hello,

At Wed, 17 Feb 2016 09:13:01 +0000, Simon Riggs <simon@2ndQuadrant.com> wrote in
<CANP8+jLbGE_YbxULgZXvce44oOB8V0T93e5_inHvBDE2PXkSOw@mail.gmail.com>
> On 17 February 2016 at 08:34, Kyotaro HORIGUCHI <
> horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> 
> >
> > > I'm guessing this would require making the pre-translated error text
> > > available to plugins as well as translated form.
> >
> > If I understand you correctly, edata->messgage_id in my patch is
> > just what offers the pre-translated error text to plugins.
> 
> 
> OK, now I understand the patch, I am happy to apply it.

Thank you very much. I have one concern about this patch.

I have added an id only for .message in the patch but it
theoretically can apply to all other message typs eventually
given to gettext() in two macros EVALUATE_MESSAGE(_PLURAL). They
are detail, detail_log, hint and context and the modification
could be limited within the two macros by doing so but extra four
*_id members are to be added to ErrorData. I doubt it is useful
for the extra members.

If you agree with this, it doesn't seem to me to need more
modification. Is there anything else to do?

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center





Re: Identifying a message in emit_log_hook.

От
Simon Riggs
Дата:
On 25 February 2016 at 07:42, Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
Hello,

At Wed, 17 Feb 2016 09:13:01 +0000, Simon Riggs <simon@2ndQuadrant.com> wrote in <CANP8+jLbGE_YbxULgZXvce44oOB8V0T93e5_inHvBDE2PXkSOw@mail.gmail.com>
> On 17 February 2016 at 08:34, Kyotaro HORIGUCHI <
> horiguchi.kyotaro@lab.ntt.co.jp> wrote:
>
> >
> > > I'm guessing this would require making the pre-translated error text
> > > available to plugins as well as translated form.
> >
> > If I understand you correctly, edata->messgage_id in my patch is
> > just what offers the pre-translated error text to plugins.
>
>
> OK, now I understand the patch, I am happy to apply it.

Thank you very much. I have one concern about this patch.

I have added an id only for .message in the patch but it
theoretically can apply to all other message typs eventually
given to gettext() in two macros EVALUATE_MESSAGE(_PLURAL). They
are detail, detail_log, hint and context and the modification
could be limited within the two macros by doing so but extra four
*_id members are to be added to ErrorData. I doubt it is useful
for the extra members.

If you agree with this, it doesn't seem to me to need more
modification. Is there anything else to do?

David,

Can you add this to the CF? It was submitted before deadline.
I presume you have access to do that?

Other people may want to see this before commit.

If you can't, I'll commit anyway, but if we have a system we should follow it.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Identifying a message in emit_log_hook.

От
David Steele
Дата:
Hi Simon,

On 3/10/16 7:26 AM, Simon Riggs wrote:
>
> Can you add this to the CF? It was submitted before deadline.
> I presume you have access to do that?

No problem - here it is:

https://commitfest.postgresql.org/9/576/

-- 
-David
david@pgmasters.net



Re: Identifying a message in emit_log_hook.

От
Kyotaro HORIGUCHI
Дата:
Thnak you for scooping up this.

At Thu, 10 Mar 2016 08:14:09 -0500, David Steele <david@pgmasters.net> wrote in <56E17321.5050201@pgmasters.net>
> Hi Simon,
> 
> On 3/10/16 7:26 AM, Simon Riggs wrote:
> >
> > Can you add this to the CF? It was submitted before deadline.
> > I presume you have access to do that?
> 
> No problem - here it is:
> 
> https://commitfest.postgresql.org/9/576/

Some kind of hash code can be added for shotrcut, but this is
usable even after it is added.

One arguable point I see now on this is only ids for the message
type "message" would be enough, or needed for some other types
such as "details". This is quite straightforward so I see no
other arguable point other than the code itself.


regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center





Re: Identifying a message in emit_log_hook.

От
Kyotaro HORIGUCHI
Дата:
I found that this has been commited.
Thank you for committing this, Simon.

regards,

At Tue, 15 Mar 2016 12:22:05 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote in
<20160315.122205.08265186.horiguchi.kyotaro@lab.ntt.co.jp>
> Thnak you for scooping up this.
> 
> At Thu, 10 Mar 2016 08:14:09 -0500, David Steele <david@pgmasters.net> wrote in <56E17321.5050201@pgmasters.net>
> > Hi Simon,
> > 
> > On 3/10/16 7:26 AM, Simon Riggs wrote:
> > >
> > > Can you add this to the CF? It was submitted before deadline.
> > > I presume you have access to do that?
> > 
> > No problem - here it is:
> > 
> > https://commitfest.postgresql.org/9/576/
> 
> Some kind of hash code can be added for shotrcut, but this is
> usable even after it is added.
> 
> One arguable point I see now on this is only ids for the message
> type "message" would be enough, or needed for some other types
> such as "details". This is quite straightforward so I see no
> other arguable point other than the code itself.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center