Discussion: Suspicion of a compiler bug in clang: using ternary operator in ereport()


Suspicion of a compiler bug in clang: using ternary operator in ereport()

From
Christian Kruse
Date:
Hi,

just a word of warning: it seems as if there is a compiler bug in clang
regarding the ternary operator when used in ereport(). While working
on a patch I found that this code:

        ereport(FATAL,
                (errmsg("could not map anonymous shared memory: %m"),
                 (errno == ENOMEM) ?
                 errhint("This error usually means that PostgreSQL's request "
                         "for a shared memory segment exceeded available memory "
                         "or swap space. To reduce the request size (currently "
                         "%zu bytes), reduce PostgreSQL's shared memory usage, "
                         "perhaps by reducing shared_buffers or "
                         "max_connections.",
                         *size) : 0));

did not emit an errhint when using clang, although errno == ENOMEM was
true. The same code works with gcc. I used the same data dir, so
config was exactly the same, too.

I reported this bug to the LLVM bug tracker:

<http://llvm.org/bugs/show_bug.cgi?id=18644>

Best regards,

--
 Christian Kruse               http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()

From
Christian Kruse
Date:
Hi,

when I remove the errno comparison and use a 1 it works:

        ereport(FATAL,
                (errmsg("could not map anonymous shared memory: %m"),
                 1 ?
                 errhint("This error usually means that PostgreSQL's request "
                         "for a shared memory segment exceeded available memory "
                         "or swap space. To reduce the request size (currently "
                         "%zu bytes), reduce PostgreSQL's shared memory usage, "
                         "perhaps by reducing shared_buffers or "
                         "max_connections.",
                         *size) : 0));

It also works if I use an if (errno == ENOMEM) instead of the ternary operator.

Best regards,

--
 Christian Kruse               http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()

From
Tom Lane
Date:
Christian Kruse <christian@2ndQuadrant.com> writes:
> just a word of warning: it seems as if there is compiler bug in clang
> regarding the ternary operator when used in ereport(). While working
> on a patch I found that this code:
> ...
> did not emit a errhint when using clang, although errno == ENOMEM was
> true.

Huh.  I noticed a buildfarm failure a couple days ago in which the visible
regression diff was that an expected HINT was missing.  This probably
explains that, because we use ternary operators in this style in quite a
few places.
        regards, tom lane



Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()

From
Christian Kruse
Date:
Hi,

On 28/01/14 16:43, Christian Kruse wrote:
>         ereport(FATAL,
>                 (errmsg("could not map anonymous shared memory: %m"),
>                  (errno == ENOMEM) ?
>                  errhint("This error usually means that PostgreSQL's request "
>                          "for a shared memory segment exceeded available memory "
>                          "or swap space. To reduce the request size (currently "
>                          "%zu bytes), reduce PostgreSQL's shared memory usage, "
>                          "perhaps by reducing shared_buffers or "
>                          "max_connections.",
>                          *size) : 0));
>
> did not emit a errhint when using clang, although errno == ENOMEM was
> true. The same code works with gcc.

According to http://llvm.org/bugs/show_bug.cgi?id=18644#c5 this is not
a compiler bug but a difference between gcc and clang. Clang seems to
use a left-to-right order of evaluation while gcc uses a right-to-left
order of evaluation. So if errmsg() changes errno, errno == ENOMEM
evaluates to false. I added a watch point on errno and it turns out
that exactly this happens: in src/common/psprintf.c line 114

    nprinted = vsnprintf(buf, len, fmt, args);

errno gets set to 0. This means that we will miss errhint/errdetail if
we use errno in a ternary operator and compile with clang.
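
The effect can be reproduced in isolation. The following is a minimal sketch, not PostgreSQL code: fake_errmsg(), fake_errhint() and fake_report() are hypothetical stand-ins for errmsg(), errhint() and the reporting machinery, and fake_errmsg() resets errno to 0 on purpose to mimic what was observed around the vsnprintf() call. Since C leaves the evaluation order of function arguments unspecified, report_fragile() may legitimately return either 0 (hint lost) or 1 (hint kept), depending on the compiler.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for errmsg(): formats a message and, as a side
 * effect, leaves errno at 0. */
static int fake_errmsg(const char *fmt)
{
    (void) fmt;
    errno = 0;
    return 0;
}

/* Hypothetical stand-in for errhint(): returns 1 so callers can tell that
 * the hint was actually produced. */
static int fake_errhint(const char *hint)
{
    (void) hint;
    return 1;
}

/* Hypothetical stand-in for the reporting call that consumes both results. */
static int fake_report(int msg_result, int hint_result)
{
    assert(hint_result == 0 || hint_result == 1);
    return msg_result + hint_result;
}

/* Fragile: the second argument reads errno, the first argument's callee
 * writes it, and the order between the two is unspecified. */
int report_fragile(void)
{
    errno = ENOMEM;             /* pretend a system call just failed */
    return fake_report(fake_errmsg("could not map anonymous shared memory"),
                       (errno == ENOMEM) ? fake_errhint("reduce usage") : 0);
}
```

A compiler that evaluates the first argument first sees errno == 0 when it reaches the comparison, so the hint is dropped; one that evaluates the second argument first keeps it.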

Should we work on this issue?

Best regards,

--
 Christian Kruse               http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()

From
Tom Lane
Date:
Christian Kruse <christian@2ndQuadrant.com> writes:
> According to http://llvm.org/bugs/show_bug.cgi?id=18644#c5 this is not
> a compiler bug but a difference between gcc and clang. Clang seems to
> use a left-to-right order of evaluation while gcc uses a right-to-left
> order of evaluation. So if errmsg changes errno this would lead to
> errno == ENOMEM evaluated to false.

Oh!  Yeah, that is our own bug then.

> Should we work on this issue?

Absolutely.  Probably best to save errno into a local just before the
ereport.
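
A minimal sketch of that suggestion, under stated assumptions: fake_errmsg(), fake_errhint() and fake_report() are hypothetical stand-ins for the real reporting functions, and fake_errmsg() clobbers errno deliberately. The only point illustrated is that a local copy taken before the call cannot be disturbed by whatever the argument expressions do.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for errmsg(); clobbers errno on purpose. */
static int fake_errmsg(const char *fmt)
{
    (void) fmt;
    errno = 0;
    return 0;
}

/* Hypothetical stand-in for errhint(); returns 1 when called. */
static int fake_errhint(const char *hint)
{
    (void) hint;
    return 1;
}

/* Hypothetical stand-in for the reporting call. */
static int fake_report(int msg_result, int hint_result)
{
    assert(hint_result == 0 || hint_result == 1);
    return msg_result + hint_result;
}

/* Returns 1 if the hint was produced, 0 otherwise. */
int report_with_saved_errno(void)
{
    int save_errno;

    errno = ENOMEM;         /* pretend a system call just failed */
    save_errno = errno;     /* capture it before any reporting code runs */

    /* The comparison reads the local copy, so argument evaluation order
     * no longer affects the outcome. */
    return fake_report(fake_errmsg("could not map anonymous shared memory"),
                       (save_errno == ENOMEM) ? fake_errhint("reduce usage") : 0);
}
```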
        regards, tom lane



Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()

From
Jason Petersen
Date:
I realize Postgres’ codebase is probably intractably large to begin
using a tool like splint (http://www.splint.org), but this is exactly
the sort of thing it’ll catch. I’m pretty sure it would have warned in
this case that the code relies on an ordering of side effects that is
left undefined by C standards (and as seen here implemented
differently by two different compilers).

The workaround is to make separate assignments on separate lines,
which act as sequence points to impose a total order on the
side-effects in question.
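
As a rough illustration of that workaround (a generic sketch with hypothetical names next_id() and pair_code(), unrelated to the PostgreSQL sources): two calls with side effects passed directly as arguments may run in either order, while two separate statements are ordered by the sequence point at the end of each one.

```c
#include <assert.h>

static int counter = 0;

/* Side effect: every call bumps the shared counter. */
static int next_id(void)
{
    return ++counter;
}

/* Encodes the pair so that the order of the two values is visible. */
static int pair_code(int first, int second)
{
    return first * 10 + second;
}

/* Unspecified order: the result may be 12 or 21, depending on which
 * argument the compiler evaluates first. */
int ids_unordered(void)
{
    counter = 0;
    return pair_code(next_id(), next_id());
}

/* Separate statements: the first assignment completes before the second
 * begins, so the result is always 12. */
int ids_ordered(void)
{
    int first;
    int second;

    counter = 0;
    first = next_id();
    second = next_id();
    assert(first == 1 && second == 2);
    return pair_code(first, second);
}
```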

—Jason

On Jan 28, 2014, at 2:12 PM, Christian Kruse <christian@2ndQuadrant.com> wrote:

> Hi,
>
> On 28/01/14 16:43, Christian Kruse wrote:
>>         ereport(FATAL,
>>                 (errmsg("could not map anonymous shared memory: %m"),
>>                  (errno == ENOMEM) ?
>>                  errhint("This error usually means that PostgreSQL's request "
>>                          "for a shared memory segment exceeded available memory "
>>                          "or swap space. To reduce the request size (currently "
>>                          "%zu bytes), reduce PostgreSQL's shared memory usage, "
>>                          "perhaps by reducing shared_buffers or "
>>                          "max_connections.",
>>                          *size) : 0));
>>
>> did not emit a errhint when using clang, although errno == ENOMEM was
>> true. The same code works with gcc.
>
> According to http://llvm.org/bugs/show_bug.cgi?id=18644#c5 this is not
> a compiler bug but a difference between gcc and clang. Clang seems to
> use a left-to-right order of evaluation while gcc uses a right-to-left
> order of evaluation. So if errmsg changes errno this would lead to
> errno == ENOMEM evaluated to false. I added a watch point on errno and
> it turns out that exactly this happens: in src/common/psprintf.c line
> 114
>
>     nprinted = vsnprintf(buf, len, fmt, args);
>
> errno gets set to 0. This means that we will miss errhint/errdetail if
> we use errno in a ternary operator and clang.
>
> Should we work on this issue?
>
> Best regards,
>
> --
> Christian Kruse               http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
>




Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()

From
Andres Freund
Date:
On 2014-01-28 16:19:11 -0500, Tom Lane wrote:
> Christian Kruse <christian@2ndQuadrant.com> writes:
> > According to http://llvm.org/bugs/show_bug.cgi?id=18644#c5 this is not
> > a compiler bug but a difference between gcc and clang. Clang seems to
> > use a left-to-right order of evaluation while gcc uses a right-to-left
> > order of evaluation. So if errmsg changes errno this would lead to
> > errno == ENOMEM evaluated to false.
> 
> Oh!  Yeah, that is our own bug then.

Pretty nasty too. Surprising that it didn't cause more issues. It's not
as if evaluation order were the only way this could cause
problems...

> > Should we work on this issue?
> 
> Absolutely.  Probably best to save errno into a local just before the
> ereport.

I think just resetting to edata->saved_errno is better and sufficient?

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()

From
Alvaro Herrera
Date:
Jason Petersen wrote:
> I realize Postgres’ codebase is probably intractably large to begin
> using a tool like splint (http://www.splint.org ), but this is exactly
> the sort of thing it’ll catch. I’m pretty sure it would have warned in
> this case that the code relies on an ordering of side effects that is
> left undefined by C standards (and as seen here implemented
> differently by two different compilers).

Well, we already have Coverity reports and the VIVA64 stuff posted last
month.  Did they not see these problems?  Maybe they did, maybe not, but
since there's a large number of false positives it's hard to tell.  I
don't know how many false positives we would get from a Splint run, but
my guess is that it'll be a lot.

> The workaround is to make separate assignments on separate lines,
> which act as sequence points to impose a total order on the
> side-effects in question.

Not sure how that would work with a complex macro such as ereport.
Perhaps the answer is to use C99 variadic macros if available, but that
would leave bugs such as this one open on compilers that don't support
variadic macros.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()

From
Andres Freund
Date:
On 2014-01-28 18:31:59 -0300, Alvaro Herrera wrote:
> Jason Petersen wrote:
> > I realize Postgres’ codebase is probably intractably large to begin
> > using a tool like splint (http://www.splint.org ), but this is exactly
> > the sort of thing it’ll catch. I’m pretty sure it would have warned in
> > this case that the code relies on an ordering of side effects that is
> > left undefined by C standards (and as seen here implemented
> > differently by two different compilers).
>
> Well, we already have Coverity reports and the VIVA64 stuff posted last
> month.  Did they not see these problems?  Maybe they did, maybe not, but
> since there's a large number of false positives it's hard to tell.  I
> don't know how many false positives we would get from a Splint run, but
> my guess is that it'll be a lot.

Well, this isn't really a case of classical undefined behaviour. Most of
the code is actually perfectly well set up to handle the differing
evaluation order, it's just that some bits of code forgot to restore errno.

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()

From
Stephen Frost
Date:
* Alvaro Herrera (alvherre@2ndquadrant.com) wrote:
> Well, we already have Coverity reports and the VIVA64 stuff posted last
> month.  Did they not see these problems?  Maybe they did, maybe not, but
> since there's a large number of false positives it's hard to tell.  I
> don't know how many false positives we would get from a Splint run, but
> my guess is that it'll be a lot.

I've whittled down most of the false positives and gone through just
about all of the rest.  I do not recall any reports in Coverity for this
issue and that makes me doubt that it checks for it.

I'll try and take a look at what splint reports this weekend.
Thanks,
    Stephen

Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()

From
Alvaro Herrera
Date:
Stephen Frost wrote:
> * Alvaro Herrera (alvherre@2ndquadrant.com) wrote:
> > Well, we already have Coverity reports and the VIVA64 stuff posted last
> > month.  Did they not see these problems?  Maybe they did, maybe not, but
> > since there's a large number of false positives it's hard to tell.  I
> > don't know how many false positives we would get from a Splint run, but
> > my guess is that it'll be a lot.
> 
> I've whittled down most of the false positives and gone through just
> about all of the rest.

Really?  Excellent, thanks.  I haven't looked at it in quite a while
apparently ...

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()

From
Tom Lane
Date:
Andres Freund <andres@2ndquadrant.com> writes:
>> Absolutely.  Probably best to save errno into a local just before the
>> ereport.

> I think just resetting to edata->saved_errno is better and sufficient?

-1 --- that field is nobody's business except elog.c's.
        regards, tom lane



Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()

From
Christian Kruse
Date:
Hi,

On 28/01/14 22:35, Tom Lane wrote:
> >> Absolutely.  Probably best to save errno into a local just before the
> >> ereport.
>
> > I think just resetting to edata->saved_errno is better and sufficient?
>
> -1 --- that field is nobody's business except elog.c's.

Ok, so I propose the attached patch as a fix.

Best regards,

--
 Christian Kruse               http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Attachments

Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()

From
Tom Lane
Date:
Christian Kruse <christian@2ndquadrant.com> writes:
> Ok, so I propose the attached patch as a fix.

No, what I meant is that the ereport caller needs to save errno, rather
than assuming that (some subset of) ereport-related subroutines will
preserve it.

In general, it's unsafe to assume that any nontrivial subroutine preserves
errno, and I don't particularly want to promise that the ereport functions
are an exception.  Even if we did that, this type of coding would still
be risky.  Here are some examples:

  ereport(...,
          foo() ? errdetail(...) : 0,
          (errno == something) ? errhint(...) : 0);

If foo() clobbers errno and returns false, there is nothing that elog.c
can do to make this coding work.

  ereport(...,
          errmsg("%s belongs to %s",
                 foo(), (errno == something) ? "this" : "that"));

Again, even if every single elog.c entry point saved and restored errno,
this coding wouldn't be safe.

I don't think we should try to make the world safe for some uses of errno
within ereport logic, when there are other very similar-looking uses that
we cannot make safe.  What we need is a coding rule that you don't do
that.
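
One way such a coding rule could look in practice, as a hedged sketch rather than anything taken from the PostgreSQL tree: lookup_name() is a hypothetical foo() that clobbers errno, and format_owner() is a hypothetical caller that reads errno exactly once, before calling anything else, and formats the message from the saved copy.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical foo(): returns a name and clobbers errno along the way. */
static const char *lookup_name(void)
{
    errno = 0;
    return "widget";
}

/* Formats "<name> belongs to this|that" into buf, choosing the last word
 * from the errno value that was current on entry. Returns the value
 * returned by snprintf(). */
int format_owner(char *buf, size_t buflen)
{
    int save_errno = errno;             /* read errno before anything else */
    const char *name = lookup_name();   /* free to clobber errno now */

    assert(buf != NULL && buflen > 0);
    return snprintf(buf, buflen, "%s belongs to %s",
                    name, (save_errno == ENOENT) ? "this" : "that");
}
```

With errno set to ENOENT on entry, the buffer ends up holding "widget belongs to this" no matter what lookup_name() does to errno.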
        regards, tom lane



Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()

From
Christian Kruse
Date:
Hi,

On 29/01/14 13:39, Tom Lane wrote:
> No, what I meant is that the ereport caller needs to save errno, rather
> than assuming that (some subset of) ereport-related subroutines will
> preserve it.
> […]

Your reasoning sounds quite logical to me. Thus I did a

grep -RA 3 "ereport" src/* | less

and looked for ereport calls with errno in it. I found quite a few,
attached you will find a patch addressing that issue.

Best regards,

--
 Christian Kruse               http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Attachments

Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()

From
Christian Kruse
Date:
Hi,

On 29/01/14 21:37, Christian Kruse wrote:
> […]
> attached you will find a patch addressing that issue.

Maybe we should include the patch proposed in

<20140129195930.GD31325@defunct.ch>

and do this as one (slightly bigger) patch. Attached you will find
this alternative version.

Best regards,

--
 Christian Kruse               http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Attachments

Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()

From
Tom Lane
Date:
Christian Kruse <christian@2ndquadrant.com> writes:
> Your reasoning sounds quite logical to me. Thus I did a
> grep -RA 3 "ereport" src/* | less
> and looked for ereport calls with errno in it. I found quite a few,
> attached you will find a patch addressing that issue.

Excellent, thanks for doing the legwork.  I'll take care of getting
this committed and back-patched.
        regards, tom lane



Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()

From
Tom Lane
Date:
Christian Kruse <christian@2ndquadrant.com> writes:
> Your reasoning sounds quite logical to me. Thus I did a
> grep -RA 3 "ereport" src/* | less
> and looked for ereport calls with errno in it. I found quite a few,
> attached you will find a patch addressing that issue.

Committed.  I found a couple of errors in your patch, but I think
everything is addressed in the patch as committed.
        regards, tom lane



Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()

From
Christian Kruse
Date:
Hi Tom,

On 29/01/14 20:06, Tom Lane wrote:
> Christian Kruse <christian@2ndquadrant.com> writes:
> > Your reasoning sounds quite logical to me. Thus I did a
> > grep -RA 3 "ereport" src/* | less
> > and looked for ereport calls with errno in it. I found quite a few,
> > attached you will find a patch addressing that issue.
>
> Committed.

Great! Thanks!

> I found a couple of errors in your patch, but I think everything is
> addressed in the patch as committed.

While I understand most modifications I'm a little bit confused by
some parts. Have a look at for example this one:

+       *errstr = psprintf(_("failed to look up effective user id %ld: %s"),
+                          (long) user_id,
+                        errno ? strerror(errno) : _("user does not exist"));

Why is it safe here to use errno? It is possible that the _() function
changes errno, isn't it?

Best regards,

--
 Christian Kruse               http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()

From
Andres Freund
Date:
On 2014-01-30 08:32:20 +0100, Christian Kruse wrote:
> Hi Tom,
> 
> On 29/01/14 20:06, Tom Lane wrote:
> > Christian Kruse <christian@2ndquadrant.com> writes:
> > > Your reasoning sounds quite logical to me. Thus I did a
> > > grep -RA 3 "ereport" src/* | less
> > > and looked for ereport calls with errno in it. I found quite a few,
> > > attached you will find a patch addressing that issue.
> > 
> > Committed.
> 
> Great! Thanks!
> 
> > I found a couple of errors in your patch, but I think everything is
> > addressed in the patch as committed.
> 
> While I understand most modifications I'm a little bit confused by
> some parts. Have a look at for example this one:
> 
> +       *errstr = psprintf(_("failed to look up effective user id %ld: %s"),
> +                          (long) user_id,
> +                        errno ? strerror(errno) : _("user does not exist"));
> 
> Why is it safe here to use errno? It is possible that the _() function
> changes errno, isn't it?

But the evaluation order is strictly defined here, no? First the boolean
check for errno, then *either* strerror(errno), *or* the _().
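
The guarantee referred to here is the sequence point after the first operand of the conditional operator: the condition is fully evaluated first, and then exactly one of the two branches. A small sketch, assuming a hypothetical fallback_text() that stands in for the _() call and deliberately disturbs errno, shows that the untaken branch never runs. It says nothing about the order of the other arguments in an enclosing call.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

static int fallback_calls = 0;

/* Hypothetical stand-in for _("user does not exist"): counts its calls
 * and deliberately disturbs errno. */
static const char *fallback_text(void)
{
    fallback_calls++;
    errno = EINVAL;
    return "user does not exist";
}

/* The condition (errno) is evaluated before either branch, and only the
 * selected branch is evaluated at all. */
const char *describe_lookup_failure(void)
{
    const char *text = errno ? strerror(errno) : fallback_text();

    assert(text != NULL);
    return text;
}
```

When errno is nonzero on entry, fallback_text() is not called, so it cannot affect the errno value handed to strerror().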

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()

From
Christian Kruse
Date:
Hi,

On 30/01/14 10:15, Andres Freund wrote:
> > While I understand most modifications I'm a little bit confused by
> > some parts. Have a look at for example this one:
> >
> > +       *errstr = psprintf(_("failed to look up effective user id %ld: %s"),
> > +                          (long) user_id,
> > +                        errno ? strerror(errno) : _("user does not exist"));
> >
> > Why is it safe here to use errno? It is possible that the _() function
> > changes errno, isn't it?
>
> But the evaluation order is strictly defined here, no? First the boolean
> check for errno, then *either* strerror(errno), *or* the _().

Have a look at the psprintf() call: we first have a _("failed to look
up effective user id %ld: %s") as an argument, then we have a (long)
user_id and after that we have a ternary expression using errno. Isn't
it possible that the first _() changes errno?

Best regards,

--
 Christian Kruse               http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()

From
Tom Lane
Date:
Christian Kruse <christian@2ndquadrant.com> writes:
> Have a look at the psprintf() call: we first have a _("failed to look
> up effective user id %ld: %s") as an argument, then we have a (long)
> user_id and after that we have a ternary expression using errno. Isn't
> it possible that the first _() changes errno?

While I haven't actually read the gettext docs, I'm pretty sure that
gettext() is defined to preserve errno.  It's supposed to be something
that you can drop into existing printf's without thinking, and if
it mangled errno that would certainly not be the case.

If this isn't true, we've got probably hundreds of places that would
need fixing, most of them of the form printf(_(...), strerror(errno)).
        regards, tom lane



Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()

From
Christian Kruse
Date:
Hi,

On 30/01/14 10:01, Tom Lane wrote:
> While I haven't actually read the gettext docs, I'm pretty sure that
> gettext() is defined to preserve errno.  It's supposed to be something
> that you can drop into existing printf's without thinking, and if
> it mangled errno that would certainly not be the case.

Thanks for your explanation. I verified reading the man page and it
explicitly says:

ERRORS
       errno is not modified.


Best regards,

--
 Christian Kruse               http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()

From
Alvaro Herrera
Date:
Tom Lane wrote:
> Christian Kruse <christian@2ndquadrant.com> writes:
> > Have a look at the psprintf() call: we first have a _("failed to look
> > up effective user id %ld: %s") as an argument, then we have a (long)
> > user_id and after that we have a ternary expression using errno. Isn't
> > it possible that the first _() changes errno?
> 
> While I haven't actually read the gettext docs, I'm pretty sure that
> gettext() is defined to preserve errno.  It's supposed to be something
> that you can drop into existing printf's without thinking, and if
> it mangled errno that would certainly not be the case.

It specifically says:

ERRORS
       errno is not modified.


-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services