octave-maintainers

Re: Undefined behavior sanitizing with Clang


From: Philipp Kutin
Subject: Re: Undefined behavior sanitizing with Clang
Date: Sat, 10 Aug 2013 13:53:39 +0200

On Wed, Aug 7, 2013 at 3:12 PM, Mike Miller <address@hidden> wrote:
>
> Not exactly. Try the following amended test program instead. Literals
> and variables are treated differently in the case of casting to an
> unsigned.
>
> ---
> #include <stdio.h>
> double num() { return -2.0; }
> int main() {
>   printf("%u\n", (unsigned)(-1.0));
>   printf("%u\n", (unsigned)(-2.0));
>   printf("%u\n", (unsigned)(num()));
> }
> ---

No. It's not that literals and variables are treated any differently,
it's that converting a double whose truncated value can't be
represented in the target integer type is *undefined behavior*, and as
a consequence, our sample programs are completely devoid of meaning
from the start. A compiler is not obliged to produce any particular
code for them at all.

What I wanted to show with my program is that the behavior I observed
is one possibility out of infinitely many, as far as the Standard is
concerned. Usually, the behavior of UB code varies with factors such
as the compiler used, or the options passed to the same compiler. It
was a bit unfortunate (for the point I was trying to make) that in my
case, all combinations gave the same result, but in fact, you
succeeded there: your line prints 4294967294 when compiled with -O0,
but 0 with -O2.

I really recommend reading the articles by John Regehr [1] and Chris
Lattner [2]. They shed light on why one cannot "rely" on UB code to
produce any particular result, why generalizing from one compiled
example to the general case is not valid, and why undefined behavior
will become more of a concern as compilers get better at
aggressively optimizing code. It's a bit surprising on first read,
but makes perfect sense when seen from the compiler's (or its
authors') point of view: after all, its main job is to produce the
fastest possible code, as fast as possible.

> If Matlab's rand('state',s) accepts negative numbers, then ours does
> as well. The best we can do is map any possible input values to the
> uint32_t array in a deterministic way.

I see. Then I guess the most logical way to proceed is to reduce each
element modulo 2^32, somewhat like this pseudo-code ("elem" is a
double):

if (isnan (elem))
    elem = <arbitrary-but-fixed>;
else
{
    elem = fmod(elem, 4294967296.0);  // UINT32_MAX+1.0
    // fmod() is the remainder of a division that rounds toward 0,
    // i.e. its sign follows the numerator.
    // We want a number in [0 .. UINT32_MAX+1.0) though.
    if (elem < 0)
        elem += UINT32_MAX+1.0;
}

Because currently, the behavior may only be called "deterministic" (in
big quotes) for a particular build, if the compiler was well-meaning
on that day.

[1] http://blog.regehr.org/archives/213
[2] http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html

--Philipp

