octave-maintainers

Undefined behavior sanitizing with Clang


From: Philipp Kutin
Subject: Undefined behavior sanitizing with Clang
Date: Mon, 5 Aug 2013 18:11:48 +0200

Hi,


I built Octave using Clang from SVN and
"-fsanitize=undefined,address", both dynamic checkers for C and C++.
The first one, UBSan, instruments the generated code to catch
"miscellaneous" undefined behavior such as casts of overlarge FP
numbers to integers. A starting point for reading is [1]. Some
background info can be found in [2] and [3]. The AddressSanitizer [4]
is a bit different -- it catches stuff like out-of-bounds accesses
(like the one fixed with patch #8152) and traps immediately. In
contrast, UBSan only writes diagnostics to standard error, presumably
since the kind of UB it catches is "less evil" (but still UB!). I'm
enabling AddressSanitizer as well simply because I don't want to have
combinatorially many Octave builds lying around; this post is about
UBSan results.

After building the classdef branch (actually classdef-pk), I ran the
test suite using "make check" and redirected its standard output and
error streams to a file. That file, annotated with comments (in square
brackets) and symbolized where UBSan gave only addresses, is attached
to this mail. The file is best viewed in Emacs; some regexps for
highlighting are provided at the beginning.


As expected with any large software project, the sanitizer exposed a
couple of issues, both of the interesting and the rather boring
variety. Let's start with the not-so-interesting ones.

* scripts/io/textscan.m:
liboctinterp.so.1:0x1772cb99: runtime error: load of value 4294963199,
which is not a valid value for type 'std::_Ios_Fmtflags'
[
std::operator&=(std::_Ios_Fmtflags&, std::_Ios_Fmtflags)
/usr/lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/ios_base.h:98
4294963199 is 0xffffefff, std::_Ios_Fmtflags is an enum.
]
This one is in the category "mixing signed and unsigned integers when
the highest bit is set". I think it should be harmless, but I haven't
yet read C++'s conversion rules for basic types in depth.
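
For illustration only: 4294963199 (0xffffefff) is the bitwise complement
of the single bit 0x1000, stored in an enum whose enumerators do not
cover that pattern. A minimal sketch of a flag type for which such a
value is valid, assuming C++11 and a fixed underlying type; the names
Flags, flag_a, flag_b and clear_flag are hypothetical and this is not
libstdc++'s std::_Ios_Fmtflags:

```cpp
#include <cstdint>

// Hypothetical flag set. Because the underlying type is fixed, every
// std::uint32_t value is a valid value of Flags, including complements.
enum class Flags : std::uint32_t {
  none   = 0,
  flag_a = 1u << 2,
  flag_b = 1u << 12   // 0x1000
};

constexpr Flags operator~ (Flags a)
{
  return static_cast<Flags> (~static_cast<std::uint32_t> (a));
}

constexpr Flags operator& (Flags a, Flags b)
{
  return static_cast<Flags> (static_cast<std::uint32_t> (a)
                             & static_cast<std::uint32_t> (b));
}

constexpr Flags operator| (Flags a, Flags b)
{
  return static_cast<Flags> (static_cast<std::uint32_t> (a)
                             | static_cast<std::uint32_t> (b));
}

inline Flags& operator&= (Flags& a, Flags b)
{
  a = a & b;
  return a;
}

// Clear one bit from a flag set; the intermediate ~bit is 0xffffefff
// when bit is flag_b.
inline Flags clear_flag (Flags set, Flags bit)
{
  set &= ~bit;
  return set;
}
```

Whether the diagnostic in the standard library header is worth acting
on is a separate question; the sketch only shows why the complement is
unproblematic once the underlying type is fixed.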

Another one in randmtzig.c is similar:
* randmtzig.c:271:71: runtime error: left shift of 147 by 24 places
cannot be represented in type 'int'
[
            entropy[n++] = word[0]+(word[1]<<8)+(word[2]<<16)+(word[3]<<24);
]
In C99, left-shifting a signed value invokes UB if a set bit gets
shifted into or past the highest bit position (the shifted operand has
type 'int' here, as the diagnostic shows). However, I recall a GCC bug
report [5] where one of the devs asserted that GCC considers this
defined behavior (as long as the shift count is legal, of course),
yielding the "expected" result on the assumption that signed integers
are represented using two's complement. Fixing this should be easy --
just cast word[3] to uint32_t before the shift.
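
A minimal sketch of that cast, assuming the four inputs are unsigned
bytes (the element type of word is not shown above, and pack_le32 is a
hypothetical name, not a function in randmtzig.c):

```cpp
#include <cstdint>

// Assemble a 32-bit word from four bytes, least significant byte first.
// Each byte is widened to std::uint32_t before shifting, so the shift is
// done in unsigned arithmetic and a set top bit in word[3] is harmless.
inline std::uint32_t pack_le32 (const unsigned char word[4])
{
  return  static_cast<std::uint32_t> (word[0])
       | (static_cast<std::uint32_t> (word[1]) << 8)
       | (static_cast<std::uint32_t> (word[2]) << 16)
       | (static_cast<std::uint32_t> (word[3]) << 24);
}
```

The bitwise OR gives the same result as the additions in the original
line because the four shifted bytes occupy disjoint bits.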


Then there are prevalent messages about divisions by zero -- note that
this is undefined behavior in C99 (6.5.5#5) and C++11 (5.6#4)!
[At least in the drafts I'm using -- N843 and N3242, respectively.]
It already shows up with something as simple as "1/0" from the
prompt. This one is actually a bit tricky. As far as I can see, C99
(haven't looked at C++11) doesn't specify a method of dividing in the
expected "silent" way (1/0 -> inf, 0/0 -> nan, etc.). So it might seem
that implementing division necessitates resorting to assembly, but
curiously Intel's x86 instruction manual has a table for the FDIV
instruction that actually lists some combinations as producing traps!
I dunno...
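
One possible way out, offered as an assumption rather than a settled
reading of the standards: an implementation can declare that its
floating-point arithmetic follows IEC 60559 (IEEE 754), via Annex F in
C99 and std::numeric_limits<T>::is_iec559 in C++. Under those semantics
division by zero produces an infinity or NaN instead of being left
open. A sketch that makes the dependency explicit (ieee_divide is a
hypothetical name):

```cpp
#include <limits>

// Refuse to compile where doubles are not declared IEC 60559 conformant,
// since the "silent" results below are only promised in that case.
static_assert (std::numeric_limits<double>::is_iec559,
               "this sketch assumes IEC 60559 (IEEE 754) doubles");

// Plain division; with IEC 60559 semantics, x/0 yields +Inf or -Inf for
// nonzero x, and 0/0 yields NaN.
inline double ieee_divide (double x, double y)
{
  return x / y;
}
```

A sanitizer may still report the division as undefined per the core
language wording, so this documents the assumption rather than
silencing the diagnostic.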


Now about the more interesting ones.

Running "norm (single ([1e200, 1]))" gives
* liboctinterp.so.1:0xfc52e49: runtime error: value 1e+200 is outside
the range of representable values of type 'float'
and makes the test (expecting the norm to be 1e200) fail for me,
giving Inf instead.
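
For reference, a double-to-float narrowing can be written so that it
never performs an out-of-range conversion. This is a sketch only, under
the assumption that mapping overlarge magnitudes to infinity is the
desired behavior; to_float_saturating is a hypothetical helper, not
Octave code, and it is not claimed to make the failing test pass:

```cpp
#include <limits>

// Narrow a double to float. Magnitudes beyond the largest finite float
// are mapped to infinity explicitly instead of being passed to the cast.
inline float to_float_saturating (double x)
{
  const double float_max = std::numeric_limits<float>::max ();
  const float inf = std::numeric_limits<float>::infinity ();

  if (x > float_max)
    return inf;
  if (x < -float_max)
    return -inf;

  // Here x is within the finite float range or is NaN (both comparisons
  // above are false for NaN).
  return static_cast<float> (x);
}
```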

Then there's
liboctave.so.1:0x1000c320: runtime error: value -1 is outside the
range of representable values of type 'unsigned long'
[
regexp::replace(std::string const&, std::string const&)
/home/pk/dl/octave-classdef-clang/build/liboctave/../../liboctave/util/regexp.cc:611
          from = static_cast<size_t> (p->end () - 1) + 1;
]
Here, an apparently floating-point -1 is cast to size_t, which is
unsigned long on my system. (If the -1 were integral, it wouldn't be UB
per C++11 4.7#2.) I don't know which test is responsible for that one,
though.
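
A sketch of how the offset could be computed without converting a
negative floating-point value. It assumes the end position is held in a
double and that an end position below 1 should restart at offset 0,
which is what unsigned wraparound followed by + 1 would have produced;
both points are assumptions, and next_from is a hypothetical helper
rather than code from regexp.cc:

```cpp
#include <cstddef>

// Compute the next search offset from a match end stored as a double.
inline std::size_t next_from (double match_end)
{
  // Assumption: end positions below 1 (and NaN) map to offset 0.
  if (! (match_end >= 1.0))
    return 0;

  // match_end - 1.0 is non-negative here, so the cast is in range for
  // any realistic string length. Values beyond the range of std::size_t
  // are not handled by this sketch.
  return static_cast<std::size_t> (match_end - 1.0) + 1;
}
```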

In the same category is the following:
liboctave.so.1:0xe244b00: runtime error: value -8 is outside the range
of representable values of type 'unsigned int'
[
octave_rand::set_internal_state(ColumnVector const&)
/home/pk/dl/octave-classdef-clang/build/liboctave/../../liboctave/numeric/oct-rand.cc:675
    tmp[i] = static_cast<uint32_t> (s.elem (i));
% Can be had with
> rand("state", -8)
]
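
A range check before the cast would turn this into a reportable error.
The following is a sketch under the assumption that rejecting
out-of-range state values is acceptable; to_uint32_checked is a
hypothetical helper, not part of oct-rand.cc:

```cpp
#include <cstdint>

// Convert a double to std::uint32_t only if the truncated value fits.
// Returns false (leaving out untouched) for negative values of -1 or
// below, for values of 2^32 or above, and for NaN.
inline bool to_uint32_checked (double v, std::uint32_t& out)
{
  // The truncated value must lie in [0, 2^32 - 1]. NaN fails both
  // comparisons, so the negated test rejects it as well.
  if (! (v > -1.0 && v < 4294967296.0))
    return false;

  out = static_cast<std::uint32_t> (v);
  return true;
}
```

A caller could then raise an error for something like rand("state", -8)
instead of performing the conversion.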


Finally, and this one looks really rather dangerous,
* dim-vector.cc:101:9: runtime error: signed integer overflow: 3 *
-1094795586 cannot be represented in type 'int'

Unfortunately, I don't know where it comes from. At one point, "make
check" appears to hang and has to be Ctrl-C'd.
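
One observation that may help narrow this down: reinterpreted as an
unsigned 32-bit value, -1094795586 is 0xBEBEBEBE, a repeating byte
pattern that looks more like fill data than a real dimension.
AddressSanitizer is believed to fill freshly allocated heap memory with
the byte 0xbe by default (an assumption about its defaults, not
something the log confirms), which would point to a read of
uninitialized memory. The sketch below shows the bit pattern and an
overflow-checked multiply; bit_pattern and mul_checked are hypothetical
helpers, not code from dim-vector.cc:

```cpp
#include <cstdint>
#include <limits>

// View an int as its unsigned 32-bit pattern (conversion is modulo 2^32).
inline std::uint32_t bit_pattern (int v)
{
  return static_cast<std::uint32_t> (v);
}

// Multiply two ints, reporting overflow instead of invoking it.
// Assumes long long is wider than int, as on common platforms.
inline bool mul_checked (int a, int b, int& out)
{
  const long long wide = static_cast<long long> (a) * b;
  if (wide < std::numeric_limits<int>::min ()
      || wide > std::numeric_limits<int>::max ())
    return false;

  out = static_cast<int> (wide);
  return true;
}
```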


If this mail reads a bit like a dump without context, sorry for that
:). It's sort of a compound bug report that I hope will initiate some
discussion. Had I elaborated, the mail would have been even
more TL;DR. Feel free to ask for details.


Greetings,
Philipp


[1] http://blog.regehr.org/archives/963
[2] http://blog.regehr.org/archives/213
[3] http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html
[4] http://clang.llvm.org/docs/AddressSanitizer.html
[5] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54027

Attachment: MAKECHECK_CLANG_SAN.log
Description: Binary data

