lmi
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [lmi] Numerics


From: Vadim Zeitlin
Subject: Re: [lmi] Numerics
Date: Mon, 28 Mar 2016 00:41:42 +0200

On Thu, 24 Mar 2016 21:20:54 +0000 Greg Chicares <address@hidden> wrote:

GC> On 2016-03-18 00:56, Vadim Zeitlin wrote:
GC> [...]
GC> > [*] See 
https://randomascii.wordpress.com/2013/02/07/float-precision-revisited-nine-digit-float-portability/
GC> 
GC> And now the expression
GC>   0.07 * (1.0 + 1.0 * epsilon)
GC> doesn't seem to equal
GC>   0.07 × (1 + ε)
GC> with gcc-4.9.1 ; see:
GC>   http://lists.nongnu.org/archive/html/lmi-commits/2016-03/msg00008.html
GC> for a unit test that always worked with gcc-3.4.5, but failed with gcc-4.9.1
GC> until I substituted a literal for that expression. I doubt gcc is wrong.

 Yes, I think that gcc (4.9) is indeed correct and it's the test itself
that was wrong. I don't really understand where did the idea that
multiplying by "1.0 + epsilon" (I discard the "1.0*" before epsilon as it's
completely optimized away by the compiler anyhow) is supposed to give the
"next" floating point number come from, but it's clearly not true in
practice, at least when using x87. If you're curious, the (optimized)
disassembly for this operation looks like this:
---------------------------------- >8 --------------------------------------
fld    DWORD PTR ds:0x441c64    ; 0x3fcb8000000000000000
fstp   QWORD PTR [ebp-0x38]
fld    QWORD PTR [ebp-0x38]
fadd   DWORD PTR ds:0x441c28    ; += 0x3f800000
fmul   QWORD PTR ds:0x441c10    ; *= 0x3fb1eb851eb851ec -> 
0x3ffb8f5c28f5c28f68f6
fld    QWORD PTR [ebp-0x20]     ; 0x3ffb8f5c28f5c28f5800
fxch   st(1)
fucomi st,st(1)
jb     0x40b430 <tn_range_test::test()+128>
fld    QWORD PTR [ebp-0x18]     ; 0x3ffb8f5c28f5c28f6000
fucomip st,st(1)
---------------------------------- >8 --------------------------------------
where I added some comments with the actual values used. The important
thing is that the computed value, 0x3ffb8f5c28f5c28f68f6 is strictly
greater than 0.070000000000000001 which has "...6000" at the end, so the
comparison inside is_valid() fails.

 Now it is true that if both these values, which use "extended precision"
of x87, are converted to the standard "double precision", as it happens if
they're written to memory, they do become the same value due to the loss of
precision. And, speaking about Murphy mentioned in a neighboring thread, I
wasted more time than I would dare to admit on debugging why did I see the
test failure (prior to the changes of r6523) with the native builds but not
the cross-builds. It turned out, finally, that it was due to not using -O2
for the cross-builds, see r6537, and that without it the generated code
wasn't optimized [enough] and it did store the x87 register value in memory
and then reloaded it -- making the test pass. But, of course, this is not
guaranteed to happen and, in practice, doesn't when using -O2 or higher.

 For the same reason, using an intermediate (volatile, so that it's not
optimized away) variable could work too, i.e. the following patch makes the
test pass for me too and could replace r6523 if you prefer it:
---------------------------------- >8 --------------------------------------
diff --git a/tn_range_test.cpp b/tn_range_test.cpp
index 1447a72..115e5ab 100644
--- a/tn_range_test.cpp
+++ b/tn_range_test.cpp
@@ -435,13 +435,14 @@ void tn_range_test::test()
     BOOST_TEST( surd0.is_valid( 0.070000000000000001));
     BOOST_TEST(!surd0.is_valid( 0.0700000000000001  ));

-    BOOST_TEST( surd0.is_valid( 0.07 * (1.0 + 1.0 * epsilon)));
+    z = 0.07 * (1.0 + 1.0 * epsilon);
+    BOOST_TEST( surd0.is_valid(z));
     BOOST_TEST( surd0.is_valid( 0.07 / (1.0 + 1.0 * epsilon)));

     // If exactly four values are permissible, then exactly one of
     // these is permissible.
     BOOST_TEST
-        (   surd0.is_valid( 0.07 * (1.0 + 2.0 * epsilon))
+        (   surd0.is_valid(z)
         ^   surd0.is_valid( 0.07 / (1.0 + 2.0 * epsilon))
         );

---------------------------------- >8 --------------------------------------

 Finally notice that this also happens to work in 64 bit builds or if we
enabled SSE support for 32 bit builds (I made a patch for this many years
ago...) because XMM registers only use double, not extended, precision.


 But IMO, to find the next double, it would be better to rely on its
representation mandated by IEEE-754 and do the usual trick with casting to
int64_t and incrementing it and then casting back to double. This is
guaranteed to work, unlike multiplying by epsilon. Of course, hard coding
the values, as r6523 does, works too, but I'd get rid of the remaining uses
of epsilon too.

 Regards,
VZ

reply via email to

[Prev in Thread] Current Thread [Next in Thread]