[lmi] Converting numbers in mortality tables to and from text


From: Vadim Zeitlin
Subject: [lmi] Converting numbers in mortality tables to and from text
Date: Fri, 18 Mar 2016 01:56:00 +0100

 Hello,

 While doing final tests before submitting the new table_tool patch I ran
into a problem which seems relatively minor to me, but which I'd still like
to discuss here because I have a suspicion that Greg might have a different
opinion about it.

 In short, the problem arises in the tests checking that round-tripping a
binary table through the text form and back to binary produces the same
values: for some entries, the numbers before and after this round trip do
in fact differ.

 The reason I don't think it's very important is that the difference is
never more than 1 ULP (unit-in-the-last-place) and so can only appear in
the 15th or possibly 16th digit while the numbers used in the tables have
at most 9 significant digits, so it doesn't seem like this difference can
ever change anything in practice.
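
 To make the "1 ULP" claim concrete, here is a minimal sketch. It assumes
IEEE-754 binary64 doubles, and the helper names ulp_distance and
nine_digits are made up for this illustration, they are not part of
table_tool. It counts how many representable doubles lie between two
positive values and formats a value with 9 significant digits:

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <string>

static_assert(sizeof(double) == sizeof(std::int64_t), "binary64 assumed");

// Number of representable doubles separating a and b (0 if they are equal).
// Sketch only: valid for finite, positive values, whose bit patterns are
// ordered in the same way as the values themselves.
std::int64_t ulp_distance(double a, double b)
{
    assert(std::isfinite(a) && std::isfinite(b) && a > 0 && b > 0);
    std::int64_t ia, ib;
    std::memcpy(&ia, &a, sizeof a);
    std::memcpy(&ib, &b, sizeof b);
    return ia > ib ? ia - ib : ib - ia;
}

// Text form of x with 9 significant digits (the most the tables use).
std::string nine_digits(double x)
{
    char buf[32];
    std::snprintf(buf, sizeof buf, "%.9g", x);
    return buf;
}
```

 For a value x and its neighbour std::nextafter(x, 1.0), ulp_distance
returns 1 while nine_digits returns the same string for both, which is why
the difference is invisible at the precision used by the tables.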

 However, even so, I'd, of course, prefer to obtain exactly the same
numbers after the round trip. The problem is that I just don't know how to
do it because it seems that at least the tables I have contain numbers not
corresponding to their textual values. To take an example, table number
214 in the qx_ins database ("1977-83 Malaysia") contains the value
"0.00153" for age 20. The closest representation of this number as an
IEEE-754 double precision number is, using the C99 floating point number
notation, 0x1.91148fd9fd36fp-10. Alternatively, its IEEE-754 bit pattern is
3f591148fd9fd36f (exponent -10, mantissa 91148fd9fd36f). However, the binary
table actually contains a number differing from this by 1 ULP, namely
0x1.91148fd9fd370p-10 or 3f591148fd9fd370: the last bit is wrong. Of
course, this number still gives "0.00153" when converted to text with 5
digits of precision, but we're not going to get it back when parsing
"0.00153" -- we'll always convert it to the (correct) 0x1.91148fd9fd36fp-10
instead. So we can compare the textual representations of the tables, but
comparing their binary values fails.
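
 The mismatch can be reproduced with a small sketch along the following
lines. It assumes IEEE-754 binary64 doubles and a correctly rounding
std::strtod in the default "C" locale; the helper names are hypothetical
and this is not the table_tool code itself:

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <string>

static_assert(sizeof(double) == sizeof(std::uint64_t), "binary64 assumed");

// Raw IEEE-754 bit pattern of a double.
std::uint64_t bits_of(double x)
{
    std::uint64_t u;
    std::memcpy(&u, &x, sizeof u);
    return u;
}

// Double having the given raw bit pattern.
double double_from_bits(std::uint64_t u)
{
    double x;
    std::memcpy(&x, &u, sizeof x);
    return x;
}

// Fixed-point text form with the given number of decimals.
std::string to_text(double x, int decimals)
{
    assert(0 <= decimals && decimals <= 17);
    char buf[64];
    std::snprintf(buf, sizeof buf, "%.*f", decimals, x);
    return buf;
}

// Parse the text form back to the nearest double.
double from_text(std::string const& s)
{
    return std::strtod(s.c_str(), nullptr);
}
```

 Starting from double_from_bits(0x3f591148fd9fd370), i.e. the value stored
in the table, to_text with 5 decimals gives "0.00153", but from_text of
that string has the bit pattern 3f591148fd9fd36f, so the texts compare
equal while the doubles do not.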

 And I don't see what we can do about it. Just adding 1 ULP to all numbers
is, of course, not a solution, because most of them are actually correct in
the existing tables, there are just a few (but not that few) exceptions. I
strongly suspect that the program originally used to create these files had
a (minor) bug in its text-to-floating-point conversion routines because
such bugs were pretty common, e.g. MSVC had them until its 2013 version[*].
Unfortunately, I have no idea what exactly this bug was and so have very
little chance of finding a way to reproduce it. I did try doing my own
naive implementation by just multiplying the fractional part by
std::pow(10, -number_digits) and, amazingly, this does give the "right
wrong" result for this particular number. However it gives different
results for other numbers... I could continue trying to find a way to
reproduce exactly the behaviour which was used by the program that created
the existing files, but I'm not very optimistic about being able to do it
and, of course, I don't really think it's worth it either.
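
 For reference, the naive conversion mentioned above could look roughly
like this. It is only a sketch: the function name and the choice of
passing the fractional digits as an integer are assumptions made for the
illustration, and the exact result depends on how accurate std::pow is in
the standard library used:

```cpp
#include <cassert>
#include <cmath>

// Naive text-to-double conversion of a pure fraction: the digits after the
// decimal point are taken as an integer and scaled by a power of ten.
// For example, "0.00153" is passed as fraction_digits = 153 and
// number_digits = 5.
//
// Both the computed power of ten and the multiplication round, so the
// result may differ in the last bit from what std::strtod() returns.
double naive_fraction(long fraction_digits, int number_digits)
{
    assert(fraction_digits >= 0 && number_digits >= 0);
    return fraction_digits * std::pow(10.0, -number_digits);
}
```

 Comparing naive_fraction(153, 5) with std::strtod("0.00153", nullptr)
shows whether a given standard library reproduces the last-bit difference
for this particular value.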


 Do you see some other way to avoid this problem that I'm missing? Or are
we ready to live with the current behaviour? Notice that the program will
round trip the files created by itself correctly (barring bugs in the
standard library of the compiler used to build it!), this "loss" of
precision only happens when using the existing files.

 Please let me know what you think and, most importantly, whether it's
worth spending more time on this or if we can ignore this problem.

 Thanks in advance,
VZ

[*] See 
https://randomascii.wordpress.com/2013/02/07/float-precision-revisited-nine-digit-float-portability/
    for a discussion of it if you're curious but, in short, MSVC got the
    last digit wrong for exactly 4 single precision floating point numbers
    and differed from gcc in another ~7 million cases.
