Re: [lmi] Numerics

From:	Vadim Zeitlin
Subject:	Re: [lmi] Numerics
Date:	Thu, 31 Mar 2016 23:49:45 +0200

On Thu, 31 Mar 2016 17:01:36 +0000 Greg Chicares <address@hidden> wrote:

GC> On 2016-03-28 22:08, Vadim Zeitlin wrote:
GC> > On Mon, 28 Mar 2016 21:49:38 +0000 Greg Chicares <address@hidden> wrote:
GC> > 
GC> > GC> (Here's the motivation. If a textcontrol is to accept an interest rate in
GC> > GC> a range like [0.03, 0.07], and treat input outside that range as invalid,
GC> > GC> then it would be naive to compare the input directly to either bound: if
GC> > GC> "0.07" is correctly entered and the machine translates that text string to
GC> > GC> 0.07000000000000001 (e.g.), and compares it to a bound represented as, say,
GC> > GC> 0.06999999999999999, then we shouldn't reject it, because it's the user's
GC> > GC> best attempt at specifying the upper-limit interest rate. Therefore, the
GC> > GC> bounds are "adjusted" outward by one ulp.)
GC> > 
GC> >  Unless I'm missing something, the naïve solution should work as long as
GC> > the values are rounded up to the appropriate number of digits. Wouldn't
GC> > this be simpler?
GC> 
GC> Which values would you round up--input values, or bounds?

 The inputs.

GC> If input: there is no explicit maximum number of digits.

 Surely we can impose some? It's not like double has unlimited precision
anyhow, even if it should, in practice, be enough for anything lmi does
AFAICS. So I think we could just round them to 10 or 12 digits -- and the
checks should still work.
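
 A minimal sketch of this rounding idea, under my own assumptions: the
helper names `to_fixed` and `in_range` are hypothetical (nothing here is
taken from lmi), 12 decimal digits is just one of the two figures mentioned
above, and the bounds are rounded the same way as the input so that both
sides of the comparison are exact integers.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: scale to 12 decimal places and round to the nearest
// integer. Only meaningful for magnitudes well below 9e6, so that the scaled
// value fits in 64 bits (fine for interest rates).
std::int64_t to_fixed(double x)
{
    constexpr double scale = 1e12;
    return std::llround(x * scale);
}

// Hypothetical range check: input and bounds are rounded identically, so a
// difference of one ulp on either side cannot change the outcome.
bool in_range(double input, double lower, double upper)
{
    std::int64_t const x = to_fixed(input);
    return to_fixed(lower) <= x && x <= to_fixed(upper);
}

// Usage: an input one ulp above 0.07 is accepted even against an upper
// bound one ulp below 0.07, while 0.0701 is still rejected.
void bounds_example()
{
    double const just_above = std::nextafter(0.07, 1.0);
    double const just_below = std::nextafter(0.07, 0.0);
    assert(just_below < just_above);
    assert(in_range(just_above, 0.03, just_below));
    assert(!in_range(0.0701, 0.03, 0.07));
}
```

 Note that rounding only the input would not be enough if a bound itself
were off by an ulp, which is why this sketch rounds both.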

GC> It would be nice if reading "0.07" from a file into the upper bound
GC> produced exactly the same value as reading "0.07" from a text control
GC> into an interest rate, but we have observed that it may not.

 If "0.07" is represented as text in both cases, I don't see how it can be
possible. If it's represented in binary form on disk, then I still think
the maximal difference between them is just one ULP and so would disappear
after rounding.

GC> I can't explain why that happened, but I did need to fix it. With this
GC> technique, that problem has not been observed.

 This (i.e. that everything works correctly currently) is indeed a strong
argument...


GC> > GC> More idealistically, we should probably use integral cents as our basic
GC> > GC> currency unit instead of floating-point dollars rounded to the closest
GC> > GC> approximation to integral hundredths, because in the real world we can
GC> > GC> have exactly seven cents, and (double)(0.07) is not exactly the same.
GC> > 
GC> >  Yes, currency amounts are a classic example of things not to use floating
GC> > point numbers for.
...
GC> > but it would be clearly better to avoid it if possible. Please let me
GC> > know if I should make an issue for this too.
GC> 
GC> If you can think of a tidy strategy for resolving it.

 I don't have any magic solution to the problem of determining which
doubles represent currency amounts and which don't, and I don't think one
exists.

GC> In principle, all we have to do is identify which variables, arguments, and
GC> return types are currency, and scale them all by 100, so that currency amounts
GC> could be represented as integral cents rather than fractional dollars. (Scale
GC> them internally only, that is: end users won't enter $12.34 as "1234", and
GC> customers won't accept "1234¢" on output.)

 Yes, of course.

GC> Glancing at only one header ('account_value.hpp'), I see many things like
GC> 
GC>   double InforceLivesEoy () const;
GC>   double  SepAcctPaymentAllocation;
GC> 
GC> where "double" is correct, and many others, e.g.,
GC> 
GC>   double GetSepAcctAssetsInforce () const;
GC>   void IncrementAVProportionally(double);
GC>   double  SepAcctValueAfterDeduction;
GC> 
GC> where "double" should really be a currency type.

 This is the problem, from my point of view: it's not clear to me which
ones are which (well, I guess the first one is relatively obvious as lives
are not supposed to be measured in cents, but all the rest are less so).

GC> It wouldn't be too hard to go through all the code and replace "double" with
GC> "currency" as appropriate.

 If you could do it, with currency just as a typedef for double, I could
then reimplement currency as a class and do the tests. But it's this "as
appropriate" part which is the most complicated and time-consuming IMHO.

GC> But I'm sure I'd make some mistakes. How could I reliably treat them
GC> all, so that I don't break the system?

 Other than counting on the tests to catch them, I don't see any way.

GC> Here's an idea for checking that the types are all correct: use the
GC> compiler as a type-enforcement tool. As an intermediate step (not
GC> for production release) we might replace the typedef with a UDT like:
GC> 
GC>   class currency
GC>   {
GC>     public:
GC>       currency();
GC> 
GC>       // Probably don't define these, to avoid implicit conversions
GC>       // currency(double);
GC>       // currency& operator=(double);
GC>       // operator double() const;
GC> 
GC>       currency operator+(currency) const; // ...and other additive operations
GC>       currency operator*(double) const;   // ...and other multiplicative operations
GC> 
GC>     private:
GC>       double value_;
GC>   };
GC> 
GC> which would have (almost) the same semantics as double, but would be a distinct
GC> type with no implicit conversions. Probably most operations are additive:
GC>   currency = currency + currency - currency;
GC> and most of the rest are multiplicative:
GC>   currency = currency * double / double;
GC> 
GC> With such a framework, we could even write iostream inserters and extractors
GC> that scale by 100, and run available regression tests to make sure everything
GC> is correct.

 Yes, definitely.

GC> Then, for production, we'd push the scaling into the code outside this
GC> "currency" class, and then either go back to

 Why should we go back to anything? IMO currency should be represented by a
class. The only thing I'd change would be the type of "value_" inside the
currency class, which would become int64_t[*] in the final version.
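
 As a rough sketch of what that final version might look like, assuming
integral cents stored in an int64_t: the names `from_cents` and `cents` are
hypothetical, only the additive operations are shown, and multiplication by
a double is deliberately left out because it needs a rounding rule that is
not decided here.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Sketch only: a currency type holding integral cents.
class currency
{
  public:
    currency() = default;

    // Named factory rather than a converting constructor, so that no
    // number can silently become a currency amount.
    static currency from_cents(std::int64_t c)
    {
        currency z;
        z.cents_ = c;
        return z;
    }

    std::int64_t cents() const { return cents_; }

    currency& operator+=(currency z) { cents_ += z.cents_; return *this; }
    currency& operator-=(currency z) { cents_ -= z.cents_; return *this; }

    friend currency operator+(currency a, currency b) { return a += b; }
    friend currency operator-(currency a, currency b) { return a -= b; }
    friend bool operator==(currency a, currency b) { return a.cents_ == b.cents_; }
    friend bool operator<(currency a, currency b) { return a.cents_ < b.cents_; }

  private:
    std::int64_t cents_ = 0;
};

// No implicit conversions in either direction.
static_assert(!std::is_convertible<double, currency>::value, "");
static_assert(!std::is_convertible<currency, double>::value, "");

// Usage: sums of cents are exact, unlike sums of binary fractions.
void currency_example()
{
    assert(currency::from_cents(7) + currency::from_cents(3) == currency::from_cents(10));
    assert(0.1 + 0.2 != 0.3); // the classic double counterexample
}
```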

GC>   typedef double currency; // integers stored as double
GC> or use something else like
GC>   typedef int64_t currency;
GC> or even try both and see which runs faster. I think we ultimately want one
GC> of these builtin types for speed and simplicity.

 I'm pretty sure there will be no abstraction penalty for using the
currency class: any contemporary compiler is able to "look inside" such a
class, see that it's just an int and generate exactly the same code for
it as if it were just an int in the first place.
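
 One way to make this claim checkable, sketched under the assumption of a
mainstream ABI: the wrapper name `cents` and the functions `add_raw` and
`add_wrapped` are hypothetical. The static assertions only confirm that the
wrapper has the same size as the integer and is trivially copyable;
whether the generated code is really identical would still have to be
verified by comparing the optimized assembly of the two functions.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical minimal wrapper around a 64-bit integer.
class cents
{
  public:
    explicit constexpr cents(std::int64_t v) : v_(v) {}
    constexpr std::int64_t value() const { return v_; }
    friend constexpr cents operator+(cents a, cents b) { return cents(a.v_ + b.v_); }

  private:
    std::int64_t v_;
};

// The wrapper adds no storage and can be copied like a plain integer.
static_assert(sizeof(cents) == sizeof(std::int64_t), "no extra storage");
static_assert(std::is_trivially_copyable<cents>::value, "plain copy");
static_assert(std::is_standard_layout<cents>::value, "plain layout");

// Two functions whose optimized machine code can be compared side by side.
std::int64_t add_raw(std::int64_t a, std::int64_t b) { return a + b; }
cents add_wrapped(cents a, cents b) { return a + b; }

// Usage: both give the same result.
void wrapper_example()
{
    assert(add_wrapped(cents(700), cents(300)).value() == add_raw(700, 300));
}
```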

 As for simplicity, the only "complexity" this class brings is the absence
of implicit conversions, but IMO the extra safety largely compensates for
this.

GC> We really mustn't make lmi run any slower. Perhaps we could use
GC> expression templates to keep currency as a class without a speed
GC> penalty, but that wouldn't be simple.

 I don't see any need for expression templates here: there is no need to
delay the computation, as there are no extra memory allocations involved and
all intermediate results are just ints (or doubles).

GC> Or maybe this idea of temporarily introducing a "currency" class is more
GC> trouble than it's worth?

 No, I'm quite convinced that having a currency class is a good thing in
its own right: even though using double for currency amounts is wrong,
using raw ints is not right either, and I never considered doing it like
this, sorry for being unclear. Details may vary (e.g. which operations to
provide and so on), but having a wrapper class seems like the obviously
correct thing to do to me.

 I may be becoming impossibly conservative and fundamentalist in my old
age, but I see almost any use of primitive types as a code smell. The worst
offenders are usually booleans (which should almost invariably be replaced
with enums), but things are not much better with strings (what possible
purpose could there be in using the same type for representing e.g. user
names and passwords, other than to allow mixing them up?) and numbers.
Using separate types for different entities doesn't cost much but makes
entire classes of errors completely impossible.
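
 To illustrate with the examples above, here is a small sketch in which
every name (`strong_string`, `user_name`, `password`, `remember_me`,
`describe_login`) is hypothetical. Swapping the two string arguments, or
passing a bare `true`, simply does not compile.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical tag-based wrapper: same representation, distinct types.
template<typename Tag>
class strong_string
{
  public:
    explicit strong_string(std::string s) : value_(std::move(s)) {}
    std::string const& get() const { return value_; }

  private:
    std::string value_;
};

struct user_name_tag {};
struct password_tag {};
using user_name = strong_string<user_name_tag>;
using password  = strong_string<password_tag>;

// An enum instead of a bool makes the call site self-describing.
enum class remember_me { no, yes };

// The two string types cannot be mixed up or converted into each other.
static_assert(!std::is_same<user_name, password>::value, "");
static_assert(!std::is_convertible<password, user_name>::value, "");
static_assert(!std::is_convertible<bool, remember_me>::value, "");

// Hypothetical function; the password is accepted but never printed.
std::string describe_login(user_name const& u, password const&, remember_me r)
{
    return u.get() + (r == remember_me::yes ? " (remembered)" : "");
}

// Usage: the argument order is enforced by the compiler.
void strong_types_example()
{
    std::string const s =
        describe_login(user_name("alice"), password("s3cret"), remember_me::yes);
    assert(s == "alice (remembered)");
}
```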

 Regards,
VZ

[*] I won't bring unsigned into this discussion...
