Re: [lmi] Numerics


From:	Greg Chicares
Subject:	Re: [lmi] Numerics
Date:	Mon, 9 May 2016 04:04:10 +0000
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Icedove/38.6.0
On 2016-05-08 22:59, Vadim Zeitlin wrote:
[...moving the big questions to the top (terse answers elaborated below)...]
> 0. Whether you're sure we don't need ctor from dollars+cents.

Yes, I am sure we don't need that.

> 1. What would you prefer to call units/subunits.

Dollars and cents.

> 2. What API for rounding would you like to see.

No rounding API at all.

> On Sun, 8 May 2016 00:57:07 +0000 Greg Chicares <address@hidden> wrote:

[...ctor that takes dollars and cents as distinct arguments:
    // Constructor from a positive number of units and subunits. The subunits
    // argument must be normalized i.e. positive and strictly less than
    // subunits_per_unit.
    currency(amount_type units, int subunits)
...]

> GC> I think that ctor should be private because it is only an implementation
> GC> detail of operator>>(). I can't imagine calling it elsewhere. In practice
> GC> we always have a scalar amount-of-money, never a pair of {dollars, cents}.
> 
>  When hesitating about whether this ctor should exist or not, I thought
> about a GUI control for entering dollars and cents separately (e.g. a
> "masked text edit" with a fixed point before the last two positions). But I
> admit I don't know if it's really a good reason to provide it.

A masked edit control makes sense for mixed-base systems, but aside from
calendar dates, who uses them any more? The label on the turkey breast I
cooked for today's dinner said 6.30 pounds, not 6 pounds 5 ounces: even
though we haven't yet adopted the kilogram as a unit, we're moving toward
decimal weights. Well, Americans still measure their heights in feet and
inches, and separate fields make sense there: "[__] ft [__] in"; but those
units aren't decimal (you wouldn't give your height as two meters and five
centimeters--you'd say 2.05m, right?) But for decimal measurements in
general, and currency in particular, I've never seen separate fields for
units and subunits in any GUI.

(The masks for telephone numbers, GUIDs, and US Social Security numbers are
just punctuation and don't suggest any hierarchy of units and subunits.)

I've never entered currency amounts on a telephone keypad, but...
  http://www.dslreports.com/forum/r10085604-Entering-dollars-and-cents-on-a-phone
...apparently either you enter amounts in cents, or you use the asterisk
as a decimal point.

When filing tax returns in the US...
  https://www.irs.gov/instructions/i8962/ar02.html
| the IRS electronic filing program provides for entries of dollars only
| ... If you file a paper return and do not round amounts to whole dollars,
| be sure to enter the decimal to separate dollars and cents.

I think this is some kind of financial software...
  https://community.intuit.com/questions/968741-how-do-i-enter-dollars-and-cents
| [question] how do i enter dollars AND cents
| [answer] 4.05 is four dollars and 5 cents
|   where specifically are you having a problem?
...from which I gather that typing the decimal point is seen as a
requirement that should be obvious to everyone.

For taxes again, one state says:
  http://www.dor.state.nc.us/downloads/nc59.pdf
| Do not enter a decimal to separate the dollars and cents. e.g.
| 123300 will be understood to be $1,233.00
but another says:
  http://labor.hawaii.gov/hiosh/files/2013/01/hioshwc1instruct.pdf
| Enter a decimal point to separate dollars and cents.
while a third says:
  http://otr.cfo.dc.gov/sites/default/files/dc/sites/otr/publication/attachments/2012_d-40p_website_fill-in__101912.pdf
| Do not enter cents, enter dollars only.

Summarizing, real-world dollar-and-cents input methods seem to be:
  one field, integral units only, subunits forbidden; or
  one field, integral subunits; or
  one field consisting of: units [decimal-point, subunits];
but I find no evidence of masked $[9999].[99] input in the wild.
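
A minimal sketch of the third input style (one field: units, then an
optional decimal point and subunits), parsed straight into whole cents.
The function name parse_cents is hypothetical and is not part of lmi; the
sketch accepts unsigned amounts only and does no overflow checking.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper: parse "units[.subunits]" (e.g. "4.05") into cents.
// At most two digits may follow the decimal point; "4.5" is read as 4.50.
inline std::int64_t parse_cents(std::string const& s)
{
    auto is_digit = [](char c)
        {return 0 != std::isdigit(static_cast<unsigned char>(c));};

    std::size_t i = 0;
    std::int64_t dollars = 0;
    for(; i < s.size() && is_digit(s[i]); ++i)
        dollars = 10 * dollars + (s[i] - '0'); // No overflow check here.
    if(0 == i)
        throw std::invalid_argument("amount must start with a digit");
    if(i == s.size())
        return 100 * dollars; // Integral dollars only.
    if('.' != s[i])
        throw std::invalid_argument("unexpected character");

    ++i; // Skip the decimal point.
    std::size_t const n_digits = s.size() - i;
    if(n_digits < 1 || 2 < n_digits)
        throw std::invalid_argument("expected one or two digits of cents");
    std::int64_t cents = 0;
    for(; i < s.size(); ++i)
    {
        if(!is_digit(s[i]))
            throw std::invalid_argument("unexpected character");
        cents = 10 * cents + (s[i] - '0');
    }
    if(1 == n_digits)
        cents *= 10; // A single digit after the point counts tenths.
    return 100 * dollars + cents;
}
```

The second style (integral subunits) needs no parser beyond reading an
integer, and the first is the same with the result multiplied by 100.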

>  To be honest, my main motivation was probably the fact that I thought it
> was "natural" to want to create a currency amount from dollars and cents.
> If you think it isn't and if you don't think it's worth providing it for
> use from the GUI code in the future, I'm going to remove it -- but
> considering the time I already spent on writing/testing it, I'd appreciate
> a confirmation of this.

I'm sure lmi's end users would reject it. I present the evidence above in
the hope of convincing you that nobody would ever do this, at least not
in the US. Or anyplace else where gnumeric, open-office calc, or any
proprietary spreadsheet I've ever seen is used.

> GC> Can the "using" typedef and the "constexpr" things be made private?
> 
>  They could, but I think they should be public because they can be useful to
> this class's clients. The constants probably won't be that useful if the ctor
> from dollars+cents doesn't exist any more, but the underlying type still
> seems to be useful.

Are you arguing against premature encapsulation--preferring to keep these
implementation details public until the interface is settled and we can be
sure that we won't have to un-hide them in order to implement some function
we haven't thought of yet? If so, okay...but when the design is complete,
wouldn't we want to hide these details?

> GC> Oh, and, BTW, doesn't C++11 have static assertions that would let you
> GC> express this comment:
> GC>   static constexpr int subunits_per_unit = 100; // std::pow(10, subunits_digits)
> GC> in code?
> 
>  Actually in real C++11 I could have just written the commented out
> expression and it would be evaluated at compile-time and this does work
> with g++ 4.9. Unfortunately it doesn't with MSVS 2015, so I did this
> temporarily and I hope we can keep it just to make my life slightly easier,
> especially because I hope it could be removed with a next version of MSVS
> (and maybe even with the Update 2 which I don't have yet, but I could test
> with it if it's important). But even if you really object to it, I'd rather
> just remove "100" and keep MSVS workaround locally, but I don't think it's
> worth using static_assert here.

Okay. It's certainly not confusing. It's just that when I see a comment,
I try to replace it with an assertion. Alternatively, if the comment
doesn't say anything that isn't obvious, I tend to remove it. Renaming
will make it obvious:
  static constexpr int cents_per_dollar = 100;
and I think that line needs no comment (and replacing the literal "100"
with a call to a library function would make it less clear). But this is
just a triviality.
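
For reference, a sketch of how the original comment could be expressed as
a C++11 static assertion without relying on a constexpr std::pow(). The
names int_pow10 and cents_digits are assumptions made for this example
only.

```cpp
// C++11 restricts a constexpr function body to a single return statement,
// hence the recursion instead of a loop.
constexpr int int_pow10(int n)
{
    return (n <= 0) ? 1 : 10 * int_pow10(n - 1);
}

constexpr int cents_digits     = 2;   // Hypothetical name.
constexpr int cents_per_dollar = 100;

// The former comment, now checked by the compiler.
static_assert
    (cents_per_dollar == int_pow10(cents_digits)
    ,"cents_per_dollar must equal ten to the power cents_digits"
    );
```

Whether that is clearer than the bare literal is a matter of taste; the
assertion only earns its keep if cents_digits is used elsewhere.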

> GC> I think these functions should be private:
> GC>   amount_type units() const
> GC>   int subunits() const
> 
>  Again, for me these methods are useful for a GUI showing the amount in
> dollars and cents.

Well...I suppose they might be useful in check-writing software that needs
to form strings like "One hundred twenty-three dollars and forty-five cents".
But...no, in that contrived case, natural language must replace digits. Do
you see any concrete use case that is really plausible?

> GC>   amount_type total_subunits() const
> 
>  This one is admittedly mostly an implementation detail but it seems pretty
> harmless to have and it makes writing tests much simpler. Could we please
> keep it?

If you really want to keep it, could it be private (with the unit-test class
declared as a friend)?

> GC> At any rate, what about:
> GC>   static constexpr double cents_to_dollars = 1.0 / subunits_per_unit;
> GC>   ...
> GC>   double value() const {return subunits_ * cents_to_dollars;}
> GC> ? Then the compiler performs conversion to double implicitly, and it
> GC> uses a multiplication instead of a division, which might still be
> GC> faster even in the twenty-first century. (Or maybe it's the opposite
> GC> because 100.0 has few significant binary mantissa bits, while 0.01
> GC> has many; unit tests would inform us.)
> 
>  I am almost sure it's not going to make absolutely any difference and the
> compiler will do whatever it considers to be best, but I'll benchmark it
> just in case.

I'm almost sure I'll be surprised by the benchmark results. The strategy
you propose satisfies us both.

> GC> In this case [reformatted as a single line]
> GC>   currency() {}
> GC> is it not necessary to initialize 'subunits_' explicitly to zero?
> 
>  No, this member has a default initializer, see
> 
>       https://github.com/vadz/lmi/blob/currency-class/currency.hpp#L180

Thanks, I completely missed that.

> FWIW I think it's a good idea to always provide default initializers for
> all non-object fields in C++11, even if there is just a single ctor
> initializing them anyhow because it makes it impossible to leave a field
> uninitialized, e.g. by adding another ctor later.

And in C++11, as a corollary, is it a recognized "best practice" *not* to
re-specify them in any ctor that uses the same default value?
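
A sketch of the pattern in question, using a hypothetical class name and a
64-bit member in place of the real 'currency' class: the default member
initializer supplies the zero, and only a ctor that needs a different value
mentions the member at all.

```cpp
#include <cstdint>

class currency_sketch // Hypothetical stand-in for the class under review.
{
  public:
    // Relies on the default member initializer: cents_ is zero.
    currency_sketch() = default;

    // Overrides the default only because it has a different value to store.
    explicit currency_sketch(std::int64_t cents) : cents_ {cents} {}

    std::int64_t total_cents() const {return cents_;}

  private:
    std::int64_t cents_ = 0; // Default member initializer (C++11).
};
```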

> GC> Is there a reason to use std::trunc() to test overflow, but
> GC> std::round() for the conversion
> 
>  Yes: trunc() is used for limit checking to ensure we don't overflow the
> maximal amount_type value by multiplication (although I'm almost sure that
> this would still be true even with round(), but using trunc() is supposed
> to make this more obvious -- apparently not quite successfully), while
> round() is used to give the expected result when creating currency from
> non-integral-cents amounts.
> 
> GC> i.e., is there room for an error to slip through between the slightly
> GC> different semantics of those two library functions (depending on the
> GC> hardware rounding-direction bit)?
> 
>  Actually neither trunc() nor round() are supposed to depend on the
> hardware rounding mode in C++11. Whether it's the case in practice is
> another question, of course...

Yes, I forgot that round() ignores the hardware rounding bit (as trunc()
obviously must). However, let me re-ask my question without that misleading
parenthetical. Suppose we call from_value() with a 'double' argument D that
is fractionally greater than max_units(), and the fractional difference
is large enough to cause the result to be rounded up. Then
 - the overflow test passes: trunc(D) == max_units()
 - the rounded result may overflow: round(D) > trunc(D) == max_units()
Doesn't that example incorrectly elude the overflow test?
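
To make the concern concrete, here is a sketch with a deliberately tiny,
hypothetical limit (a maximum of 1000 dollars, i.e. 100000 cents) so that
the boundary is easy to see. The function names are assumptions and the
code is not taken from the patch; it only contrasts checking the truncated
argument with checking the rounded result.

```cpp
#include <cmath>
#include <cstdint>
#include <stdexcept>

// Hypothetical limit chosen for readability, not a real amount_type bound.
constexpr std::int64_t max_cents = 100000; // 1000 dollars.

// The questioned strategy: test the truncated dollar amount only.
inline bool passes_trunc_check(double d)
{
    return std::trunc(d) <= static_cast<double>(max_cents / 100);
}

// Alternative: round first, then compare the value actually stored.
// Written as !(x <= limit) so that NaN is rejected too.
inline std::int64_t to_cents_checked(double d)
{
    double const r = std::round(d * 100.0);
    if(!(std::fabs(r) <= static_cast<double>(max_cents)))
        throw std::overflow_error("amount out of range");
    return static_cast<std::int64_t>(r);
}
```

With d = 1000.006, passes_trunc_check() returns true although rounding
100000.6 yields 100001 cents, one more than the limit; to_cents_checked()
throws for the same argument.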

> GC> I'm not comfortable using the MinGW-W64 implementation of std::round()
> GC> because it doesn't pass the 'round_test.cpp' unit tests that I wrote for the
> GC> mingw.org implementation (which does pass).
> 
>  Oops, this is bad news and I was completely unaware of it. I hoped we
> could finally abandon workarounds for old compilers with C++11, it's
> disappointing to learn that we still can't do it.

I'd be glad to retire my old rounding code in favor of a standard
implementation that's correct.

> GC> At least until the MinGW-W64 std::round() implementation is validated
> GC> or corrected, I prefer to use lmi's 'round_to.hpp' facility.
> 
>  Which of the rounding functions defined there would you prefer to use
> here?

None, actually, because of the discussion immediately below. IOW, in
my stream-of-consciousness reply, I first objected to a faulty
implementation of std::round(), but then I discovered a deeper
objection to even a correct implementation.

> GC> Is it actually a good idea for from_value(double d) to perform rounding?
> 
>  I thought about this and decided that this was the most reasonable choice
> because otherwise we would need to arbitrarily fix the "maximal allowed
> distance from the nearest integer" and I don't see any good way to do it.

My objection is that it is often the wrong choice, and that only the
caller can make the correct choice; and that forces us to select a
tolerance, arbitrary though that may be. Arbitrary is better than wrong;
throwing an exception is better than completing an incorrect calculation.

> GC> With frequency in (usually,always] its argument should be the closest
> GC> representable double <d> to the "true" value <D> that's really meant; thus:
> GC>   D = 1.23 (the real number 123/100)
> GC>   d = 1.229999999999996
> GC> or at least a very near double, say within a relative error of 1E-14
> 
>  Well, this is the problem: where does this value come from exactly? 1 ULP
> is not enough as more significant errors can easily accumulate and anything
> that is neither 0 nor 1 (nor e nor pi, but those don't seem appropriate
> here) is arbitrary.

Python makes an arbitrary choice:
  https://www.python.org/dev/peps/pep-0485/#defaults
and various APL implementations make their own:
  http://www.jsoftware.com/papers/satn23.htm
but they make this engineering decision thoughtfully.

Of course, we can make the tolerance an optional argument, as is done
in lmi's 'materially_equal.hpp' (and I've actually called it with
nondefault arguments in 'math_functors_test.cpp'):

inline bool materially_equal
    (long double t
    ,long double u
    ,long double tolerance = 1.0E-13L
    )

See the comments preceding that definition, which even cite TAoCP
(which gave me occasion to open that book twice in one nychthemeron,
only to find that Knuth doesn't discuss how to choose a value for
his ε, alas).
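
For illustration only, a sketch of a relative-tolerance comparison with the
same signature and default as the declaration quoted above. It is not lmi's
actual definition: the name nearly_equal and the body are assumptions, and
the test is similar in form to the relative test in PEP 485 without that
proposal's absolute-tolerance term.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical: true if t and u differ by no more than 'tolerance'
// relative to the larger of their magnitudes.
inline bool nearly_equal
    (long double t
    ,long double u
    ,long double tolerance = 1.0E-13L
    )
{
    if(t == u)
        return true; // Also covers the case where both are zero.
    long double const scale = std::max(std::fabs(t), std::fabs(u));
    return std::fabs(t - u) <= tolerance * scale;
}
```

With the default tolerance, 1.23 and 1.229999999999996 compare equal,
while 123.75446100681 and 123.75 do not.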

> GC> As a concrete example, accumulate $123.45 at an
> GC> interest rate specified to eight decimals:
> GC>   12345 cents * (1 + .0024662698) = 12375.446100681 = how many cents?
> 
>  For me the answer is "45" and I don't know what else it can be. Of course,
> I could be just very naïve.

I won't say you're necessarily naive, but you certainly aren't performing
calculations under section 7702 of the US Internal Revenue Code, where
the direction of rounding is crucial, and is not the same for all
calculations.
  http://www.nongnu.org/lmi/7702.html
| It is critical that the result [in one particular case] be rounded up
| if at all, and never rounded down or truncated. ... A §7702(f)(8) waiver
| granted in one actual case that was pennies over the limit cost tens of
| thousands of dollars in filing and attorney’s fees.

> GC> In actual practice, we specify whether that's to be rounded up, down, or
> GC> to nearest, and to how many decimals, so these outcomes seem desirable:
> GC>   factor u = 1 + .0024662698;
> GC>   currency x = 123.45; // Okay.
> GC>   x *= u;    // Error: rounding details not specified.
> GC>   x = x * u; // Error: rounding details not specified.
> GC>   x = round_to_currency_some_specific_way(x * u) // Okay.
> 
>  Notice that with the current class you can't do "x*u" at all (only
> multiplication by an integer is supported)

Ah, yes, I forgot that. The integers are closed under multiplication
(except for overflow, which should be trapped), so floating-point
"fuzziness" won't arise with this function because multiplicand and
multiplier are both integers.
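
A sketch of what trapping that overflow could look like for a 64-bit count
of cents. The function name is hypothetical, and the test uses only
divisions that cannot themselves overflow.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical: multiply two 64-bit integers, throwing instead of
// overflowing. Each branch compares against a quotient of the limit.
inline std::int64_t checked_multiply(std::int64_t a, std::int64_t b)
{
    constexpr std::int64_t hi = std::numeric_limits<std::int64_t>::max();
    constexpr std::int64_t lo = std::numeric_limits<std::int64_t>::min();

    if(0 == a || 0 == b)
        return 0;

    bool overflow = false;
    if(0 < a)
        overflow = (0 < b) ? (hi / b < a) : (b < lo / a);
    else
        overflow = (0 < b) ? (a < lo / b) : (b < hi / a);

    if(overflow)
        throw std::overflow_error("integer multiplication overflows");
    return a * b;
}
```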

The problem was misstated; I should have written:
  double u = 1 + .0024662698;
  currency x = 123.45;      // Okay.
  double z = u * x.value(); // Okay: 123.75446100681
  x = from_value(z);        // Error: rounding details not specified.

In one circumstance, I might need the floor 123.75; in another, I
might need the ceiling 123.76 . If from_value() chooses a rounding
direction for me, then the answer must be wrong in one of those
circumstances.

> so my from_value() is basically
> just your round_to_currency_some_specific_way() which always rounds to the
> nearest cent, away from zero in halfway cases.

Let me rewrite that to make my point clear:

  double z = u * x.value(); // As above: 123.75446100681
  currency x0 = from_value(z); // Error: rounding details not specified.
  currency x1 = from_value(round_up_to_cents(z)); // Okay.
  currency x2 = from_value(round_down_to_dollars(z)); // Okay.

Those are equivalent to:

  currency x0 = from_value(123.75446100681); // Error: which integer?
  currency x1 = from_value(123.76); // Okay. [see note]
  currency x2 = from_value(123.00); // Okay.

Note: "123.76" above is really something like
  123.75999999999997
but it's "near enough" to an integer (12376) with the decimal point shifted
two places to the left. The same cannot be said of 123.75446100681 .
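
A sketch of a conversion with that behavior, returning whole cents rather
than a 'currency' object to stay self-contained. The name cents_from_value,
the default tolerance, and the choice of exception are assumptions; here
std::round() only locates the candidate integer and does not decide a
rounding direction for the caller.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <stdexcept>

// Hypothetical: accept d only if 100*d is already "near enough" to an
// integer; otherwise refuse to guess. No range check in this sketch.
inline std::int64_t cents_from_value(double d, double tolerance = 1.0E-13)
{
    double const scaled  = d * 100.0;
    double const nearest = std::round(scaled);
    double const scale   = std::max(1.0, std::fabs(nearest));
    // Written as !(x <= y) so that NaN and infinity are rejected too.
    if(!(std::fabs(scaled - nearest) <= tolerance * scale))
        throw std::domain_error("not a whole number of cents");
    return static_cast<std::int64_t>(nearest);
}
```

Under these assumptions 123.76 and 123.00 are accepted as 12376 and 12300
cents, while 123.75446100681 throws.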

>  If we need other rounding modes, I think they should be provided by this
> class itself, e.g. from_value() could take rounding_style

It might be better to write supplemental rounding functions that take
double arguments and return currency.

>, although this is
> not without its own questions: should it have a default value?

No, never: "in the face of ambiguity, refuse the temptation to guess".

> What should
> happen in r_not_at_all case?

We could write:
  currency round_not_at_all(double d) {throw currency_error;}
IOW, the result of this rounding-that-is-no-rounding must be of type double,
and in general cannot be converted to currency.

In this case, we would have found a logic error in the program, which could
only be addressed by ferreting out and fixing the underlying error.
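
One possible shape for such supplemental functions, sketched with a
mandatory style argument and whole cents as the return type. The function
name and the enumerators other than r_not_at_all are assumptions for this
example. Naive ceil() and floor() on a scaled double can misround an
argument that is already meant to be an exact number of cents, so a real
implementation would need more care than this.

```cpp
#include <cmath>
#include <cstdint>
#include <stdexcept>

// Hypothetical enumeration; only r_not_at_all is named in this thread.
enum class rounding_style {r_upward, r_downward, r_to_nearest, r_not_at_all};

// The style parameter deliberately has no default value.
inline std::int64_t round_to_cents(double d, rounding_style style)
{
    double const scaled = d * 100.0;
    switch(style)
    {
        case rounding_style::r_upward:
            return static_cast<std::int64_t>(std::ceil(scaled));
        case rounding_style::r_downward:
            return static_cast<std::int64_t>(std::floor(scaled));
        case rounding_style::r_to_nearest:
            return static_cast<std::int64_t>(std::round(scaled));
        case rounding_style::r_not_at_all:
            break; // An unrounded double cannot become currency.
    }
    throw std::logic_error("unrounded value cannot be converted to currency");
}
```

For 123.75446100681 this gives 12376 cents upward and 12375 cents downward
or to nearest, and throws for r_not_at_all.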

> GC> Can I commit it as is, without bruising your older patches? If that would
> GC> require rebasing older patches, then I'd rather just work through all of
> GC> them in order, once we're finished revising group quotes.
> 
>  This can, of course, be committed as is, it's completely independent of
> all the rest, but it seems there are quite a few changes to be made (and at
> least one obvious bug with the negative amounts in from_value()), so
> perhaps it would be better to make them first? Of course, I could do them
> later too, so it doesn't really matter as long as we both agree on what we
> are doing. Please let me know what you would prefer

I'll wait a while.