Re: [bug-gawk] Interesting floating point behavior
From: Nelson H. F. Beebe
Subject: Re: [bug-gawk] Interesting floating point behavior
Date: Fri, 20 Jan 2012 10:20:20 -0700 (MST)
Robert Kennedy <address@hidden> reports puzzlement over the
awk computation "a=$1; b=a*10000; c=b%100" that produces these values:
>> 0.69 6900 100
Here is what is happening:
In 128-bit binary IEEE 754 arithmetic:
hoc128> a = 0.69
hoc128> b = a * 10000
hoc128> c = b % 100
hoc128> println a, b, c
0.69 6_899.999_999_999_999_999_999_999_999_999_998_42 99
In 128-bit decimal IEEE 754 arithmetic:
hocd128> a = 0.69
hocd128> b = a * 10000
hocd128> c = b % 100
hocd128> println a, b, c
0.69 6_900 0
Because most decimal fractions, like 0.69, are not exactly
representable in binary arithmetic, you often see the string-of-9s
phenomenon when you do the inexact round-trip decimal -> binary ->
decimal.
This has nothing to do with gawk: it is a fact of life that arises
from inexact base conversion.
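For readers without hoc at hand, the same contrast can be sketched in
Python. This is an illustration only, not part of the original report:
it assumes Python's float is a 64-bit IEEE 754 binary double (true on
common platforms) and uses the standard decimal module as a software
stand-in for decimal arithmetic. The function names are made up for
the example.

```python
from decimal import Decimal


def binary_experiment(text):
    """Repeat a=$1; b=a*10000; c=b%100 in binary floating point."""
    a = float(text)
    b = a * 10000
    c = b % 100
    return a, b, c


def decimal_experiment(text):
    """Repeat the same three steps in decimal floating point."""
    a = Decimal(text)
    b = a * 10000
    c = b % 100
    return a, b, c


a, b, c = binary_experiment("0.69")
# Full precision: b falls just short of 6900, so c falls just short of 100.
print(repr(b), repr(c))
# awk's default output format is %.6g, which rounds the shortfall away.
print("%.6g %.6g %.6g" % (a, b, c))

a, b, c = decimal_experiment("0.69")
# In decimal, 0.69 is exact, so b is exactly 6900 and c is exactly 0.
print(a, b, c)
```

The second print reproduces the reported "0.69 6900 100": the
remainder is slightly below 100, and six-digit output formatting
rounds it up to 100.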
A famous example used to illustrate the need for decimal arithmetic is
sales tax computation: 5% on a purchase of $0.70. In decimal
arithmetic, the total is 1.05 * 0.70 = $0.735, and the tax man's
rounding says you owe $0.74.
In binary arithmetic, 0.70 is not exactly representable, no matter
what your precision is, and the computation produces
0.734_999_999_999_999_99, which rounds down to $0.73, cheating the tax
authorities of $0.01.
They DO care about this, and in most jurisdictions, such arithmetic
MUST be done in decimal.
While a difference of a penny is insignificant when you buy a Ferrari,
it can add up to millions of dollars annually in businesses that have
large numbers of small transactions, like telephone companies and
grocery stores.
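The sales tax example can be checked with a short Python sketch. It
assumes 64-bit binary doubles for float and uses the standard decimal
module for the decimal side. ROUND_HALF_UP is Python's name for
rounding ties away from zero, which for positive amounts matches the
tax man's rule described above.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")

# Binary: neither 1.05 nor 0.70 is exact, and the product lands
# slightly below 0.735.
binary_total = 1.05 * 0.70
print(Decimal(binary_total))      # exact value stored in the double
print("%.2f" % binary_total)      # rounded to cents

# Decimal: both operands and the product are exact.
decimal_total = Decimal("1.05") * Decimal("0.70")
decimal_cents = decimal_total.quantize(CENT, rounding=ROUND_HALF_UP)
print(decimal_total, decimal_cents)
```

Converting the double to Decimal shows the value it actually stores,
which is what the rounding step sees. Default float printing in
Python would show the shortest string that round-trips instead.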
IEEE 754-2008, the revision of IEEE 754-1985, includes decimal
arithmetic and additional rounding rules demanded by tax laws (e.g.,
round-ties-upward: 0.735 -> 0.74). So far, only IBM z-Series and IBM
PowerPC 7 fully support the 2008 standard.
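The choice of tie-breaking rule matters only for exact halves, which
decimal arithmetic can represent. As a small sketch using Python's
decimal module (an assumption for illustration, not the IBM hardware
mentioned above), compare the common round-half-even rule with a
ties-upward rule. The helper name round_to_cent is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

CENT = Decimal("0.01")


def round_to_cent(amount, rounding):
    """Round a decimal string to two places under the given tie rule."""
    return Decimal(amount).quantize(CENT, rounding=rounding)


for amount in ("0.725", "0.735"):
    half_even = round_to_cent(amount, ROUND_HALF_EVEN)
    half_up = round_to_cent(amount, ROUND_HALF_UP)
    print(amount, half_even, half_up)
```

For 0.735 both rules give 0.74, but for 0.725 round-half-even gives
0.72 while ties-upward gives 0.73, so which rule applies has to be
selectable.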
I have versions of mawk and nawk that use decimal arithmetic instead
of binary arithmetic: for them, Robert's experiment produces this
output:
echo -e "Input\t*10000\t%100"; \
for i in 0.67 0.68 0.69 0.70; do \
echo $i | dmawk '{a=$1; b=a*10000; c=b%100; print a,"\t",b,"\t",c}'; \
done
Input *10000 %100
0.67 6700 0
0.68 6800 0
0.69 6900 0
0.70 7000 0
They are built with the 128-bit decimal format, which supplies exactly
34 decimal digits. Here is a computation of the machine epsilon,
which should be 10**(-34 + 1):
% dmawk -f macheps.awk
machine epsilon = 1e-33 = 10**-33
machine epsilon = 1e-33 = 10**-33
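The contents of macheps.awk are not shown here. As a rough sketch of
one common way to find the machine epsilon, the loop below keeps
dividing a candidate by 10 for as long as the next smaller power of
ten still changes 1 when added to it. It uses Python's decimal module
with the precision set to 34 digits to mimic the 128-bit decimal
format; the function name decimal_macheps is hypothetical.

```python
from decimal import Decimal, localcontext


def decimal_macheps(digits):
    """Smallest power of 10 that still changes 1 when added to it."""
    with localcontext() as ctx:
        ctx.prec = digits          # working precision in decimal digits
        one = Decimal(1)
        eps = Decimal(1)
        # Stop once the next smaller power of ten is rounded away.
        while one + eps / 10 != one:
            eps = eps / 10
        return eps


print("machine epsilon =", decimal_macheps(34))
```

With 34 digits the loop stops at 10**-33, in agreement with the
formula 10**(-34 + 1).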
Versions of gcc with support for decimal arithmetic, and binary
packages with hoc, dmawk, dnawk, and dlua are available here:
http://www.math.utah.edu/pub/mathcw/
My large book on that library is essentially done, with some minor
tweaks in progress before going to the publisher.
-------------------------------------------------------------------------------
- Nelson H. F. Beebe Tel: +1 801 581 5254 -
- University of Utah FAX: +1 801 581 4148 -
- Department of Mathematics, 110 LCB Internet e-mail: address@hidden -
- 155 S 1400 E RM 233 address@hidden address@hidden -
- Salt Lake City, UT 84112-0090, USA URL: http://www.math.utah.edu/~beebe/ -
-------------------------------------------------------------------------------