I am writing to comment on what I view as erroneous and/or misleading
behavior of gm2's implementations of WriteXXX() functions for
SHORTREAL, REAL, and LONGREAL datatypes. To keep this note shorter, I
have placed test Modula-2 programs, and their output produced on a
GNU/Linux x86_64 system, at this site:
http://www.math.utah.edu/~beebe/modula-2/
The gm2 version used at that site is from a build of the first gcc-15
snapshot that appeared on 24-Apr-2024. Since then, gcc-14.1.0 has
been officially released, the first such since August of last year.
I do not have access to the final version of ISO/IEC 10514-3:1998, the
international standard for Modula-2. Instead, I have an earlier June
1994 draft from
ftp://ftp.psg.com/pub/modula-2/wg13.dis.tar
However, any differences between the draft and final specifications
are unlikely to be relevant for my points. I refer in particular to
the draft specification of WriteFloat on sequential page 532 (numbered
page 445, section 9.2.2.3.3) where it says:
...
The call WriteFloat(cid, real, sigFigs, width) shall write the value of
real to the output stream identified by cid in floating-point text
form, ... if the value of sigFigs is greater than zero, that number of
significant digits shall be included, otherwise an
implementation-defined number of significant digits shall be included.
...
Two pages later, the specification of WriteReal() refers back to that
for WriteFloat() with the text
...
... the call WriteReal(cid, real, width) shall behave as the call
WriteFixed(cid, real, place, width), with a value of place chosen
to fill exactly the remaining field. Otherwise the call shall behave
as the call WriteFloat(cid, real, sigFigs, width), with a value of
sigFigs of at least one, ...
...
Here are the issues that I noted:
-----------------------------------------------------------------------
(1) The second nonblank output line for the three floating-point
formats looks like these:
eps64 = 1.0842021724855044E-19 = = 2**(-63)
eps64 = 2.2204460492503131E-16 = = 2**(-52)
eps32 = 1.19209290E-7 = 0.000000 = = 2**(-23)
In each case, the empty value between the two final = signs comes from
calls like this one in the REAL case:
WriteReal (StdChans.StdOutChan(), eps, 0)
It seems MUCH more useful to instead display a suitable default number
of digits when sigFigs = 0, such that round trip conversion from
binary to decimal to binary recovers the original binary value inside
the computer. This topic was studied by David Matula and Bennett
Goldberg in the 1960s, in papers that you can find here:
https://www.math.utah.edu/pub/tex/bib/cacm1960.bib
https://www.math.utah.edu/pub/tex/bib/cacm1960.html
[they look almost identical on the screen, except that the first is
for BibTeX, and the second is for browsers, and has live hyperlinks].
ACM has made journal archives from their first 50 years available for
free download, so anyone should be able to get the cited papers via
the DOI values in their BibTeX entries.
Their findings are different from what most people then thought, and
clearly continue to think. In particular, these two hoc functions
return the required digit counts:
### matula(nbits): return number of decimal digits needed to ensure
### correct round-trip conversion between binary and decimal
func matula(nbits) { return ceil(nbits/log2(10) + 1) }
### goldberg(ndecdig): return number of bits needed to ensure correct
### round-trip conversion between binary and ndecdig decimal digits
func goldberg(ndecdig) { return ceil(ndecdig/log10(2) + 1) }
For the four IEEE 754 binary formats, here are the Matula counts:
--------------------------------
format decimal digits
--------------------------------
binary32 9
binary64 17
binary80 21
binary128 36
--------------------------------
On x86_64 systems then, gm2 should use replacement sigFigs values of
9, 17, and 21 for the SHORTREAL, REAL, and LONGREAL types. Those
values can also be caps on the counts produced when larger sigFigs
values are requested by user code.
gm2's Write*Real() functions produce 17, 37, and 45 digits for those
formats, after which, trailing zeros are supplied.
I'm working on getting access to a working gm2 for systems that use
the IEEE 754 binary128 format for LONGREAL, but cannot yet report
results for those platforms.
------------------------------------------------------------------------
(2) The output line for eps32 above shows a result of 0.000000 from
WriteShortReal (eps, 0); that would be more sensibly displayed in
exponential form to show that it is nonzero.
------------------------------------------------------------------------
(3) When sigFigs = 0, all three test programs produce rubbish output
from WriteFloat():
sigFigs = 0 eps32 =
11920928955078125000000000000000000000000000000000000000000000000000000000000000000000000000000000000
sigFigs = 0 eps64 =
22204460492503130808472633361816406250000000000000000000000000000000000000000000000000000000000000000
sigFigs = 0 eps80 =
10842021724855044340074528008699417114257812500000000000000000000000000000000000000000000000000000000
Once again, the Matula defaults would be far more sensible here.
------------------------------------------------------------------------
It is not only gm2 that has such issues. Below is a snippet (in LaTeX)
from a book that I'm writing about a new math library that I'm
developing. It describes how widespread is the ignorance of the
Matula and Goldberg formulas published more then 55 years ago in
possibly the most widely read journal in the field of computer
science:
...
Regrettably, output conversions without precision specifications in
both \X{Fortran} and \X{C} default to 6 fraction digits for
floating-point output, whereas Matula mandates 9 for the 32-bit IEEE
format, 10 for older 36-bit formats with 27-bit fractions, and 17 for
the 64-bit IEEE format. Such defaults are likely to be relied on by
many users, who do not understand that subsequent input of those
truncated values will \emph{not} reproduce the numbers that were in
the computer, because they lose 3 or more trailing bits.
More recent programming languages are little better. For the IEEE 754
32-bit and 64-bit binary formats, without specifying an output
precision, \X{COBOL} produces 8 and 16 decimal digits, \X{C\#} outputs
7 and 15, and \X{Go}, \X{Java}, and \X{Rust} output 8 and 17.
Some languages only provide the 64-bit format: the \X{Awk}, \X{PHP},
and \X{Python} scripting languages all output 6 digits,
\X{Algol 68 Genie}, \X{LibreOffice}, and \X{Perl} output 15 digits,
\X{Matlab} and \X{Octave} output 5 and 16 in \code{short} and \code{long}
formats,
\X{GNU} \X{Pascal} outputs 16 digits, and
\X{JavaScript} and \X{MetaPost} produce the Matula-mandated 17 digits.
Given the age of some programming languages,%
\footnote{\X{Fortran} appeared first in 1956, \X{COBOL} in 1959,
\X{Algol 68-R} and \X{Pascal} in 1970, \X{C} in 1972,
and \X{C++} in 1979.}
and given that several of them are defined by international standards
written by presumed experts in the languages and their practical
implementation, we have considerable evidence that correct round-trip
base conversion has been generally ignored, and the chosen defaults
are simply \emph{wrong}. This author's \pgm{hoc} implementations,
however, all follow the Goldberg%
\index{Goldberg, Bennett}
and Matula%
\index{Matula, David [American mathematician and computer scientist]}%
\index{United States}
formulas.
...
-------------------------------------------------------------------------------
- Nelson H. F. Beebe Tel: +1 801 581 5254 -
- University of Utah -
- Department of Mathematics, 110 LCB Internet e-mail: beebe@math.utah.edu -
- 155 S 1400 E RM 233 beebe@acm.org beebe@computer.org -
- Salt Lake City, UT 84112-0090, USA URL: http://www.math.utah.edu/~beebe/ -
-------------------------------------------------------------------------------