[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Comments on gm2 floating-point output

From: Nelson H. F. Beebe
Subject: Comments on gm2 floating-point output
Date: Thu, 9 May 2024 17:04:50 -0600

I am writing to comment on what I view as erroneous and/or misleading
behavior of gm2's implementations of WriteXXX() functions for
SHORTREAL, REAL, and LONGREAL datatypes.  To keep this note shorter, I
have placed test Modula-2 programs, and their output produced on a
GNU/Linux x86_64 system, at this site:

The gm2 version used at that site is from a build of the first gcc-15
snapshot that appeared on 24-Apr-2024.  Since then, gcc-14.1.0 has
been officially released, the first such since August of last year.

I do not have access to the final version of ISO/IEC 10514-3:1998, the
international standard for Modula-2.  Instead, I have an earlier June
1994 draft from

However, any differences between the draft and final specifications
are unlikely to be relevant for my points.  I refer in particular to
the draft specification of WriteFloat on sequential page 532 (numbered
page 445, section where it says:

>> ...
>> The call WriteFloat(cid, real, sigFigs, width) shall write the value of
>> real to the output stream identified by cid in floating-point text
>> form, ... if the value of sigFigs is greater than zero, that number of
>> significant digits shall be included, otherwise an
>> implementation-defined number of significant digits shall be included.
>> ...

Two pages later, the specification of WriteReal() refers back to that
for WriteFloat() with the text

>> ...
>>  ... the call WriteReal(cid, real, width) shall behave as the call
>> WriteFixed(cid, real, place, width), with a value of place chosen
>> to fill exactly the remaining field. Otherwise the call shall behave
>> as the call WriteFloat(cid, real, sigFigs, width), with a value of
>> sigFigs of at least one, ...
>> ...

Here are the issues that I noted:


(1) The second nonblank output line for the three floating-point
formats looks like these:

        eps64 = 1.0842021724855044E-19 =  = 2**(-63)
        eps64 = 2.2204460492503131E-16 =  = 2**(-52)
        eps32 = 1.19209290E-7 = 0.000000 =  = 2**(-23)

In each case, the empty value between the two final = signs comes from
calls like this one in the REAL case:

        WriteReal (StdChans.StdOutChan(), eps, 0)

It seems MUCH more useful to instead display a suitable default number
of digits when sigFigs = 0, such that round trip conversion from
binary to decimal to binary recovers the original binary value inside
the computer.  This topic was studied by David Matula and Bennett
Goldberg in the 1960s, in papers that you can find here:

[they look almost identical on the screen, except that the first is
for BibTeX, and the second is for browsers, and has live hyperlinks].

ACM has made journal archives from their first 50 years available for
free download, so anyone should be able to get the cited papers via
the DOI values in their BibTeX entries.

Their findings are different from what most people then thought, and
clearly continue to think.  In particular, these two hoc functions
return the required digit counts:

        ### matula(nbits): return number of decimal digits needed to ensure
        ### correct round-trip conversion between binary and decimal
        func matula(nbits) { return ceil(nbits/log2(10) + 1) }

        ### goldberg(ndecdig): return number of bits needed to ensure correct
        ### round-trip conversion between binary and ndecdig decimal digits
        func goldberg(ndecdig) { return ceil(ndecdig/log10(2) + 1) }

For the four IEEE 754 binary formats, here are the Matula counts:

        format          decimal digits
        binary32                 9
        binary64                17
        binary80                21
        binary128               36

On x86_64 systems then, gm2 should use replacement sigFigs values of
9, 17, and 21 for the SHORTREAL, REAL, and LONGREAL types.  Those
values can also be caps on the counts produced when larger sigFigs
values are requested by user code.

gm2's Write*Real() functions produce 17, 37, and 45 digits for those
formats, after which, trailing zeros are supplied.

I'm working on getting access to a working gm2 for systems that use
the IEEE 754 binary128 format for LONGREAL, but cannot yet report
results for those platforms.


(2) The output line for eps32 above shows a result of 0.000000 from
WriteShortReal (eps, 0); that would be more sensibly displayed in
exponential form to show that it is nonzero.


(3) When sigFigs = 0, all three test programs produce rubbish output
from WriteFloat():

        sigFigs =  0   eps32 = 
        sigFigs =  0   eps64 = 
        sigFigs =  0   eps80 = 

Once again, the Matula defaults would be far more sensible here.


It is not only gm2 that has such issues. Below is a snippet (in LaTeX)
from a book that I'm writing about a new math library that I'm
developing.  It describes how widespread is the ignorance of the
Matula and Goldberg formulas published more then 55 years ago in
possibly the most widely read journal in the field of computer

>> ...
>> Regrettably, output conversions without precision specifications in
>> both \X{Fortran} and \X{C} default to 6 fraction digits for
>> floating-point output, whereas Matula mandates 9 for the 32-bit IEEE
>> format, 10 for older 36-bit formats with 27-bit fractions, and 17 for
>> the 64-bit IEEE format.  Such defaults are likely to be relied on by
>> many users, who do not understand that subsequent input of those
>> truncated values will \emph{not} reproduce the numbers that were in
>> the computer, because they lose 3 or more trailing bits.
>> More recent programming languages are little better.  For the IEEE 754
>> 32-bit and 64-bit binary formats, without specifying an output
>> precision, \X{COBOL} produces 8 and 16 decimal digits, \X{C\#} outputs
>> 7 and 15, and \X{Go}, \X{Java}, and \X{Rust} output 8 and 17.
>> Some languages only provide the 64-bit format: the \X{Awk}, \X{PHP},
>> and \X{Python} scripting languages all output 6 digits, 
>> \X{Algol 68 Genie}, \X{LibreOffice}, and \X{Perl} output 15 digits,
>> \X{Matlab} and \X{Octave} output 5 and 16 in \code{short} and \code{long} 
>> formats,
>> \X{GNU} \X{Pascal} outputs 16 digits, and
>> \X{JavaScript} and \X{MetaPost} produce the Matula-mandated 17 digits.
>> Given the age of some programming languages,%
>> \footnote{\X{Fortran} appeared first in 1956, \X{COBOL} in 1959,
>>           \X{Algol 68-R} and \X{Pascal} in 1970, \X{C} in 1972,
>>           and \X{C++} in 1979.}
>> and given that several of them are defined by international standards
>> written by presumed experts in the languages and their practical
>> implementation, we have considerable evidence that correct round-trip
>> base conversion has been generally ignored, and the chosen defaults
>> are simply \emph{wrong}.  This author's \pgm{hoc} implementations,
>> however, all follow the Goldberg%
>> \index{Goldberg, Bennett}
>> and Matula%
>> \index{Matula, David [American mathematician and computer scientist]}%
>> \index{United States}
>> formulas.
>> ...

- Nelson H. F. Beebe                    Tel: +1 801 581 5254                  -
- University of Utah                                                          -
- Department of Mathematics, 110 LCB    Internet e-mail:  -
- 155 S 1400 E RM 233              -
- Salt Lake City, UT 84112-0090, USA    URL: -

reply via email to

[Prev in Thread] Current Thread [Next in Thread]