Single/Double precision NA values

octave-maintainers

From:	David Bateman
Subject:	Single/Double precision NA values
Date:	Mon, 28 Jul 2008 10:44:55 +0200
User-agent:	Thunderbird 2.0.0.12 (X11/20080306)

Dear All,

I got no response from the R maintainers to the attached e-mail that I
sent twice to their mailing list. I therefore assume that the R
maintainers don't really care too much about the compatibility of the NA
values between R and Octave, and that Octave must make its own choice
about how to handle this issue.

I previously sent a patch that addressed the issue of Single/Double
precision NA values

http://www.nabble.com/-Changeset--Re%3A-Single-Precision-versus-double-precision-NA-p17679409.html

that replaced the double NA value with one that can be converted from
double to single precision without change, but that is incompatible with
the value from R. The patch also included code such that saved files
that contained the old NA value would have this NA value changed
internally to the new representation.

John's comment on this was "What about always writing values in the old
format when writing binary data, and converting them when reading?".
Sure, we can do this, but there are a few caveats:

* Linking R and Octave libraries together in a single application
becomes complex as the Octave/R NA representation is different, though
is this really used that much?
* The conversion of the NA values will require either an additional copy
of the array to be written or a slow write function that checks every
value before writing to see if it's an NA value.
* Only the Octave/Matlab file formats will be affected by this change,
and so HDF5 file formats, for example, won't be.
* Files saved with fwrite won't undergo this conversion either.
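
To make the cost in the second caveat concrete, here is a rough sketch (in Python, purely for illustration; it is not Octave's actual I/O code) of a remapping pass over a buffer of doubles. The bit patterns are the ones quoted in the attached message; the function name remap_na and the little-endian layout are assumptions made for the example. Note that the pass inspects every value and builds a second buffer, which is the copy or per-value check described above.

```python
import struct

# Bit patterns quoted in the attached message (high word, then low word).
OLD_NA_BITS = (0x7FF00000 << 32) | 0x000007A2  # R-compatible double NA
NEW_NA_BITS = (0x7FF840F4 << 32) | 0x40000000  # proposed double NA


def remap_na(raw, src_bits, dst_bits):
    """Return a copy of `raw` with every 64-bit word equal to src_bits
    replaced by dst_bits.

    Hypothetical helper: `raw` is assumed to hold little-endian
    IEEE 754 doubles back to back.
    """
    if len(raw) % 8 != 0:
        raise ValueError("buffer length must be a multiple of 8 bytes")
    count = len(raw) // 8
    words = struct.unpack("<%dQ" % count, raw)
    # Every element is inspected, and a new buffer is allocated.
    remapped = [dst_bits if w == src_bits else w for w in words]
    return struct.pack("<%dQ" % count, *remapped)


def convert_on_read(raw):
    """Old (R-compatible) NA values become the new representation."""
    return remap_na(raw, OLD_NA_BITS, NEW_NA_BITS)


def convert_on_write(raw):
    """New NA values are written out in the old (R-compatible) form."""
    return remap_na(raw, NEW_NA_BITS, OLD_NA_BITS)
```

Comparing integer bit patterns rather than floating-point values matters here, since a NaN never compares equal to itself.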

So unless R reads the data from Octave using an Octave or Matlab file
format, the NA value won't be preserved in any case. Adding the
conversion of NA values for the HDF5 and fwrite functions as well might
be done, but I'm concerned by the memory usage and speed implications of
this. So should we go this way, or just accept that the R/Octave NA
values are different?

Regards
David

-- 
David Bateman                                address@hidden
Motorola Labs - Paris                        +33 1 69 35 48 04 (Ph) 
Parc Les Algorithmes, Commune de St Aubin    +33 6 72 01 06 33 (Mob) 
91193 Gif-Sur-Yvette FRANCE                  +33 1 69 35 77 01 (Fax) 

The information contained in this communication has been classified as: 

[x] General Business Information 
[ ] Motorola Internal Use Only 
[ ] Motorola Confidential Proprietary

--- Begin Message ---
Subject:	Re: Issue with NA value and Octave compatibility
Date:	Thu, 26 Jun 2008 17:18:07 +0200
User-agent:	Thunderbird 2.0.0.12 (X11/20080306)

ping

David Bateman wrote:
> Dear R developers,
>
> I'm an Octave developer in the process of implementing a single
> precision type in Octave and I have an issue with the NA value. The
> choice of NA value in Octave was made a few years back so that the high
> word of the NA value was 0x7ff00000 and the low word was 0x000007A2 for
> compatibility with R and to ease any possible issue with the exchange of
> data files between Octave and R.
>
> However, now that I'm in the process of implementing the single
> precision type, I have a problem with this choice for the NA value, as
> the above value, when cast to a float, loses the 0x7A2 payload,
> creating a positive Infinity in IEEE 754, and so conversion of the NA
> values between double and float with the above value does not work.
>
> I have several possible choices of how to treat this, but as the reason
> for the choice of Octave's NA value was made for compatibility with R,
> the choice I'll make might very well be determined by how the R
> developers react to any changes that Octave makes in this direction.
>
> I can't realistically wrap the double and float types in Octave and
> overload the assignment operators to handle the assignment of a float to
> a double and vice versa, as this would completely replace the underlying
> data types in Octave. It's also impossible to trap everywhere that a
> double might be assigned to a float and special-case NA values, as there
> are just too many places where that might occur.
>
> I'm therefore assuming that I have to replace Octave's internal
> representation of the NA value to allow easy conversion of the NA value
> between doubles and floats. This can be done by replacing the NA value
> with one that has zeros in the lower 29 bits of the mantissa of the
> double (the bits that a float's 23-bit mantissa cannot hold), so that
> the cast from a double to a float and vice versa works correctly. For
> example, using 0x7FF840F4 and 0x40000000 for the high and low words of
> the double NA value, and 0x7FC207A2 for the float NA value, works.
> However, I then have an issue with the exchange of NA values with R and
> with older versions of Octave.
>
> It's easy enough to check for old NA values in files when reading and
> alter them to the new NA values. So forward compatibility with older
> versions of Octave and from R to Octave would be ok. However, the
> reverse is not true. Actually, saving with the old NA value is
> equally possible and would allow full compatibility with older versions
> of Octave and with R. The downside is that there are many places I'd
> have to make a copy of the data when saving to allow this (for example
> saving to HDF files), and so I'd prefer not to have to do this if
> possible.
>
> As backwards compatibility is the smaller concern and self-correcting
> with time, if R were also to accept an additional possible NA value
> such as 0x7FF840F4/0x40000000, at least when loading files, then
> compatibility of the NA value between Octave and R could be maintained
> and I wouldn't have to pay the penalty of making a copy of the data to
> treat the NA values. Would the R developers be willing to make such a
> change in R? If not, I will maintain the R-compatible NA value in
> Octave's output and pay the performance penalty within Octave for this.
>
> In any case if R intends at some point to support a single precision
> type you will come across the same issue.
>
> Regards
> David
>
>   


--- End Message ---
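
The bit patterns discussed in the attached message can be checked with a small sketch. It is written in Python for illustration only and models the double-to-float narrowing as plain truncation of the low 29 mantissa bits, which is the behaviour the message describes; an actual C cast may treat NaN payloads differently on some platforms. The helper names narrow_bits and widen_bits are made up for this example.

```python
# Bit patterns quoted in the message (high word, then low word).
OLD_DOUBLE_NA = (0x7FF00000 << 32) | 0x000007A2  # R-compatible NA
NEW_DOUBLE_NA = (0x7FF840F4 << 32) | 0x40000000  # proposed double NA
NEW_FLOAT_NA = 0x7FC207A2                        # proposed float NA

FLOAT_POS_INF = 0x7F800000  # IEEE 754 single-precision +Infinity


def narrow_bits(bits64):
    """Model double -> float for a non-finite value by keeping the sign
    and the top 23 of the 52 mantissa bits (the low 29 are dropped)."""
    sign = (bits64 >> 63) & 0x1
    exponent = (bits64 >> 52) & 0x7FF
    mantissa = bits64 & ((1 << 52) - 1)
    if exponent != 0x7FF:
        raise ValueError("only Inf/NaN bit patterns are modelled here")
    return (sign << 31) | (0xFF << 23) | (mantissa >> 29)


def widen_bits(bits32):
    """Model float -> double for a non-finite value by padding the
    23-bit mantissa with 29 zero bits."""
    sign = (bits32 >> 31) & 0x1
    exponent = (bits32 >> 23) & 0xFF
    mantissa = bits32 & ((1 << 23) - 1)
    if exponent != 0xFF:
        raise ValueError("only Inf/NaN bit patterns are modelled here")
    return (sign << 63) | (0x7FF << 52) | (mantissa << 29)


if __name__ == "__main__":
    # The old NA keeps its payload only in the dropped bits: it becomes +Inf.
    print(hex(narrow_bits(OLD_DOUBLE_NA)))   # 0x7f800000
    # The proposed NA survives narrowing and widening unchanged.
    print(hex(narrow_bits(NEW_DOUBLE_NA)))   # 0x7fc207a2
    print(hex(widen_bits(NEW_FLOAT_NA)))     # 0x7ff840f440000000
```

Under this model the 0x7A2 payload of the old NA sits entirely in the low 29 bits, so nothing of it remains after narrowing, while the proposed value carries the same 0x7A2 payload in the low bits of the float mantissa.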
