From: Alan W. Irwin
Subject: Re: [help-octave] Re: utf8 does not appear to work for function documentation strings generated with texinfo
Date: Wed, 26 Mar 2014 15:14:45 -0700 (PDT)
User-agent: Alpine 2.02 (DEB 1266 2009-07-14)

On 2014-03-26 16:19-0400 Mike Miller wrote:

> [...] If you try
>
>   fwrite (fid, "The unicode character, ≥, is output\n", "schar");
>
> [...] instead, then the values are not limited and it should work (both
> work for me).

That works here too.  In fact, if I use the following patch

--- __makeinfo__.m_original     2014-03-26 13:56:42.741106684 -0700
+++ __makeinfo__.m      2014-03-26 13:56:19.005546479 -0700
@@ -120,7 +120,7 @@
     if (fid < 0)
       error ("__makeinfo__: could not create temporary file");
     endif
-    fwrite (fid, text);
+    fwrite (fid, text, "schar");
     fclose (fid);

     ## Take action depending on output type

then it solves the original issue I posted concerning utf8 help strings
for functions, and it does so without the suggested
"## @documentencoding UTF-8" line.  That is how I have written up the
bug report at http://savannah.gnu.org/bugs/index.php?41965.
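
One way to check what a given Octave build does with these bytes is to
round-trip them through a temporary file with each precision.  This is
only a sketch: the variable names are illustrative, the scratch file
comes from tempname, and the outcome may differ between Octave
versions, so no expected output is shown.

```octave
## UTF-8 bytes of U+2265 (the ">=" sign), given as explicit values so the
## result does not depend on the encoding of this source file.
expected = uint8 ([226 137 165]);
text = char (expected);

## "uchar" is the default precision used by a plain fwrite (fid, text).
precisions = {"uchar", "schar"};
preserved = false (1, numel (precisions));

for k = 1:numel (precisions)
  fname = tempname ();                       # illustrative scratch file
  fid = fopen (fname, "w");
  if (fid < 0)
    error ("could not create temporary file");
  endif
  fwrite (fid, text, precisions{k});
  fclose (fid);

  ## Read the raw bytes back without any conversion.
  fid = fopen (fname, "r");
  got = fread (fid, Inf, "uint8=>uint8")';
  fclose (fid);
  delete (fname);

  preserved(k) = isequal (got, expected);
  printf ("%s: bytes preserved = %d\n", precisions{k}, preserved(k));
endfor
```

If a precision reports 0, the bytes were altered on the way to the
file, which is the symptom described in this thread.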

> This may or may not be an acceptable workaround for the particular
> case of help strings, but internally fwrite (and other functions) may
> still effectively apply ASCII range limits to char matrices (strings)
> until Octave actually supports wide characters.

My understanding is that one of the principal points of utf8 is that
you don't need wide characters.  Instead, a utf8 string is represented
as an array of 8-bit bytes (terminated by a null character in C).  Of
course, if you convert utf8 to UCS4 (for example), then you will need
an array of wide characters to represent the latter.  That said, it
does appear from the success of "schar" for fwrite that Octave stores
utf8 strings internally as a vector of signed chars (as opposed to
unsigned chars).  If so, the only way to keep invalid conversions of
the utf8 string from happening is to also use the "schar" type with
fwrite, as in the above patch.
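
To make the signed versus unsigned distinction concrete, here is a
small sketch using the three UTF-8 bytes of U+2265.  It does not show
how Octave stores strings internally (that remains an inference from
the behaviour above); it only shows that the same bytes keep their bit
patterns when reinterpreted as signed values, but are lost when
converted by value with saturation.  The variable names are
illustrative.

```octave
## UTF-8 encoding of U+2265 as explicit byte values (0xE2 0x89 0xA5).
bytes = uint8 ([226 137 165]);

## Same bit patterns viewed as signed 8-bit values.
as_signed = typecast (bytes, "int8");        # -30 -119 -91

## Conversion by value saturates at the int8 maximum, losing the data.
by_value = int8 (bytes);                     # 127 127 127

## Reinterpreting the signed view recovers the original bytes.
round_trip = typecast (as_signed, "uint8");
disp (isequal (round_trip, bytes));          # 1
```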

Thanks very much for discovering that "schar" possibility, which I
think is likely the correct solution to the problem of allowing utf8
help strings to be correctly processed by __makeinfo__.

Alan
__________________________
Alan W. Irwin

Astronomical research affiliation with Department of Physics and Astronomy,
University of Victoria (astrowww.phys.uvic.ca).

Programming affiliations with the FreeEOS equation-of-state
implementation for stellar interiors (freeeos.sf.net); the Time
Ephemerides project (timeephem.sf.net); PLplot scientific plotting
software package (plplot.sf.net); the libLASi project
(unifont.org/lasi); the Loads of Linux Links project (loll.sf.net);
and the Linux Brochure Project (lbproject.sf.net).
__________________________

Linux-powered Science
__________________________