
From: Markus Mützel
Subject: [Octave-bug-tracker] [bug #52681] Bad reading for UTF-8 characters with fscanf()
Date: Sun, 17 Dec 2017 15:48:13 -0500 (EST)
User-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:58.0) Gecko/20100101 Firefox/58.0

Follow-up Comment #6, bug #52681 (project octave):

@andy: While using the "CHAR_MIN" and "CHAR_MAX" macros for the range check in
ov-re-mat.cc, ov-scalar.cc, and co. would probably solve the problem at hand,
it would also lead to "strange" effects:
Something like "double(char(200))" would be 200 on systems where plain char
runs from 0 to 255, but would be 0 on systems where the default char is
signed. Likewise, "double(char(-20))" would be 0 on some systems and 236 on
others.
But maybe I have missed your point.
Reading the answers to your question on Stack Overflow, I think Octave's
charMatrix and charNDArray should probably be based on Array<unsigned char>
instead of Array<char>. But others are probably more aware of the
implications.

@Dan: Interesting find that "textread" seems to do the job. It looks like it
would solve the OP's problem. Does that function use a different code path?
Even so, I think the underlying reason why the functions from the scanf
family fail should still be investigated.
"unicode2native" and "native2unicode" are new in Octave 4.3.0+ and not yet
available in 4.2.1. Do you have suggestions for how the documentation of
these functions could be improved?

    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?52681>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/



