Re: [changeset] Asian Characters and strchr()

octave-maintainers

From:	Ben Abbott
Subject:	Re: [changeset] Asian Characters and strchr()
Date:	Wed, 11 Mar 2009 20:06:46 +0800


On Mar 11, 2009, at 5:30 PM, Jaroslav Hajek wrote:

On Wed, Mar 11, 2009 at 9:14 AM, Ben Abbott <address@hidden> wrote:
On Mar 11, 2009, at 4:03 PM, Jaroslav Hajek wrote:
On Wed, Mar 11, 2009 at 8:53 AM, Ben Abbott <address@hidden> wrote:
On Mar 11, 2009, at 3:33 PM, Jaroslav Hajek wrote:
On Wed, Mar 11, 2009 at 8:24 AM, Ben Abbott <address@hidden> wrote:
I noticed that fileparts gives an error when the full file name contains
Asian characters.

octave:209> fileparts ("System/Library/Fonts/junk.ttf")
error: subscript indices must be either positive integers or logicals.
error: called from:
error:
/Users/bpabbott/Development/mercurial/octave-3.1.53/scripts/strings/strchr.m
at line 40, column 19
error:
/Users/bpabbott/Development/mercurial/octave-3.1.53/scripts/miscellaneous/fileparts.m
at line 30, column 10
It appears that there is a simple fix for strchr, but it will depend upon
the ASCII equivalent for Asian fonts.

I'm seeing negative values.

fullfile = "System/Library/Fonts/junk.ttf";
octave:211> double(fullfile)
ans =

 Columns 1 through 16:

    83   121   115   116   101   109    47    76   105    98   114    97   114   121    47    70

 Columns 17 through 32:

   111   110   116   115    47   -27  -115  -114   -26  -106  -121   -25   -69  -122   -23   -69

 Columns 33 through 37:

  -111    46   116   116   102

Can anyone tell me what the permissible range for integer values of
Asian characters is?
I think a char->double conversion is supposed to yield nonnegative
values, so this seems buggy.
I'm planning to patch strchr, any reason I shouldn't do that?
I don't think there's a bug in strchr. This is clearly caused by the
negative values.
For Asian fonts the values are 16-bit ... unsigned or signed I don't know.
No, they're not. See your own example. Octave has no support for UTF-8
strings, so unless "char" is more than 8 bits, the result will be an
8-bit number. Thus, "strchr" won't search for Japanese characters (but
this does not matter here, since you only need to find an ASCII
character).
Currently, the sign of char -> double is left up to the compiler,
which I don't think is good. I think we should guarantee that to be
nonnegative, the same as Matlab does. Shall I make a patch, or do you
wish to do it?
OK, I hadn't considered how many bits Octave was using for characters.
In any event, please do make a patch (I'm not competent enough in C++ to do
it myself).
In the meantime, I'll avoid using fileparts and strchr when there may be
Asian characters present.

Ben
Fix is uploaded.

regards



It works as expected!

Thanks

Ben
