help-smalltalk

Re: [Help-smalltalk] Re: [bug] UnicodeString conversion truncation


From: Robin Redeker
Subject: Re: [Help-smalltalk] Re: [bug] UnicodeString conversion truncation
Date: Mon, 22 Oct 2007 10:12:53 +0200
User-agent: Mutt/1.5.11+cvs20060403

On Sun, Oct 21, 2007 at 08:51:24AM -0700, Paolo Bonzini wrote:
> Issue status update for 
> http://smalltalk.gnu.org/project/issue/108
> Post a follow up: 
> http://smalltalk.gnu.org/project/comments/add/108
> 
> Project:      GNU Smalltalk
> Version:      <none>
> Component:    Base classes
> Category:     bug reports
> Priority:     normal
> Assigned to:  Unassigned
> Reported by:  elmex
> Updated by:   bonzinip
> -Status:       active
> +Status:       fixed
> Attachment:   http://smalltalk.gnu.org/files/issues/gst-iconv-more.patch 
> (3.71 KB)
> 
> You opened a can of half a dozen different off-by-one and similar bugs. 
> :-)  All fixed in the attached patch

I tested the patch and everything seems to work now.
But I've found this code in json.st which puzzled me a bit:

   String>>#jsonPrintOn: aStream
      (self anySatisfy: [ :ch | ch value between: 128 and: 255 ])
             ifTrue: [ self asUnicodeString jsonPrintOn: aStream ]
             ifFalse: [ super jsonPrintOn: aStream ]

Why are strings that contain non-ASCII characters printed differently?
And this in the string parsing code:

            c = $u
               ifTrue: [
                  c := (Integer readFrom: (stream next: 4) readStream radix: 16) asCharacter.
                  (c class == UnicodeCharacter and: [ str species == String ])
                     ifTrue: [ str := (UnicodeString new writeStream
                        nextPutAll: str contents; yourself) ] ].
         ].
      str nextPut: c.

Maybe I don't understand the Unicode implementation of GNU Smalltalk well enough.
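
To make the question concrete, here is a minimal sketch of what the escape
handling above computes for a single \uXXXX sequence. It reuses only the
messages that appear in the excerpt; the sample digits '0416' are an arbitrary
choice for illustration, and what gets printed depends on the gst version in use.

```smalltalk
"Sketch only: decode the four hex digits of a \u escape the same way
 the excerpt does. '0416' is an arbitrary sample, not taken from json.st."
| c |
c := (Integer readFrom: '0416' readStream radix: 16) asCharacter.

"The parser switches str to a UnicodeString stream only when this
 class is UnicodeCharacter, so print it to see which branch is taken."
c class printNl.

"The numeric code point that was parsed from the hex digits."
c value printNl.
```

If the printed class is UnicodeCharacter, the parser has to replace the String
stream with a UnicodeString stream before the nextPut:, which is what the
species check in the excerpt appears to be for.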

Would you object if I change the json code to operate on UnicodeStrings only?

Strictly and semantically, the JSON implementation should operate only on
UnicodeStrings, as JSON is only parseable as Unicode. (I wonder what happens
when the current JSON reader encounters a UTF-16 encoded String; as far as my
tests went, it just didn't work, because it doesn't expect multibyte encodings
in a String.)

What puzzles me is the question what JSONReader>>#nextJSONString should
return. Should it be a String or a UnicodeString?

If it returns a UnicodeString, no lookup with a string literal on a Dictionary
returned by the JSON parser will work, because the literal is only a String
object, which has a different hash function than UnicodeString.
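
The lookup problem can be sketched as follows. This is only an illustration
under assumptions: the key 'name' and the value 42 are made up, the Dictionary
stands in for one returned by the JSON parser, and asUnicodeString is assumed
to need the Iconv package to be loaded first.

```smalltalk
"Sketch only: a Dictionary keyed by a UnicodeString, queried with a
 plain String literal. Key and value are hypothetical."
| d key |

"Assumption: asUnicodeString needs the Iconv package."
PackageLoader fileInPackage: 'Iconv'.

key := 'name' asUnicodeString.
d := Dictionary new.
d at: key put: 42.

"Compare the two hash values; the concern is that they differ."
(key hash = 'name' hash) printNl.

"Look up with a String literal, answering nil if the key is not found."
(d at: 'name' ifAbsent: [ nil ]) printNl.

"Possible workaround: convert the literal before the lookup."
(d at: 'name' asUnicodeString ifAbsent: [ nil ]) printNl.
```

If the second line prints nil, every caller would have to convert its literal
keys as in the last statement, which is the cost of returning UnicodeStrings
from the parser.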


Robin



