bug-gnustep

Re: patch to gnustep-base (Unicode and others)


From: Serg Stoyan
Subject: Re: patch to gnustep-base (Unicode and others)
Date: Mon, 8 Apr 2002 01:15:37 +0300
User-agent: Mutt/1.3.16i

Hello, Richard Frith-Macdonald.

 RFM> > Here is a patch to the gnustep-base, with additions such as:
 RFM> > - fixes NSString's initWithCString* methods behaviour by commenting out
 RFM> >   GSString's. Without it initWithCString* methods don't convert C 
 RFM> >   string into Unicode and this is not OpenStep compliant;
 RFM> 
 RFM> Perhaps you can explain more ... as far as I can see the above is simply
 RFM> wrong.  Certainly initWithCString* methods are not supposed to convert to
 RFM> unicode (as a general rule), and OpenStep doesn't say they should - so
 RFM> I'm guessing you have some meaning in mind that is not immediately
 RFM> obvious to me.

  Here is the citation from "OpenStep Specification" (c) 1994 NeXT Computer 
  Inc. Class NSString, page 2-127:
  "- (id)initWithCString:(const char *)byteString
  
  Initializes the receiver, a newly allocated NSString, by converting the
  one-byte characters in byteString into Unicode characters. byteString must
  be a null-terminated C string in the default C string encoding."

  But now it works right. Thank you. ;)
 
 RFM> > - adds 2 languages into Resources/Languages: Russian and Ukrainian;
 RFM> 
 RFM> Thanks, but I can't use them ... as I don't know what encoding you have 
 RFM> created them in.  I have added a README file to the Resources/Languages 
 RFM> subdirectory to say what format language files *should* be in (and 
 RFM> corrected some errors in the existing files).

  It's OK. I've just updated from CVS and created these files by cvtenc'ing 
  them, just as the README says. But when I start any app I get this
  message:
  
  File NSDictionary.m: 458. In [GSDictionary -initWithContentsOfFile:] Contents 
of file '/home/stoyan/GNUstep/System/Libraries/Resources/Languages/Russian' 
does not contain a dictionary

  Here are some of my environment variables:
  
  [stoyan@localhost]$ echo $GNUSTEP_STRING_ENCODING; echo $LANG
  NSKOI8RStringEncoding
  ru_RU.KOI8-R

  I've attached the Russian and UkraineRussian (conforming to Locale.aliases) 
  files as well.
    
 RFM> > - enables NSDictionary's initWithContentsOfFile read the language files
 RFM> >   which contains non-latin characters (e.g. Russian and Ukrainian) by 
 RFM> >   using default C string encoding;
 RFM> 
 RFM> The language files are property lists, and as such should not contain any
 RFM> non-ascii characters (they should use \u escape sequence for unicode) ...
 RFM> so this change should not be necessary.
 RFM> 
 RFM> That being said, I've been thinking about reverting the property list 
 RFM> loading methods in NSDictionary and NSArray to accepting non-ascii data.
 RFM> I'm really not sure what the best approach is.
 RFM> 
 RFM> On the one hand, it's convenient for people to be able to hand-edit 
 RFM> property lists using their local encoding.  On the other hand, if we 
 RFM> allow that, they will produce property lists which are non-portable.
 RFM> 
 RFM> I recently added a little tool 'cvtenc' to convert files from one 
 RFM> encoding to another. So I *think* the best thing is probably to stick 
 RFM> to enforcing portability of property lists, and use that tool (possibly 
 RFM> with further improvements) before/after hand editing them.  Certainly 
 RFM> we need to do that for GNUstep resources ... since we
 RFM> need them to be portable - but I remain uncertain about the best 
 RFM> approach for general users.

  I guess we can use two types of language files: a plain-text property list 
  with the encoding in its file name, and a non-printable Unicode file. For
  example, in the case of Russian:

  Languages/Russian.KOI8-R         <-- plain proplist in KOI8-R encoding
  Languages/Russian.WindowsCP1251  <-- plain proplist in Windows 1251 encoding
  Languages/Russian                <-- Unicode file, created with 'cvtenc'

  In this case we use the Unicode file, and the proplist files remain for
  editing by hand. Or we can use a proplist file with the appropriate
  encoding, if we have to (no Unicode file for some reason).
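
  To make the "\u escape" form from the quoted text concrete, here is a
  rough sketch of how a property list typed in a local encoding could be
  turned into a pure-ASCII, portable one. It is written in Python only for
  illustration; it is not what 'cvtenc' does, and the function name
  escape_for_plist and the sample "Yes" entry are made up for this example.

```python
def escape_for_plist(raw: bytes, encoding: str = "koi8_r") -> str:
    """Decode bytes from a local encoding and return ASCII-only text in
    which every non-ASCII character is written as a \\uXXXX escape."""
    text = raw.decode(encoding)
    out = []
    for ch in text:
        code = ord(ch)
        if code < 0x80:
            # Plain ASCII passes through unchanged.
            out.append(ch)
        elif code <= 0xFFFF:
            # Characters in the Basic Multilingual Plane: one escape.
            out.append("\\u%04x" % code)
        else:
            # Anything above U+FFFF is split into a UTF-16 surrogate pair.
            code -= 0x10000
            high = 0xD800 + (code >> 10)
            low = 0xDC00 + (code & 0x3FF)
            out.append("\\u%04x\\u%04x" % (high, low))
    return "".join(out)


# Hypothetical one-line property list entry, as a KOI8-R editor would save it.
sample = '"Yes" = "\u0414\u0430";'.encode("koi8_r")

print(escape_for_plist(sample, "koi8_r"))  # prints "Yes" = "\u0414\u0430";
```

  The same call with "cp1251" would cover a Russian.WindowsCP1251 style
  file, so the local-encoding files could stay editable while the escaped
  form is the one that gets shipped.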

PS: Another thing I've noticed (and I guess it should be somewhere in the
documentation) concerns using non-ASCII characters when initializing an
NSString variable. I mean that a definition like this:

NSString  *some_string = @"some non-ascii characters";

is deprecated. In this case the string is not converted into Unicode and 
the result is unpredictable, or something like that.
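
One way to see why the result depends on luck: the bytes typed between the
quotes carry no encoding label, so whatever reads them later has to guess.
The sketch below (Python, for illustration only; it does not reproduce what
the compiler or gnustep-base actually does with such a literal) decodes the
same two bytes under three different assumptions.

```python
# Two Cyrillic letters (the Russian word for "yes"), written as escapes so
# that this source file itself stays pure ASCII.
original = "\u0414\u0430"

# What an editor set to KOI8-R would store on disk: two untagged bytes.
raw = original.encode("koi8_r")

# The same bytes interpreted under three different assumed encodings.
as_koi8r = raw.decode("koi8_r")
as_cp1251 = raw.decode("cp1251")
as_latin1 = raw.decode("latin-1")

print(as_koi8r == original)   # True: the guess matches the real encoding
print(as_cp1251 == original)  # False: other Cyrillic letters come out
print(as_latin1 == original)  # False: accented Latin letters come out
```

Only the reader that happens to assume the writer's encoding gets the
original text back, which is why an explicit conversion with a named
encoding is the safer route.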

-- 
Serg Stoyan

Attachment: Languages.tgz
Description: application/tar-gz

