Re: Getting UTF-8 value of string occasionally fails
From: Pete French
Subject: Re: Getting UTF-8 value of string occasionally fails
Date: Wed, 13 Oct 2004 13:14:18 +0100
> Should U+D800 return a valid UTF-8 string? Is this a bug in base or
> correct behavior?
No, it should not - unpaired surrogates are not allowed. We are using UTF-16
internally, as the original OpenStep spec was written before the characters
outside the 16 bit range were defined (so a unichar is 16 bits). But UTF-8
is not allowed to encode the surrogates individually - a surrogate pair is
formed back into a single character, which is then converted.
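
To illustrate the rule above, here is a rough sketch in Python of how a run of
16 bit UTF-16 words could be turned into UTF-8. It is not the GNUstep base
code: the function name utf16_to_utf8 is made up for this example, and the
input is assumed to be a plain list of integers standing in for unichar values.
A surrogate pair is combined into one codepoint before encoding, and a lone
surrogate such as U+D800 is rejected.

```python
def utf16_to_utf8(units):
    """Encode a list of 16 bit UTF-16 words as UTF-8 bytes.

    Hypothetical helper for illustration only. Raises ValueError when a
    surrogate is not part of a valid high/low pair.
    """
    out = bytearray()
    i = 0
    while i < len(units):
        u = units[i]
        if 0xD800 <= u <= 0xDBFF:
            # High surrogate: the next word must be a low surrogate.
            if i + 1 >= len(units) or not (0xDC00 <= units[i + 1] <= 0xDFFF):
                raise ValueError("unpaired high surrogate U+%04X at index %d" % (u, i))
            low = units[i + 1]
            # Combine the pair into a single codepoint above U+FFFF.
            cp = 0x10000 + ((u - 0xD800) << 10) + (low - 0xDC00)
            i += 2
        elif 0xDC00 <= u <= 0xDFFF:
            # Low surrogate with no preceding high surrogate.
            raise ValueError("unpaired low surrogate U+%04X at index %d" % (u, i))
        else:
            # Ordinary 16 bit character, used as is.
            cp = u
            i += 1
        out += chr(cp).encode("utf-8")
    return bytes(out)
```

With this sketch, utf16_to_utf8([0xD83D, 0xDE00]) gives the four byte UTF-8
form of U+1F600, while utf16_to_utf8([0xD800]) raises ValueError.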
> If it's my fault, suggestions for a workaround would be appreciated.
You need to make sure that the string contains a complete codepoint. If
you are talking about codepoints outside the basic 16 bit character set then
that means your string needs to contain two 16 bit words (a surrogate pair)
to equate to a single codepoint.
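
As a rough sketch of that workaround, the following Python shows how a
codepoint could be split into the 16 bit words that a UTF-16 string needs to
hold. The function name code_point_to_utf16 is invented for this example and
is not a GNUstep API; the returned list stands in for the unichar values you
would put in the string.

```python
def code_point_to_utf16(cp):
    """Return the 16 bit UTF-16 words for one codepoint.

    Hypothetical helper for illustration only. Returns one word for the
    basic 16 bit range and two words (a surrogate pair) above U+FFFF.
    """
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("codepoint out of range: %r" % (cp,))
    if 0xD800 <= cp <= 0xDFFF:
        # Surrogate values are reserved for pairs and are not characters.
        raise ValueError("U+%04X is a surrogate, not a character" % cp)
    if cp < 0x10000:
        return [cp]
    v = cp - 0x10000
    high = 0xD800 + (v >> 10)    # top 10 bits
    low = 0xDC00 + (v & 0x3FF)   # bottom 10 bits
    return [high, low]
```

For example, code_point_to_utf16(0x1F600) returns [0xD83D, 0xDE00], so the
string needs both words for that one codepoint.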
-bat.