Re: Getting UTF-8 value of string occasionally fails

discuss-gnustep

From:	Pete French
Subject:	Re: Getting UTF-8 value of string occasionally fails
Date:	Wed, 13 Oct 2004 13:14:18 +0100

> Should U+D800 return a valid UTF-8 string? Is this a bug in base or
> correct behavior?

No, it should not - unpaired surrogates are not allowed. We are using UTF-16
internally, as the original OpenStep spec was written before the characters
outside the 16-bit range were defined (so a unichar is 16 bits). But UTF-8
is not allowed to encode the surrogates individually - a surrogate pair is
formed back into a single character, which is then converted.
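
To illustrate the rule, here is a rough sketch in plain C of how a pair gets
folded back into one codepoint before the UTF-8 bytes are produced, and where
a lone U+D800 gets rejected. This is not the code in base; the function name
utf16_to_utf8 and its return convention are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of GNUstep base.
 * Converts UTF-16 code units to UTF-8.
 * Returns the number of bytes written, or -1 if the input contains an
 * unpaired surrogate or the output buffer is too small. */
long utf16_to_utf8(const uint16_t *in, size_t in_len,
                   uint8_t *out, size_t out_cap)
{
    size_t o = 0;

    assert(in != NULL && out != NULL);

    for (size_t i = 0; i < in_len; i++) {
        uint32_t cp = in[i];

        if (cp >= 0xD800 && cp <= 0xDBFF) {
            /* High surrogate: a low surrogate must follow immediately. */
            if (i + 1 >= in_len || in[i + 1] < 0xDC00 || in[i + 1] > 0xDFFF)
                return -1;
            /* Fold the pair back into a single codepoint. */
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            i++;
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            /* Low surrogate with no high surrogate in front of it. */
            return -1;
        }

        size_t need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (o + need > out_cap)
            return -1;

        if (need == 1) {
            out[o++] = (uint8_t)cp;
        } else if (need == 2) {
            out[o++] = (uint8_t)(0xC0 | (cp >> 6));
            out[o++] = (uint8_t)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            out[o++] = (uint8_t)(0xE0 | (cp >> 12));
            out[o++] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (uint8_t)(0x80 | (cp & 0x3F));
        } else {
            out[o++] = (uint8_t)(0xF0 | (cp >> 18));
            out[o++] = (uint8_t)(0x80 | ((cp >> 12) & 0x3F));
            out[o++] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (uint8_t)(0x80 | (cp & 0x3F));
        }
    }
    return (long)o;
}
```

With this sketch, a buffer holding only 0xD800 gives -1, while the pair
0xD83D 0xDE00 (U+1F600) gives the four bytes F0 9F 98 80.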

> If it's my fault, suggestions for a workaround would be appreciated.

You need to make sure that the string contains a single, complete codepoint.
If you are talking about codepoints outside the basic 16-bit character set,
then that means your string needs to contain two words (a surrogate pair)
that together equate to a single codepoint.
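
Going the other way, a sketch of how you might split a codepoint into the
one or two 16-bit words the string needs. Again this is plain C with a
made-up name (codepoint_to_utf16), not anything from base; the assumption is
that you would then hand the resulting words to something like
+stringWithCharacters:length: to build the string.

```c
#include <stdint.h>

/* Hypothetical helper, not part of GNUstep base.
 * Writes the UTF-16 form of codepoint cp into out[0..1].
 * Returns the number of 16-bit words written (1 or 2), or 0 if cp is
 * not encodable (a surrogate value itself, or above U+10FFFF). */
int codepoint_to_utf16(uint32_t cp, uint16_t out[2])
{
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;               /* surrogates are not characters */
    if (cp > 0x10FFFF)
        return 0;               /* outside the Unicode range */

    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;  /* fits in a single 16-bit word */
        return 1;
    }

    cp -= 0x10000;                               /* 20 bits remain */
    out[0] = (uint16_t)(0xD800 + (cp >> 10));    /* high surrogate: top 10 bits */
    out[1] = (uint16_t)(0xDC00 + (cp & 0x3FF));  /* low surrogate: bottom 10 bits */
    return 2;
}
```

So U+1F600 comes out as the two words 0xD83D 0xDE00, and a string built from
both of them converts to UTF-8, whereas either word on its own does not.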

-bat.
