RE: [cp-patches] FYI: Patch: character encoder/decoder cleanup/fixes

From:	Jeroen Frijters
Subject:	RE: [cp-patches] FYI: Patch: character encoder/decoder cleanup/fixes
Date:	Wed, 17 Nov 2004 16:05:50 +0100

Archie Cobbs wrote:
> Jeroen Frijters wrote:
> > I committed the attached patch to remove the throwing of
> > CharConversionException from the character encoders/decoders.
> > 
> > For encoders, unsupported characters are now always replaced with a '?'
> > byte and for the UTF8 decoder, invalid UTF-8 bytes are replaced by a
> > Unicode REPLACEMENT CHARACTER (\uFFFD) in the output stream.
> 
> Just curious... does this implementation have the same problem as
> described in
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4628881 ?
> I.e., is it a lossy encoding for "invalid" characters?

At the moment the UTF-8 encoder/decoder is fully symmetrical for all
"characters" (really UTF-16 code units), but this is actually a bug, IMO:
unpaired surrogates shouldn't be decoded (as the bug parade comment
says, the test case is bogus).
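
To make the replacement behaviour concrete, here is a rough sketch using
the standard java.nio.charset API. This is not the Classpath converter
code touched by the patch; the class and method names (ReplacementDemo,
encodeAscii, decodeUtf8, roundTripsStrict) are made up for illustration,
and US-ASCII is only a stand-in for "an encoder with unsupported
characters". It assumes a JDK with StandardCharsets (Java 7 or later).

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of GNU Classpath.
class ReplacementDemo {

    // Encode to US-ASCII, substituting the encoder's default
    // replacement byte ('?') for characters it cannot represent.
    public static byte[] encodeAscii(String s) {
        try {
            ByteBuffer out = StandardCharsets.US_ASCII.newEncoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE)
                .encode(CharBuffer.wrap(s));
            byte[] bytes = new byte[out.remaining()];
            out.get(bytes);
            return bytes;
        } catch (CharacterCodingException e) {
            // Not expected when both actions are REPLACE.
            throw new IllegalStateException(e);
        }
    }

    // Decode UTF-8, substituting U+FFFD for invalid byte sequences.
    public static String decodeUtf8(byte[] bytes) {
        try {
            return StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE)
                .decode(ByteBuffer.wrap(bytes))
                .toString();
        } catch (CharacterCodingException e) {
            // Not expected when both actions are REPLACE.
            throw new IllegalStateException(e);
        }
    }

    // Strict UTF-8 round trip: a fresh encoder/decoder defaults to
    // REPORT, so malformed input (e.g. an unpaired surrogate) makes
    // the convenience encode()/decode() methods throw.
    public static boolean roundTripsStrict(String s) {
        try {
            ByteBuffer encoded =
                StandardCharsets.UTF_8.newEncoder().encode(CharBuffer.wrap(s));
            String back =
                StandardCharsets.UTF_8.newDecoder().decode(encoded).toString();
            return back.equals(s);
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // e-acute has no US-ASCII mapping, so it becomes '?'.
        System.out.println(new String(encodeAscii("h\u00e9"),
                                      StandardCharsets.US_ASCII));
        // 0xFF can never appear in well-formed UTF-8.
        String decoded = decodeUtf8(new byte[] { 'a', (byte) 0xFF, 'b' });
        System.out.println(decoded.charAt(1) == '\uFFFD');
        // An unpaired high surrogate between two ordinary characters.
        System.out.println(roundTripsStrict("a\uD800b"));
    }
}
```

With a strict converter like the one in roundTripsStrict, a string
containing an unpaired surrogate does not survive the round trip, which
is the non-symmetrical behaviour I'm arguing for above.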

Regards,
Jeroen


