bug#12291: [rev 109796] wrong UTF-8 handling
From: Eli Zaretskii
Subject: bug#12291: [rev 109796] wrong UTF-8 handling
Date: Mon, 03 Sep 2012 05:40:09 +0300
> From: Kenichi Handa <handa@gnu.org>
> Cc: wl@gnu.org, 12291@debbugs.gnu.org, smithcu@gvsu.edu
> Date: Mon, 03 Sep 2012 09:59:22 +0900
>
> > We can either read them as raw bytes, or convert them to U+FFFD. The
> > former sounds like a more useful behavior to me, FWIW.
>
> What to convert to U+FFFD? Each byte, or the byte sequence?
The byte sequence.
> Anyway, we can't simply convert them to U+FFFD because it
> results in change of file contents just by reading and
> writing.
Yes, and that's why I prefer the raw-bytes way.
> I think converting each invalid byte to raw-byte is simpler
> and equally useful.
It's more useful, I think.
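
To make the trade-off concrete, here is a rough sketch in Python. It is only an analogy and not Emacs code: Python's "replace" error handler stands in for conversion to U+FFFD, and its "surrogateescape" handler stands in for keeping each invalid byte as a raw byte. The sample bytes and variable names are made up for illustration.

```python
# Illustrative analogy only: Python's codec error handlers, not Emacs's decoder.
# Hypothetical input: "abc", then a truncated 3-byte sequence (0xE3 0x81), then "def".
data = b"abc\xe3\x81def"

# Option 1: substitute U+FFFD for the invalid input.
replaced = data.decode("utf-8", errors="replace")

# Option 2: keep each invalid byte as a distinct placeholder code point.
raw_kept = data.decode("utf-8", errors="surrogateescape")

# Write both results back out, as a read-then-save cycle would.
replaced_roundtrip = replaced.encode("utf-8")
raw_roundtrip = raw_kept.encode("utf-8", errors="surrogateescape")

print("U+FFFD round trip preserves file:  ", replaced_roundtrip == data)
print("raw-byte round trip preserves file:", raw_roundtrip == data)
```

In this analogy the U+FFFD variant no longer matches the original bytes after a read and write, while the per-byte variant does, which is the reason given above for preferring raw bytes.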