emacs-pretest-bug

Re: decode-coding-string on invalid UTF-8 string isn't rejected


From: Kenichi Handa
Subject: Re: decode-coding-string on invalid UTF-8 string isn't rejected
Date: Wed, 12 Mar 2003 09:51:19 +0900 (JST)

In article <address@hidden>, Simon Josefsson <address@hidden> writes:
> I'm trying to use decode-coding-string to "guess" charsets, and
> noticed this:

> (decode-coding-string "r\xe4k" 'latin-1)
>  => "räk"
> (decode-coding-string "r\xe4k" 'utf-8)
>  => "r"

> Wouldn't it be more appropriate if it returned nil (like
> `decode-char') or "rk"?

I've just fixed it to return "r\xe4k", i.e., invalid 8-bit
bytes are decoded into eight-bit-control or
eight-bit-graphic characters, as in the other cases.  Please
try the latest CVS HEAD.
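With that fix, the behavior can be checked like this (a sketch; the exact printed representation of the raw byte varies by Emacs version):

```elisp
;; An invalid UTF-8 byte is now kept as a raw eight-bit
;; character instead of truncating the string.
(decode-coding-string "r\xe4k" 'utf-8)
;; typically => "r\344k" -- the \xe4 byte survives as a raw byte

;; The same bytes decoded as Latin-1 give the accented character.
(decode-coding-string "r\xe4k" 'latin-1)
;; => "räk"
```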

> Perhaps I'm looking in the wrong place though.  Is there an elisp
> function that takes a unibyte string and decodes it using whatever the
> default (process) coding system priorities may be?  I.e., for me that
> runs in a UTF-8 locale, first try decoding as utf-8, if it fails,
> continue with Latin-1, etc.

(decode-coding-string UNIBYTE_STRING 'undecided) should do
what you want.
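For example (a sketch; which coding system `undecided' picks depends on your locale and on the configured detection priority):

```elisp
;; Let Emacs detect the coding system from the byte sequence.
;; `undecided' tries the prioritized coding systems in turn.
(decode-coding-string "r\xc3\xa4k" 'undecided)  ; valid UTF-8 for "räk"

;; To see which coding systems are plausible for a string:
(detect-coding-string "r\xc3\xa4k")

;; The detection priority can be adjusted, e.g. with
;; `prefer-coding-system'.
```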

---
Ken'ichi HANDA
address@hidden



