help-gnu-emacs
Re: url-retrieve and utf-8


From: Andreas Röhler
Subject: Re: url-retrieve and utf-8
Date: Mon, 4 Feb 2008 17:02:27 +0100
User-agent: KMail/1.9.5

On Monday, 4 February 2008 13:43, William Xu wrote:
> William Xu <william.xwl@gmail.com> writes:
> > At present, I tried to call:
> >
> >   (decode-coding-string (buffer-string) 'utf-8)
> >
> > But the result is only partially correct. For example, when there is a
> > mix of ASCII and Japanese characters, it only returns the ASCII part.
>
> For this, it is because I called (skip-chars-backward
> "[[:space:]]") before decode-coding-string, and apparently
> skip-chars-backward mistook some non-ASCII characters for
> whitespace.


AFAICS that's not a mistake, that's how it is implemented.

See the Elisp info node 34.3.1.2, "Character Classes":


`[:space:]'
     This matches any character that has whitespace syntax (*note

....


Here is a table of syntax classes, the characters that stand for them,
their meanings, and examples of their use.

 -- Syntax class: whitespace character
     "Whitespace characters" (designated by ` ' or `-') separate
     symbols and words from each other.  Typically, whitespace
     characters have no other syntactic significance, and multiple
     whitespace characters are syntactically equivalent to a single
     one.  

======> Space, tab, newline and formfeed <============

are classified as
     whitespace in almost all major modes.
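
To see how a given character is classified in the current buffer, a
small check along these lines might help. This is only a sketch: the
helper name `my-whitespace-syntax-p' is made up for illustration, and
the result depends on whichever syntax table is active in the buffer.

```elisp
;; Hypothetical helper, not part of Emacs or of the original thread.
(defun my-whitespace-syntax-p (char)
  "Return non-nil if CHAR has whitespace syntax in the current syntax table."
  ;; `char-syntax' returns the designator character of the syntax class;
  ;; for the whitespace class that designator is a space.
  (eq (char-syntax char) ?\s))

;; Usage: evaluate in the buffer you are interested in.
(my-whitespace-syntax-p ?\t)   ; non-nil with the standard syntax table
(my-whitespace-syntax-p ?a)    ; nil with the standard syntax table
```

Running the same check on the characters that were skipped unexpectedly
should show whether they really carry whitespace syntax in that buffer.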

;;;;;;;

[:blank:] should DTRT.
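
A minimal sketch of one way to sidestep the syntax table entirely,
assuming only trailing ASCII whitespace needs to go before decoding:
pass skip-chars-backward an explicit character set. The function name
`my-decode-buffer-trimmed' is hypothetical. Note that, as far as I can
tell from its docstring, skip-chars-backward takes the set like the
inside of a regexp bracket expression, so a class would be written
"[:blank:]" there, without a second pair of brackets.

```elisp
;; Hypothetical helper, not from the original thread.
(defun my-decode-buffer-trimmed ()
  "Return the current buffer's text decoded as UTF-8.
Trailing space, tab, CR and LF are dropped before decoding."
  (save-excursion
    (goto-char (point-max))
    ;; Explicit ASCII set: independent of the buffer's syntax table,
    ;; so bytes of multibyte sequences are never skipped.
    (skip-chars-backward " \t\r\n")
    (decode-coding-string
     (buffer-substring-no-properties (point-min) (point))
     'utf-8)))
```

Called in the buffer that holds the retrieved body, this returns the
decoded string and leaves point and the buffer contents alone.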

Andreas Röhler



