Re: eight-bit char handling in emacs-unicode

emacs-devel


From:	Kenichi Handa
Subject:	Re: eight-bit char handling in emacs-unicode
Date:	Fri, 21 Nov 2003 08:41:06 +0900 (JST)
User-agent:	SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.2 Emacs/21.3 (sparc-sun-solaris2.6) MULE/5.0 (SAKAKI)

In article <address@hidden>, Juri Linkov <address@hidden> writes:
> (progn
>  (set-language-environment 'ukrainian)
>  (re-search-forward "[\000-\007\013\015-\032\034-\037\200-\237]" nil t))

> It fails with the (invalid-regexp "Invalid range end").
> Could you suggest how to fix this bug?

Current Emacs simply converts the unibyte regexp string to
multibyte.  In the Ukrainian language environment,
nonascii-translation-table converts ?\200 to 299040 but
?\237 to 2295, so the translated range ends below its start
and the regexp above fails with "Invalid range end".  This
behaviour is itself a bug.  We must treat \200-\237 the same
way as \200\201...\236\237 (emacs-unicode already does that).
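
To make the equivalence concrete, here is a minimal sketch of
expanding a byte range into the explicit list of bytes it stands
for.  The function name my-expand-byte-range is hypothetical, and
the sketch assumes a later Emacs release that provides
unibyte-string and number-sequence; it is not how the regexp
compiler itself does it.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-expand-byte-range (from to)
  "Return a unibyte string holding every byte from FROM to TO, inclusive."
  ;; number-sequence yields the list (FROM FROM+1 ... TO);
  ;; unibyte-string turns those byte values into a unibyte string.
  (apply #'unibyte-string (number-sequence from to)))

;; The 32 bytes that \200-\237 is meant to cover:
(my-expand-byte-range #o200 #o237)
```

Because each byte is listed individually, no range endpoints are
left to be translated out of order.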

But fixing that bug doesn't solve the Gnus problem, because
the part "\200-\237" is apparently intended to match C1
control chars, not their multibyte equivalents in the
current language environment.  So the correct fix is to
change the above as follows:

(re-search-forward
  (string-as-multibyte "[\000-\007\013\015-\032\034-\037\200-\237]")
  nil t)

---
Ken'ichi HANDA
address@hidden
