emacs-devel

Re: ".*utf\\(-?8\\)\\>" versus ".*[._]utf" versus "address@hidden>"


From: Paul Eggert
Subject: Re: ".*utf\\(-?8\\)\\>" versus ".*[._]utf" versus "address@hidden>"
Date: Tue, 1 Jan 2002 10:30:31 -0800 (PST)

> From: Dave Love <address@hidden>
> Date: 01 Jan 2002 17:07:37 +0000
> 
>  > No, the preceding entry "address@hidden>" has a delimiter, and the other
>  > entries (e.g. ".*8859[-_]?1\\>") are special cases because ISO 8859
>  > locale names in practice could have almost anything before the
>  > '8859'.
> 
> I don't understand why utf-8 should be any different.

Because utf-8 should be the normal case.  In the normal case, the
encoding name should be delimited, to prevent incorrect matches when
one encoding name is a suffix of another.
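
To illustrate the suffix problem, here is a minimal Python sketch. It is
not the Emacs Lisp under discussion: both patterns are loose Python
approximations, the case-insensitive matching is an assumption, and the
codeset name "xutf8" is hypothetical, made up only to show a name that
has "utf8" as a suffix.

```python
import re

# Python approximations of the two styles of pattern (assumptions, not
# the actual Emacs regexps).  PERMISSIVE accepts "utf8" or "utf-8"
# anywhere before a word boundary; DELIMITED also requires a "." or "_"
# immediately before the encoding name.
PERMISSIVE = re.compile(r".*utf-?8\b", re.IGNORECASE)
DELIMITED = re.compile(r".*[._]utf-?8\b", re.IGNORECASE)

def matches(pattern, locale_name):
    """Return True if the pattern matches from the start of locale_name."""
    return pattern.match(locale_name) is not None

for name in ("en_US.UTF-8", "en_US.xutf8"):  # "xutf8" is hypothetical
    print(name, matches(PERMISSIVE, name), matches(DELIMITED, name))
```

Both patterns accept "en_US.UTF-8", but only the permissive one accepts
the made-up "en_US.xutf8", which is the kind of incorrect match the
delimiter is meant to prevent.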

>  > I've never seen a locale by that name, and I doubt whether we'll
>  > run into one.  Locale names like 'iso_8859_1' are still around for
>  > backward compatibility reasons, but modern locale names give more
>  > than just the character encoding.
> 
> Why are only modern names necessarily relevant (and only modern
> Unix-like systems)?  Emacs has long been documented to accept just
> that in the environment variables and at least some modern systems
> seem to be happy with it.

I'm not sure I follow your point, but I'll try to answer.  The code in
question is using a heuristic to guess the coding system from the
locale name.  All other things being equal, it's better to keep the
heuristic simple and easy to explain.  The heuristic I was trying to
use is:

  Emacs looks at the codeset part of the locale name (e.g. the "UTF-8"
  in "address@hidden"), except that there is a special case for
  old-fashioned 8859-style locale names like "iso_8859_1".
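
A rough Python sketch of that heuristic, for illustration only: the
function name guess_codeset is hypothetical, the locale layout
language[_territory][.codeset][@modifier] is assumed, and the way the
8859 special case is generalized to any part number and normalized to
"ISO-8859-N" is my own simplification, not what the Emacs code does.

```python
import re

def guess_codeset(locale_name):
    """Guess a codeset from a locale name, or return None.

    Hypothetical helper: takes the text between "." and an optional
    "@modifier", with a fallback for old 8859-style names.
    """
    # Normal case: the codeset part is explicitly delimited by ".".
    m = re.match(r"[^.@]*\.([^@]+)", locale_name)
    if m:
        return m.group(1)
    # Special case: old-fashioned names such as "iso_8859_1" that have
    # no "." delimiter but mention 8859 followed by a part number.
    m = re.search(r"8859[-_]?(\d+)\b", locale_name)
    if m:
        return "ISO-8859-" + m.group(1)
    return None

for name in ("en_US.UTF-8", "de_DE.ISO8859-15@euro", "iso_8859_1", "C"):
    print(name, guess_codeset(name))
```

The point of the sketch is that the rule stays short enough to document
in one sentence plus one exception.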

> I've seen/used considerable variations, so I aimed to be permissive
> like the existing cases.  Could this actually lose?

I don't know of any wins or losses in practice, but I think the more
aggressive match would make the documentation a bit more complicated.

This is a fairly minor issue; I wouldn't object much to the more
aggressive match.


