emacs-pretest-bug

Re: local chars displayed as numbers


From: Stefan Monnier
Subject: Re: local chars displayed as numbers
Date: Fri, 22 Sep 2006 20:20:57 -0400
User-agent: Gnus/5.11 (Gnus v5.11) Emacs/22.0.50 (gnu/linux)

> I don't think it uncommon.  People migrate from Windows to GNU/Linux
> (or switch between both), people exchange files with Windows users,
> ... (and on Windows, it's quite common to insert `smart quotes' and
> other non-Latin-1 characters).

True, but in my experience plain-text files using windows-1252 are still
rather uncommon under GNU/Linux.  Of course, it depends on the specifics,
but adapting Emacs to the specific circumstance should be done via the
.emacs, I think.

> What is the benefit of treating it as raw-text instead of windows-1252,
> assuming that the file only contains characters from windows-1252?  We
> are talking about a file (> 300000 chars of text) with mostly ASCII,
> some Latin-1 [ÄÖÜäöüß] (1.3%, probably typical for a German text), and
> 19 \202 characters (= 0.005%).

Obviously, in the case where the file really is encoded in windows-1252,
there's no harm in Emacs decoding it as windows-1252.  But what about the
other cases, e.g. if the file is just binary, or slightly incorrect utf-8,
or ...?

> (I don't know if Emacs really checks the frequency of such characters
> to decide about the coding.)

It doesn't.  It only checks that there is no invalid byte (or
byte-sequence).  In the case of windows-1252, there is no invalid byte, so
any binary junk will be treated as windows-1252.
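
To make the point concrete, here is a rough sketch of a "validity only"
check, written in Python rather than taken from Emacs.  The helper name
`is_valid` is made up for illustration, and Python's codecs only stand in
for Emacs's detection.  Python's own cp1252 codec happens to reject a
handful of bytes, so latin-1 is used as the example of a single-byte
encoding with no invalid byte at all.

```python
def is_valid(data: bytes, encoding: str) -> bool:
    """Return True if data decodes without error in the given encoding.

    Hypothetical helper: it only asks "is any byte sequence invalid?",
    it never looks at how frequent or plausible the characters are.
    """
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# Mostly ASCII, a few Latin-1 letters, and one \202 (0x82) byte,
# similar in spirit to the file discussed in this thread.
german = b"Gr\xf6\xdfe \x82"

# Arbitrary binary junk: every possible byte value once.
junk = bytes(range(256))

print(is_valid(german, "utf-8"))    # a lone 0x82 is not valid utf-8
print(is_valid(german, "cp1252"))   # 0x82 is a low quotation mark here
print(is_valid(junk, "utf-8"))      # multi-byte rules reject the junk
print(is_valid(junk, "latin-1"))    # nothing is ever rejected
```

A multi-byte encoding such as utf-8 has structure that junk tends to
violate, so a validity check can rule it out.  A single-byte encoding
without invalid bytes can never be ruled out this way.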


        Stefan



