Re: local chars displayed as numbers
From: Kenichi Handa
Subject: Re: local chars displayed as numbers
Date: Sat, 23 Sep 2006 15:29:29 +0900
User-agent: SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.2 Emacs/22.0.50 (i686-pc-linux-gnu) MULE/5.0 (SAKAKI)
In article <address@hidden>, Stefan Monnier <address@hidden> writes:
> > I don't think it uncommon. People migrate from Windows to GNU/Linux
> > (or switch between both), people exchange files with Windows users,
> > ... (and on Windows, it's quite common to insert `smart quotes' and
> > other non-Latin-1 characters).
> True, but in my experience plain-text files using windows-1252 are still
> rather uncommon under GNU/Linux. Of course, it depends on the specifics,
> but adapting Emacs to the specific circumstance should be done via the
> .emacs, I think.
> > What is the benefit of treating it as raw-text instead of windows-1252,
> > assuming that the file only contains characters from windows-1252? We
> > are talking about a file (> 300000 chars of text) with mostly ASCII,
> > some Latin-1 [ÄÖÜäöüß] (1.3%, probably typical for a German text), and
> > 19 \202 characters (= 0.005%).
> Obviously, in the case where the file is using windows-1252 encoding, there's
> no harm in Emacs using the windows-1252 encoding. But what about the other
> cases, e.g. if the file is just binary, or slightly incorrect utf-8, or ...?
At least windows-1252 doesn't cover all eight-bit bytes.
There are a few invalid bytes: 0x81, 0x8d, 0x8f, 0x90, 0x9d.
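To make the point about uncovered bytes concrete, here is a minimal sketch that lists the byte values a single-byte decoder rejects. It uses Python's built-in cp1252 codec as a stand-in; that Python's table matches the one Emacs uses is an assumption, and the helper name undefined_bytes is hypothetical.

```python
def undefined_bytes(codec="cp1252"):
    """Return the byte values that `codec` refuses to decode.

    Assumption: Python's codec table stands in for the decoder under
    discussion; Emacs keeps its own tables, which may differ.
    """
    rejected = []
    for value in range(256):
        try:
            bytes([value]).decode(codec)
        except UnicodeDecodeError:
            # The codec has no character assigned to this byte value.
            rejected.append(value)
    return rejected


if __name__ == "__main__":
    print([hex(value) for value in undefined_bytes()])
```

With Python's table this reports 0x81, 0x8d, 0x8f, 0x90 and 0x9d. A file containing any of these bytes cannot be pure windows-1252, which gives a detector at least a weak signal.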
Anyway, how about looking at the situation this way?
When one visits a binary file and it's detected as
windows-1252, he can usually notice right away that the
auto-detection did the wrong thing, because a binary file tends to
contain many 8-bit bytes in the first page. So, he can
re-read the file with C-x C-m c binary RET C-x C-v RET. But
when one visits a windows-1252 file and it's read as
raw-text, it's more difficult to notice that the file is not
correctly decoded, because it may not contain a raw byte in
the first page. In this case, he'll notice the problem only
after he has done some editing, and by then it is too late to
re-read the file.
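The asymmetry described above can be sketched roughly as follows. The assumptions here are mine: that the bytes which stand out under raw-text are those in the range 0x80-0x9f (such as the \202 mentioned earlier in the thread), and that a "first page" is about 2000 bytes. The names first_c1_offset, visible_on_first_page and page_size are hypothetical.

```python
def first_c1_offset(data):
    """Return the offset of the first byte in 0x80-0x9f, or None if absent."""
    for offset, value in enumerate(data):
        if 0x80 <= value <= 0x9F:
            return offset
    return None


def visible_on_first_page(data, page_size=2000):
    """True if a byte in 0x80-0x9f occurs within the first `page_size` bytes.

    `page_size` is an illustrative guess at how much text one screen shows.
    """
    offset = first_c1_offset(data)
    return offset is not None and offset < page_size


# Mostly-ASCII text whose only suspicious bytes come after 9000 bytes.
text_like = b"plain ASCII text. " * 500 + b"\x82quoted\x91"
# Binary-like data: every byte value appears within the first 256 bytes.
binary_like = bytes(range(256)) * 40

if __name__ == "__main__":
    print(visible_on_first_page(text_like))    # False
    print(visible_on_first_page(binary_like))  # True
```

In the text-like case nothing unusual shows up until far into the buffer, so a wrong raw-text decoding goes unnoticed at first, while the binary-like data shows it immediately.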
Stefan Monnier <address@hidden> writes:
> So I'd rather have a tool that explains what's going on, so that the user
> can decide to use windows-1252 if it's a good choice for her, rather than
> force windows-1252 on all users, most of whom won't ever edit a file with
> windows-1252 encoding.
How about indicating a binary buffer in a more conspicuous
way, for instance by changing the mode line color and showing
"BINARY FILE" in the mode line?
---
Kenichi Handa
address@hidden