Re: [h-e-w] Determining coding system for text files
From: Eli Zaretskii
Subject: Re: [h-e-w] Determining coding system for text files
Date: Mon, 31 Oct 2011 22:36:49 +0200

> Date: Mon, 31 Oct 2011 15:28:22 -0400
> From: Eric Roode <address@hidden>
>
> I would like new buffers to default to utf-8 encoding, and I would like
> indeterminate files (like text files, especially source code files) also to
> use utf-8, unless the -*- line specifies a different coding system.
What for? What you ask for doesn't make sense without some
explanation. It is meaningless to say that pure ASCII files should
have UTF-8 encoding, because a file that contains only 7-bit ASCII
characters is byte-for-byte identical in ASCII and in UTF-8.
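
A quick way to see this is the following minimal sketch, which you can evaluate in *scratch*. The sample strings are arbitrary illustrations of mine, not anything taken from your files.

```elisp
;; Pure 7-bit ASCII text encodes to the same bytes in UTF-8 and in ASCII.
(equal (encode-coding-string "hello, world" 'utf-8)
       (encode-coding-string "hello, world" 'us-ascii))

;; The encodings only diverge once a non-ASCII character appears.
;; "\u00e9" is e-acute: one byte in Latin-1, two bytes in UTF-8.
(list (length (encode-coding-string "\u00e9" 'iso-latin-1))
      (length (encode-coding-string "\u00e9" 'utf-8)))
```

The first form should evaluate to t, and the second to (1 2), which is why the question of "which encoding" only becomes meaningful for files with non-ASCII characters.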
> By default, when I create a new buffer that isn't associated with any file,
> the coding system is set to 'iso-latin1-dos'. When I visit an existing
> (text) file, its coding system is set to 'undecided-dos'.
>
> I tried to change this by executing
> (prefer-coding-system 'utf-8)
> After that, when I create a new file, the coding system in the new buffer
> is set to 'utf-8'.
As expected. I presume this accomplishes part of what you wanted.
> However, when I open an existing file, emacs still sets its coding
> system to 'undecided-dos'.
If the file includes only 7-bit ASCII, this is also expected
behavior. Please explain why you aren't happy with this.
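
If you want to check what Emacs detected for a given file, `C-h C RET' describes the coding systems of the current buffer. The same check can be sketched in Lisp as below; the helper name and the temporary file prefix are hypothetical, made up only for this illustration.

```elisp
;; `demo-ascii-file-coding-system' is a hypothetical helper name, used only here.
(defun demo-ascii-file-coding-system ()
  "Visit a throwaway ASCII-only file and return the coding system Emacs chose."
  (let ((file (make-temp-file "ascii-demo")))
    (unwind-protect
        (progn
          ;; Write a file that contains nothing but 7-bit ASCII.
          (with-temp-file file
            (insert "plain ASCII text\n"))
          ;; Visit it and read the buffer-local value Emacs settled on.
          (let ((buf (find-file-noselect file)))
            (unwind-protect
                (buffer-local-value 'buffer-file-coding-system buf)
              (kill-buffer buf))))
      (delete-file file))))

(demo-ascii-file-coding-system)
```

I would expect some flavor of undecided here, with only the end-of-line suffix depending on the platform and on the line endings in the file.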
> Digging further, it seems that this is controlled by the variable
> file-coding-system-alist. If a file name does not match any of the
> patterns in that list, the function find-buffer-file-type-coding-system (in
> dos-w32.el) is invoked to determine what coding system to use for the file.
>
> That function *always* returns 'undecided' for text files, or
> 'no-conversion' for files it determines are binary. The only time it uses
> the default value for buffer-file-coding-system is if the file doesn't yet
> exist!
That is how Emacs behaves on all platforms, including Unix. The default
value of buffer-file-coding-system is used only for non-existing files
or for buffers not related to files. When a buffer visits an existing
file, Emacs always sets its encoding to match the encoding of the
file. prefer-coding-system just tells Emacs which encoding to prefer
when more than one can match the encoding of an existing file.
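
To make the distinction concrete, here is a small sketch that separates the two things `prefer-coding-system' affects. It only inspects state; the exact values you see will depend on your platform and settings, so I am not quoting any output.

```elisp
;; Put UTF-8 at the front of the detection priority list.
(prefer-coding-system 'utf-8)

;; The coding system Emacs now tries first when a file's bytes
;; could be decoded in more than one way.
(coding-system-priority-list t)

;; The default used for new files and for buffers that visit no file.
;; It is not consulted when decoding an existing file.
(default-value 'buffer-file-coding-system)
```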
> Am I reading this right? There is no way to set a preferred coding system
> for existing files under Windows?
There _is_ a way, but it doesn't do what you expect. Please explain
why your expectations are different, and in particular what is wrong
with the current behavior in your use cases.
> 'prefer-coding-system' only works in *nix environments?
No, it works the same on all platforms. The Windows implementation
has a few subtle points, but it doesn't change the basic behavior.
> I have to either add every source and text file name
> pattern to file-coding-system-alist, or manually change the buffer coding
> every time I visit an existing file?
You don't and you shouldn't. Changing a buffer's coding system after
you visit a file has no effect until you save the file. After
telling Emacs to prefer UTF-8, as you did, whenever Emacs needs to
encode a file when you save it, it will use UTF-8 if possible, and if
not, it will ask you for a different encoding. Again, if this is not
what you want, please explain why.
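
For the occasional file that you do want written out as UTF-8 regardless of what was detected, a minimal sketch is to set the coding system of that one buffer before saving:

```elisp
;; Equivalent to typing C-x RET f utf-8 RET in the buffer you want to convert.
;; This only marks the buffer; the bytes on disk change at the next save.
(set-buffer-file-coding-system 'utf-8)
;; Then save as usual with C-x C-s.
```

This is a per-buffer action, so it needs no changes to file-coding-system-alist.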