emacs-pretest-bug

Re: utf-16 not auto-detected when finding file


From: Jason Rumney
Subject: Re: utf-16 not auto-detected when finding file
Date: Wed, 30 Mar 2005 10:21:59 +0100
User-agent: Mozilla Thunderbird 1.0 (Windows/20041206)

Kenichi Handa wrote:
In article <address@hidden>, Jason Rumney <address@hidden> writes:

Dave Love <address@hidden> writes:

Yes.  Perhaps someone knows exactly what Windows does (assuming the
only significant use of it is in Windows)?

I would guess that the presence of a BOM is a sufficient
heuristic. Detecting 0 or other low byte values in every second
byte would work for Latin-script-based languages, but I don't think
any heuristic like that would work on Asian text unless you could
assume a specific language and use a dictionary.
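
The two heuristics mentioned here (a BOM check, then a look at every second byte) could be sketched roughly as follows. This is only an illustration in Python, not what Emacs or Windows actually does; the function name `guess_utf16`, the 512-byte sample size and the 0.4 threshold are all assumptions made up for the example.

```python
import codecs


def guess_utf16(data: bytes, sample_size: int = 512, threshold: float = 0.4):
    """Guess whether `data` is UTF-16; return 'utf-16-be', 'utf-16-le' or None.

    Hypothetical helper: sample_size and threshold are arbitrary choices.
    """
    # Heuristic 1: a byte order mark at the very start of the data.
    # (FF FE is also the start of a UTF-32-LE BOM; that case is ignored here.)
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"

    # Heuristic 2: count zero bytes at even and odd offsets of a prefix.
    sample = data[:sample_size]
    sample = sample[: len(sample) - len(sample) % 2]  # whole 16-bit units only
    if len(sample) < 4:
        return None
    pairs = len(sample) // 2
    be_ratio = sample[0::2].count(0) / pairs  # zero high byte comes first
    le_ratio = sample[1::2].count(0) / pairs  # zero high byte comes second

    if be_ratio >= threshold and be_ratio > le_ratio:
        return "utf-16-be"
    if le_ratio >= threshold and le_ratio > be_ratio:
        return "utf-16-le"
    return None
```

With BOM-less input, Latin-script text such as "hello" has a zero in every second byte and is detected, while Japanese text like "日本語テキスト" contains no zero bytes at all, so the second heuristic returns None for it. That matches the limitation described above.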
    

I think BOM is not that safe because there are many charsets
that have normal letters at 0xFE and 0xFF.

But what are those characters, and are they likely to appear as a pair at the beginning of the file, and nowhere else?
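
One way to look at the first half of that question is to decode the big-endian BOM bytes FE FF in a few 8-bit charsets and see which letters come out. The sketch below uses Python's built-in codecs; the helper name `decode_bom_bytes` and the choice of charsets are just assumptions for illustration, and it says nothing about how likely such a pair is at the start of a real file.

```python
# Bytes of the UTF-16 big-endian byte order mark.
BOM_BE = b"\xfe\xff"


def decode_bom_bytes(charsets):
    """Map each charset name to the text that FE FF decodes to in it.

    Hypothetical helper; the charset names are Python codec names.
    """
    return {name: BOM_BE.decode(name) for name in charsets}


result = decode_bom_bytes(["latin-1", "cp1252", "cp1251"])
for name, text in result.items():
    # Show the decoded letters together with their code points.
    print(name, text, [f"U+{ord(ch):04X}" for ch in text])
```

In Latin-1 and Windows-1252 the pair decodes to "þÿ" (thorn, y with diaeresis), and in Windows-1251 to the Cyrillic letters "юя".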


