Re: Binary recognition is to narrow [new suggestion]
From: Jason Hood
Subject: Re: Binary recognition is to narrow [new suggestion]
Date: Sat, 21 Nov 2009 16:33:53 +1000
User-agent: Thunderbird 2.0.0.23 (Windows/20090812)

Erik Jonsson wrote:
> Instead of counting characters over 127, the only test is that the first
> 511 bytes don't contain any of the control characters 0-8 or 14-31. No
> normal text file would contain these.

No normal source file, but if you want to generalise to
text files, 8 (backspace) and 27 (escape) could probably
occur (man files being a prime example).
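
To make the proposed test concrete, here is a minimal Python sketch of it. The names `SUSPECT`, `PREFIX_LEN`, `looks_binary` and the `allowed` parameter are hypothetical, chosen only for illustration; this is not the project's actual code. The `allowed` argument shows one way backspace (8) and escape (27) could be exempted for general text files.

```python
# Control bytes that the proposal treats as evidence of a binary file:
# 0-8 and 14-31. Bytes 9-13 (tab, LF, VT, FF, CR) are ordinary whitespace.
SUSPECT = frozenset(range(0, 9)) | frozenset(range(14, 32))
PREFIX_LEN = 511  # number of leading bytes examined, as in the proposal


def looks_binary(data: bytes, allowed: frozenset = frozenset()) -> bool:
    """Return True if the examined prefix contains a suspect control byte.

    `allowed` is a hypothetical knob for exempting specific bytes,
    e.g. backspace (8) and escape (27).
    """
    suspect = SUSPECT - allowed
    return any(byte in suspect for byte in data[:PREFIX_LEN])


# Backspace overstriking, as found in formatted man page output.
MAN_STYLE = b"N\x08NA\x08AM\x08ME\x08E\n"
# An escape sequence, as found in text with terminal colour codes.
ANSI_STYLE = b"\x1b[1mbold\x1b[0m\n"
TEXT_EXEMPT = frozenset({8, 27})

if __name__ == "__main__":
    for sample in (b"int main(void) { return 0; }\n", MAN_STYLE, ANSI_STYLE):
        print(sample, looks_binary(sample), looks_binary(sample, TEXT_EXEMPT))
```

With the strict set, both the overstruck and the escape-coded samples are flagged as binary; with 8 and 27 exempted they pass as text, while plain source code passes either way.
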

> One of the benefits is that this will correctly tag Unicode files as
> text as well, since those control characters never appear in Unicode
> either.

I guess you mean UTF-8, since UTF-16/32 would most likely have
a few 0 bytes.
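
A short Python sketch of the encoding point, assuming a sample line of mostly ASCII text (the string and the helper name `has_suspect_byte` are made up for illustration): UTF-8 leaves ASCII bytes unchanged and encodes everything else with bytes of 128 and above, whereas UTF-16 and UTF-32 pad each ASCII character with zero bytes.

```python
# Bytes the proposed test rejects: 0-8 and 14-31.
SUSPECT = frozenset(range(0, 9)) | frozenset(range(14, 32))


def has_suspect_byte(data: bytes) -> bool:
    """Return True if any byte of `data` is in the rejected control range."""
    return any(byte in SUSPECT for byte in data)


# Hypothetical sample: ASCII source text with one non-ASCII character.
SAMPLE = "int naïve = 0;\n"

ENCODED = {
    "utf-8": SAMPLE.encode("utf-8"),
    "utf-16-le": SAMPLE.encode("utf-16-le"),
    "utf-32-le": SAMPLE.encode("utf-32-le"),
}

if __name__ == "__main__":
    for name, data in ENCODED.items():
        # Report the number of NUL bytes and whether the test would reject it.
        print(name, data.count(0), has_suspect_byte(data))
```

Only the UTF-8 encoding passes; the UTF-16 and UTF-32 encodings of the same text contain NUL bytes and would be tagged as binary.
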
Jason.