coding tags and utf-16
From: Werner LEMBERG
Subject: coding tags and utf-16
Date: Wed, 21 Dec 2005 09:00:33 +0100 (CET)
There is a serious problem with coding tags and utf-16 encodings of
any flavour: Emacs simply can't recognize the tag. This is a
non-trivial problem. Right now I'm working on a groff preprocessor
which tries to handle this. I'm doing the following to find the tag
in an encoding-independent way:

  . Check whether the file starts with the BOM (Byte Order Mark) --
    this is one of the following byte sequences:

      UTF-8:  0xEFBBBF
      UTF-16: 0xFEFF (big-endian) or 0xFFFE (little-endian)

    Skip it.

  . Ignore zero bytes while looking for the -*- coding: ... -*-
    stuff.

This heuristic algorithm might not give correct results in all cases
but it should be sufficiently reliable for normal use.
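
The two steps above can be sketched roughly as follows. This is only an
illustration in Python, not the groff preprocessor's actual code: the
function name find_coding_tag, the regular expression, and the limit of
two lines are assumptions made for the example.

```python
import re

# Byte order marks listed in the message: UTF-8, UTF-16 BE, UTF-16 LE.
BOMS = (b"\xef\xbb\xbf", b"\xfe\xff", b"\xff\xfe")

# Assumed pattern for a tag such as "-*- coding: utf-16 -*-"; the exact
# syntax accepted by a real implementation may differ.
TAG_RE = re.compile(rb"-\*-.*?coding:\s*([A-Za-z0-9._-]+).*?-\*-")


def find_coding_tag(data, max_lines=2):
    """Return the coding name from the first lines of data, or None.

    Hypothetical helper: data is the raw file content as bytes, and
    max_lines (an assumption) limits how far into the file we look.
    """
    # Step 1: skip a leading BOM if there is one.
    for bom in BOMS:
        if data.startswith(bom):
            data = data[len(bom):]
            break
    # Step 2: ignore zero bytes, so ASCII-range UTF-16 text in either
    # byte order collapses to plain ASCII.
    data = data.replace(b"\x00", b"")
    head = b"\n".join(data.split(b"\n")[:max_lines])
    match = TAG_RE.search(head)
    return match.group(1).decode("ascii") if match else None
```

Because the zero bytes are dropped before matching, the same search
works for UTF-8 and for both UTF-16 byte orders without knowing the
encoding in advance.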
Werner