Re: [Chicken-users] SSAX, utf8 and the byte order mark
From: John Cowan
Subject: Re: [Chicken-users] SSAX, utf8 and the byte order mark
Date: Fri, 5 Jan 2007 15:54:03 -0500
User-agent: Mutt/1.3.28i
Charles Breathe scripsit:
> In the process of trying to write a script in Chicken I attempted
> to use the SSAX XML->SXML function with a stream that begins with a
> UTF-8 byte order mark. Unfortunately the function dies when it reads
> the BOM. Currently I'm converting it to a stream and then filtering
> out the offending characters, but that seems terribly ugly. Is there a
> better approach? Is this something that the XML->SXML function should
> be handling itself?
[putting on XML Core Working Group hat]
This is an XML->SXML issue. For a long time it wasn't too clear whether
UTF-8-encoded XML documents were allowed to contain a BOM or not.
(UTF-16-encoded documents are required to begin with one.) As a result of an
erratum, the XML Recommendation now requires that UTF-8 BOMs be accepted
and ignored.
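
Until the parser does this itself, the "accept and ignore" rule can be
applied to the raw bytes before they reach it. What follows is only a
minimal sketch in Python, not SSAX or Chicken code; the function name
strip_utf8_bom is hypothetical and it assumes the whole document is
already in memory as bytes.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (bytes EF BB BF), if any.

    Hypothetical helper for illustration only. Only a BOM at the very
    start of the document is dropped; the same three bytes appearing
    later are left untouched.
    """
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data
```

A port-based version would do the same thing by checking the first
three bytes of the stream and consuming them only when they match.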
--
John Cowan    address@hidden    http://www.ccil.org/~cowan
Even the best of friends cannot attend each others' funeral.
        --Kehlog Albran, The Profit