From: Ito Kazumitsu
Subject: Re: [cp-patches] gnu/xml/transform/StreamSerializer.java: compatibilityMode setting
Date: 12 Mar 2005 22:56:27 -0000
User-agent: SEMI/1.13.7 (Awazu) FLIM/1.13.2 (Kasanui) Emacs/21.3.50 (i386-unknown-freebsd5.3) MULE/5.0 (SAKAKI)
In message "Re: [cp-patches] gnu/xml/transform/StreamSerializer.java:
compatibilityMode setting"
on 05/02/13, Chris Burdess <address@hidden> writes:
:> Unfortunately your patch is almost guaranteed to produce
:> non-well-formed XML.
OK, I do not insist on my patch, and I do not use the patched program
myself now: I use UTF-8 and "iconv -f UTF-8 -t EUC-JP".
From my own practical viewpoint, whether the produced XML is well-formed
is less important than whether it is compact and human-readable.
When I am handling Japanese text, I can assume that only Japanese and
ASCII characters appear in it.
I understand that a widely used system like GNU Classpath cannot take
this practical viewpoint and must make the safest choice.
:> I agree that compatibilityMode is a hack. What's really needed is a way
:> to detect whether a character is a valid member of a given encoding,
As for CJK characters, I cannot imagine such a way of testing a character
without having a table of all valid characters.
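The table does not have to live in the serializer itself. As a rough sketch,
assuming a runtime whose java.nio.charset implementation provides the target
encoding, the encoder can be asked whether it can encode each character, and
anything it rejects can be written as a numeric character reference. The class
and method names below are made up for illustration and are not part of GNU
Classpath; markup characters such as < and & and unpaired surrogates are not
handled here.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

public class EncodableEscaper {

    /**
     * Returns the text with every code point that the given charset
     * cannot encode replaced by a decimal character reference.
     */
    public static String escape(String text, Charset charset) {
        CharsetEncoder encoder = charset.newEncoder();
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            int length = Character.charCount(codePoint);
            CharSequence piece = text.subSequence(i, i + length);
            if (encoder.canEncode(piece)) {
                // The encoding has this character: write it as-is.
                out.append(piece);
            } else {
                // Not a member of the encoding: fall back to &#NNN;
                out.append("&#").append(codePoint).append(';');
            }
            i += length;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // US-ASCII is used here only because every Java runtime has it;
        // EUC-JP would be the charset of interest if it is installed.
        System.out.println(escape("caf\u00e9", Charset.forName("US-ASCII")));
    }
}
```

With such a test, Japanese text written as EUC-JP would stay compact, and only
the characters outside the encoding would be expanded.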
I used to use Saxon as an XSLT processor, and this is what Saxon does:
Saxon itself does not support character encodings other than standard
ones such as UTF-8 or ISO-8859-1, and relies on the java.nio.charset package
to handle general character encodings. In addition to that, Saxon provides
an API with which a user can write his own character set handler, which
tells whether a character is a valid member of a given encoding.
To satisfy my needs, I wrote my own Japanese character handler, which
tells the lie that all Unicode characters are Japanese characters,
just as I set compatibilityMode to true in
gnu/xml/transform/StreamSerializer.
I think this is a good design. Saxon itself stays free from the risk of
producing XML documents that are not well-formed, while users who take
responsibility for the result can do anything they like.
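To make the idea concrete, here is a minimal sketch of such a pluggable
handler. The interface and class names are hypothetical and are not Saxon's
actual API; they only show how a serializer could leave the membership decision
to user code, and how a handler that accepts everything switches escaping off.

```java
public class PermissiveCharsetDemo {

    /** Hypothetical membership test supplied by the user (not Saxon's interface). */
    public interface CharacterMembership {
        boolean inCharset(int codePoint);
    }

    /** The "lying" handler: claims every code point belongs to the encoding. */
    public static final class AcceptEverything implements CharacterMembership {
        public boolean inCharset(int codePoint) {
            return true;
        }
    }

    /** Writes members as-is and everything else as a character reference. */
    public static String serialize(String text, CharacterMembership set) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            if (set.inCharset(codePoint)) {
                out.appendCodePoint(codePoint);
            } else {
                out.append("&#").append(codePoint).append(';');
            }
            i += Character.charCount(codePoint);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // With the permissive handler nothing is escaped; whether the output
        // can really be encoded is then the user's responsibility.
        System.out.println(serialize("\u65e5\u672c\u8a9e", new AcceptEverything()));
    }
}
```

The serializer's default handler can stay strict, so the library never emits
an unencodable character by itself, and only a user who installs a handler like
AcceptEverything takes on that risk.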