Re: XML Blueberry
- From: David Brownell <email@example.com>
- To: John Cowan <firstname.lastname@example.org>,Vincent-Olivier Arsenault <email@example.com>
- Date: Thu, 21 Jun 2001 15:12:46 -0700
> > The backward-compatibility argument just doesn't hold: I'd be curious
> > to see how (or if) Java parsers (for instance) enforce the restrictions
> > to Unicode as specified in the XML spec. Aren't they just relying on the
> > Java platform to handle encoding?
Do you mean Appendix B conformance? Or decoding from whatever
binary form is used to represent the characters?
> Some parsers, at least, have their own tables.
Crimson augments some version of the Unicode rules (java.lang.Character, as
specified in the Java language spec) with special cases as identified in the
XML 1.0 spec ... so if Java changes its level of Unicode conformance, that
parser's behavior will change too. It was conformant to Appendix B a while back.
AElfred2 uses java.lang.Character directly, and doesn't try to add all the
funky special cases. That suits its original "mostly correct, but simple"
goals, but leads to mild nonconformance (nobody's complained!) with the
Appendix B rules about which characters can be name/name-start characters.
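To make the difference concrete, here's a sketch (not code from either parser) of the two approaches to a name-start check. The Appendix B ranges shown are deliberately abridged to the Latin-1 portion; the real tables in the XML 1.0 spec run to many pages:

```java
// Sketch of two name-start checks a Java XML parser might use.
// NameStartCheck is a hypothetical class, not from Crimson or AElfred2.
public class NameStartCheck {

    // AElfred2-style: lean on java.lang.Character, plus the two
    // punctuation characters XML explicitly allows in names.
    static boolean javaStyle(char c) {
        return Character.isLetter(c) || c == '_' || c == ':';
    }

    // Appendix B style (abridged): XML 1.0 enumerates exact ranges.
    // Only the ASCII and Latin-1 BaseChar ranges are shown here.
    static boolean appendixBStyle(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
            || c == '_' || c == ':'
            || (c >= 0x00C0 && c <= 0x00D6)   // BaseChar ranges (partial)
            || (c >= 0x00D8 && c <= 0x00F6)
            || (c >= 0x00F8 && c <= 0x00FF);
    }

    public static void main(String[] args) {
        // U+00AA (FEMININE ORDINAL INDICATOR) is a Unicode letter, so
        // Character.isLetter() accepts it -- but it's absent from the
        // Appendix B BaseChar ranges. One of those funky special cases.
        char c = '\u00AA';
        System.out.println("javaStyle:      " + javaStyle(c));
        System.out.println("appendixBStyle: " + appendixBStyle(c));
    }
}
```

A Character-based check accepts some letters Appendix B excludes (and, depending on the Unicode level the JVM tracks, can drift as Unicode evolves), which is exactly the mild nonconformance described above.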
I don't know what Xerces does, but when I first looked at it, the character
processing was incomprehensible, and also nonconformant. I understand
that the current versions are merely incomprehensible ... :)
p.s. Here's a radical thought. Rather than death by a thousand cuts,
why not just come out with a DTD-less XML? One big change,
not lots of small ones -- easier to manage such changes. Maybe
that's just kidding ... I'm not quite sure ...