At 7:21 PM -0800 1/21/04, Tim Bray wrote:
>1. Any output encoding other than UTF-8
>2. Optional escaping of illegal characters
>3. Prettyprinting support
>4. Various kinds of error workarounds, and turning errorchecking off
>5. Writing CDATA sections
>6. Writing XML declaration (good idea, but I want the output to be
>Canonical XML)
I think you're at best at 50% with these goals, maybe less. This is
certainly not at 80%.
My experience with JDOM, XOM, and other APIs for doing operations
like XInclude that ultimately result in a serialized XML document is
that users really, really want pretty-printing a lot of the time. If
the API won't do that, they will ignore it. I wouldn't pretty print by
default, but I would definitely include options for setting the
maximum line length and indent string.
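
To make the "off by default, but available" idea concrete, here is a
minimal sketch in Python's standard xml.etree.ElementTree (Python 3.9
or later for ET.indent). The serialize function and its indent
parameter are hypothetical names chosen for illustration, not any
library's API, and a maximum line length option is left out because
the standard library does not offer one.

```python
import xml.etree.ElementTree as ET


def serialize(root, indent=None):
    """Return the element as XML text.

    Hypothetical helper: output is compact unless the caller passes an
    indent string, in which case the tree is pretty-printed with it.
    """
    if indent is not None:
        # ET.indent modifies the tree in place, so work on a copy
        # to leave the caller's document untouched.
        root = ET.fromstring(ET.tostring(root))
        ET.indent(root, space=indent)
    return ET.tostring(root, encoding="unicode")


doc = ET.fromstring("<a><b>text</b><c/></a>")
compact = serialize(doc)             # single line, no added whitespace
pretty = serialize(doc, indent="\t")  # one element per line, tab indent
print(pretty)
```
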
Furthermore, I would make canonicalization an option if it's included
at all. It imposes a significant performance hit since you have to
sort the attributes and namespaces. Worse, it prevents full streaming
since the attributes and namespaces have to be buffered before you
sort them. But more importantly, probably half the time users don't
care about canonicalization. The other half the time they do care: to
be more specific, they actively don't want it. Canonical output is too
damn ugly and the lines are too long to work with. Plus there's no
XML declaration, which users like.
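
For a quick look at what canonical output does to a document, the
sketch below uses ET.canonicalize from Python's standard library
(Python 3.8 or later). Note the assumption: that function implements
Canonical XML 2.0 rather than the 1.0 version under discussion here,
but the effects mentioned above (sorted attributes, no XML
declaration) show up the same way for this input.

```python
import xml.etree.ElementTree as ET

source = '<?xml version="1.0"?><a z="1" b="2"/>'

# canonicalize() returns the canonical form as a string when no
# output file is given. All attributes of an element must be known
# before they can be written in sorted order.
canonical = ET.canonicalize(xml_data=source)

print(canonical)  # attributes sorted, declaration dropped, <a/> expanded
```
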
Similarly, users actively desire non-UTF-8 encodings. UTF-16,
Latin-1, SJIS, etc. are all much easier for some classes of users to
process with non-XML aware tools than is UTF-8 on today's systems.
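
As an illustration of encoding selection, here is a short sketch
using Python's standard xml.etree.ElementTree. The sample documents
are made up for the example. Passing an encoding name to ET.tostring
yields bytes in that encoding with a matching XML declaration, and a
character the encoding cannot represent is written as a numeric
character reference.

```python
import xml.etree.ElementTree as ET

note = ET.fromstring("<note>caf\u00e9 costs 3\u20ac</note>")

# Latin-1 can represent e-acute directly but not the euro sign,
# which is emitted as a numeric character reference instead.
latin1 = ET.tostring(note, encoding="iso-8859-1")

title = ET.fromstring("<title>\u65e5\u672c\u8a9e</title>")

# Shift_JIS output for Japanese text, readable in SJIS-only tools.
sjis = ET.tostring(title, encoding="shift_jis")

print(latin1)
print(sjis)
```
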
I think an XML output library needs to realize that opening files in
a plain text editor is still a very important use case a lot of the
time. Byte-by-byte comparison and digital signatures usually aren't
(and when they are the digital signature library will canonicalize
first). Requiring canonical XML while offering neither pretty printing
nor encoding selection does not give users the simple library that
does what they need. It is not sufficiently full featured to hit the
80-20 point.
--
Elliotte Rusty Harold
elharo@metalab.unc.edu
Effective XML (Addison-Wesley, 2003)
http://www.cafeconleche.org/books/effectivexml
http://www.amazon.com/exec/obidos/ISBN%3D0321150406/ref%3Dnosim/cafeaulaitA