Mike Champion wrote:
> So what if XML were "refactored" so that the bare-bones well-formed syntax
> (and/or data model, that's another issue!) were the common core, and DTD
> processing were at the next layer up?
I agree that this would be a good approach to refactoring XML, and it is
hinted at in the XML 1.0 Rec by its use of "standalone". What would be the
point of defining it except to allow a profile for a standalone document?
It is not much of a step from there to a category of parser labeled
"standalone".
In fact, most people probably follow this route when they create little
embedded parsers, for example in JavaScript before IE and Mozilla had
accessible parsing capabilities. The progression tends to go something like
this, I think -
1) Parse angle brackets only. Enforce basic well-formedness (i.e., ignoring
DTD and encoding issues). Use whatever character encoding the platform gives
you. Ignore DTDs. Maybe handle character references. (A rough sketch of this
step follows the list.)
2) Handle comments. Handle character references if not already handled.
3) Pick up (non-parameter) entities in the internal subset. Normalize
attribute whitespace if not already handled. Maybe handle default values.
Maybe handle PIs.
4) Discover encoding problems with a new source of input data, agonize over
them, and finally break down and try to handle encodings with whatever
built-in capabilities the language supplies.
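To make that concrete, here is a rough sketch of what step 1 (plus the
attribute-whitespace bit of step 3) might look like. It is only an
illustration, written in TypeScript rather than period JavaScript; the
function names and the regex tokenizing are mine, and it deliberately
ignores DTDs, comments, CDATA, PIs and encodings, just as step 1 does:

// Step 1 only: matched tags and a single root element. Attribute values
// containing ">" and markup inside comments will confuse it.
function isBasicallyWellFormed(xml: string): boolean {
  const tag = /<(\/?)([A-Za-z_:][\w:.-]*)[^<>]*?(\/?)>/g;
  const stack: string[] = [];
  let roots = 0;
  let m: RegExpExecArray | null;
  while ((m = tag.exec(xml)) !== null) {
    const [, closing, name, selfClosing] = m;
    if (closing === "/") {
      // An end tag must match the most recently opened start tag.
      if (stack.pop() !== name) return false;
      if (stack.length === 0) roots++;
    } else if (selfClosing === "/") {
      // Empty-element tag; counts as a root only at the top level.
      if (stack.length === 0) roots++;
    } else {
      stack.push(name);
    }
  }
  return stack.length === 0 && roots === 1;
}

// "Maybe handle character references": resolve the numeric ones only.
function decodeCharRefs(text: string): string {
  return text.replace(/&#(x[0-9A-Fa-f]+|\d+);/g, (_, num: string) =>
    String.fromCodePoint(
      num.startsWith("x") ? parseInt(num.slice(1), 16) : parseInt(num, 10)
    )
  );
}

// Step 3's "normalize attribute whitespace": with no DTD every attribute
// is effectively CDATA, so this is just tab/CR/LF to space.
function normalizeAttrValue(value: string): string {
  return value.replace(/[\t\r\n]/g, " ");
}

isBasicallyWellFormed('<doc><p>hi</p></doc>') comes back true, while a
mismatched or unclosed tag comes back false, which is about all the
well-formedness a step 1 parser promises.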
The progression is probably driven by trying to use the processor for more
and more input sources, simply because it is convenient to use.
Refactoring Mike's way would fit into this kind of progression nicely.
Cheers,
Tom P