Mike Champion <mc@xegesis.org> wrote:
| I don't think that the scenario in which all parsers anyone will
| encounter support DTDs is something we can simply assume. SOAP, and
| possibly XML 2.0 "core" don't/won't support anything specified in a DTD,
| so parsers optimized for a DTD-less profile will probably become more
| prevalent in the future.
The distaste for DTDs seems to be a politically correct knee-jerk reaction.
Quite apart from real and imagined flaws in their information content, I
sense a prejudice that processing DTDs is inherently "difficult" (and thus
something it might be useful to, um, "optimize away".) Can anyone offer a
substantive reference explaining in detail the problems in processing
DTDs, or is this yet another "truth" never to be examined? [Disclaimer:
you may find considerable sympathy if you bash parameter entities, but
only for the right reasons.]
It's worth considering what a so-called "DTD-less profile" involves.
1. No entity references.
While at it, the 5 predefined ones should be abolished too. IOW, poof the
syntactic device altogether. As far as the XML world is concerned, I
don't think this is a loss (until, of course, the device gets reinvented,
which, again of course, it will - but hey, it won't be NIH, it'll be new
New NEW!, and so it'll be k00l - see XInclude for the shape of things to
come.)
It isn't a loss because the average XML-wonk has no idea what entities and
references to them are for (as entity declarations are alien to experience
grounded in the tag soup happiness of Netploder.) Never learned about
'em, so never used 'em, so who needs 'em? Don't say sorry, be happy.
2. No data content notations.
Again no loss: the average XML wonk wouldn't know a notation if one bit
him. Must be some old-fogey SGML weirdness, poof it.
3. No default values for attributes.
This isn't so bad. Goal 10 of the XML spec institutionalizes verbosity,
in which can be included routine denormalization of everything in sight
(if something needs repeating everywhere, then repeat it everywhere - if
that bothers you, then write a program to do it for you.)
4. No ID declared value.
With the probable exception of entity references (see #1 above) this was,
AFAIR, the main reason to require that even non-validating parsers process
the internal subset - SGML syntax didn't have a purely syntactic means to
identify the ID attribute, and XML 1.0 wasn't about to invent one (though
it could have.)
Losing IDs may have implications for XPointer, but since there's no way
now to identify IDREF attributes either, this too is no loss.
So, the news is uniformly good for implementors. Go for it!
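For anyone who wants to see what items 1 and 3 amount to in practice, here
is a minimal sketch in Python. It assumes the standard library's expat-based
xml.etree.ElementTree, a non-validating parser that still reads the internal
subset. The element, entity and attribute names ("memo", "corp", "status")
are invented for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the names "memo", "corp" and "status" are invented.
WITH_SUBSET = """<!DOCTYPE memo [
  <!ENTITY corp "Example Widgets, Inc.">
  <!ATTLIST memo status CDATA "draft">
]>
<memo>From &corp; &amp; friends</memo>"""

# The same instance with the internal subset stripped off.
WITHOUT_SUBSET = "<memo>From &corp; &amp; friends</memo>"


def summarize(xml_source):
    """Return (text, status) of the root element, or None if parsing fails."""
    try:
        root = ET.fromstring(xml_source)
    except ET.ParseError:
        return None  # e.g. a reference to an entity that was never declared
    return root.text, root.get("status")


print(summarize(WITH_SUBSET))
print(summarize(WITHOUT_SUBSET))
```

With the internal subset present, the entity reference is expanded and the
defaulted attribute shows up on the element. Without it, the very same
instance is no longer even well-formed, because &corp; is undeclared.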