Arjun Ray wrote:
> The distaste for DTDs seems to be a politically correct kneejerk reaction.
> Quite apart from real and imagined flaws in their information content, I
> sense a prejudice that processing DTDs is inherently "difficult" (and thus
> something it might be useful to, um, "optimize away".)
Whether it is difficult or not, it takes processing time and other
resources. If you have a throughput of several hundred SOAP requests
per second, you'll be glad to optimize away everything that can
be optimized away. That holds in particular if you have to do some of
what a validator does anyway, only more thoroughly (checking that your
DB key is not only there but also fits your format restrictions), and
if you know that some validation tests have to be checked for but
are never actually triggered. In fact, really high performance web
services toolkits don't even use general purpose parsers; they
compile customized parsers from the service descriptions.
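As a rough sketch of the kind of application-side check meant here, the
following parses a request for well-formedness only and then tests the
key itself. The element names (request, key), the key format and the
function name extract_key are made up for illustration; nothing here is
taken from a real toolkit.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical key format: two uppercase letters followed by six digits.
KEY_FORMAT = re.compile(r"[A-Z]{2}[0-9]{6}")

def extract_key(request_xml):
    """Return the DB key from a request, or raise ValueError."""
    # Well-formedness check only; no DTD validation takes place here.
    root = ET.fromstring(request_xml)
    key = root.findtext("key")  # None if there is no <key> child
    if key is None:
        raise ValueError("missing <key> element")
    # A DTD could only say that <key> is present; the format check
    # has to be done by the application anyway.
    if not KEY_FORMAT.fullmatch(key):
        raise ValueError("malformed key: %r" % key)
    return key

print(extract_key("<request><key>AB123456</key></request>"))
```

The presence test and the format test end up in the same place, so a
separate validation pass would repeat work the application does anyway.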
Another interesting case is XML pipelines, for example plumbing
an FO processor to an XSLT processor (rather common), or Cocoon's
pipelines. DTDs don't fit in there well, at least as long as it is
universally assumed that validation is an integral part of the
XML parser.
> you may find considerable sympathy if you bash parameter entities, but
> only for the right reasons.]
That's interesting. How do you write large, complex DTDs without
parameter entities and keep them maintainable? Disclaimer: this
doesn't mean I like parameter entities. XSchema has types for the
purposes for which I have to use parameter entities in DTDs. More disclaimer:
I don't like XSchema very well either, but it solves some problems
I had with DTDs.
> 1. No entity references.
Good! Should be enacted immediately! (with some amendments regarding
character references)
> While at it, the 5 predefined ones should be abolished too. IOW, poof the
> syntactic device altogether. As far as the XML world is concerned, I
> don't think this is a loss (until, of course, the device gets reinvented,
> which, again of course, it will - but hey, it won't be NIH, it'll be new
> New NEW!, and so it'll be k00l - see XInclude for the shape of things to
> come.)
XInclude is not the same as an entity reference:
http://www.w3.org/TR/xinclude/#rel-extent
And it fits into a DTD-less world, of course :-)
From an XSLT perspective, XInclude is much better than entities.
Sometimes you don't want to have the references expanded, either
because you want to process them yourself, or because you want to
pass them downstream. With XInclude, and supposing you can tell the
upstream XML processor (the parser) to leave the XIncludes alone,
the XSLT processor sees just another XML element, and it can be
processed like any other. In the case of entities, you face some
difficulties, because there are no "entity nodes" or anything similar in
the (logical) XML data model. XPath and XSLT would have to be
extended to cope with this, and the XQuery people would probably
find "entity nodes" annoying.
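The difference can be seen with any tree API, not only XSLT. A small
sketch using Python's xml.etree.ElementTree, which does not process
XInclude unless asked to: the include element survives as an ordinary
element, while the entity reference is already gone by the time the
tree is built. The file name chapter1.xml and the entity name are made
up, and nothing is actually fetched.

```python
import xml.etree.ElementTree as ET

# Qualified name of xi:include in ElementTree's {namespace}local notation.
XI_INCLUDE = "{http://www.w3.org/2001/XInclude}include"

with_xinclude = (
    '<doc xmlns:xi="http://www.w3.org/2001/XInclude">'
    '<xi:include href="chapter1.xml"/>'
    '</doc>'
)
with_entity = (
    '<!DOCTYPE doc [<!ENTITY chapter1 "expanded text">]>'
    '<doc>&chapter1;</doc>'
)

xi_doc = ET.fromstring(with_xinclude)
ent_doc = ET.fromstring(with_entity)

# The include is just another element: it can be inspected, rewritten,
# or passed downstream untouched.
include = xi_doc.find(XI_INCLUDE)
print(include.tag, include.get("href"))

# The entity reference left no trace in the tree, only its replacement text.
print(ent_doc.text)
```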
> 2. No data content notations.
>
> Again no loss: the average XML wonk wouldn't know a notation if one bit
> him. Must be some old-fogey SGML weirdness, poof it.
The rest of the WWW uses MIME types. Why force everybody to
declare their content types themselves, possibly in an incompatible
way?
> 3. No default values for attributes.
>
> This isn't so bad. Goal 10 of the XML spec institutionalizes verbosity,
> in which can be included routine denormalization of everything in sight
> (if something needs repeating everywhere, then repeat it everywhere - if
> that bothers you, then write a program to do it for you.)
Hmm. XSLFO seems to have defaults for its properties (which are
syntactically expressed as attributes). Interestingly, I've never
heard anyone complain that there should be an FO DTD with the
proper defaults coded in (which is impossible to do for all cases
anyway, because of some rather quirky dependencies). And nobody
writes *every* possible attribute into an FO document; rather, they
seem to rely on the defaults built into the FO processor. Same for
the SOAP/webservices world: the application takes care of it.
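A minimal sketch of "the application takes care of it": the defaults
live in the processing code and are merged with whatever attributes the
document actually carries. The property names and values in DEFAULTS
and the function name effective_properties are illustrative only, and
the sketch ignores inheritance and the quirky dependencies mentioned
above.

```python
import xml.etree.ElementTree as ET

# Illustrative defaults only; a real FO processor knows many more
# properties and computes some defaults from other properties.
DEFAULTS = {"font-size": "12pt", "text-align": "start"}

def effective_properties(element):
    """Merge built-in defaults with the attributes present on the element."""
    props = dict(DEFAULTS)        # start from the application's defaults
    props.update(element.attrib)  # explicit attributes override them
    return props

block = ET.fromstring('<block font-size="10pt"/>')
props = effective_properties(block)
print(props["font-size"], props["text-align"])
```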
> 4. No ID declared value.
>
> With the probable exception of entity references (see #1 above) this was,
> AFAIR, the main reason to require that even non-validating parsers process
> the internal subset - SGML syntax didn't have a purely syntactic means to
> identify the ID attribute, and XML 1.0 wasn't about to invent one (though
> it could have.)
>
> Losing IDs may have implications for XPointer, but since there's no way
> now to identify IDREF attributes either, this too is no loss.
When I started with XML I thought IDs were great. Some years
later and with routine processes involving merging and
splitting XML documents, IDs have lost their appeal. I need
names which are unique over a *set* of documents. IDs are not
sufficient for this. And XSLT has xsl:key/key(), which is
much more versatile than ID/id() anyway. Nobody in the XSLT
community would be hurt if IDs were dropped; many wouldn't
even notice. This doesn't mean that xsl:key doesn't have
deficiencies. Also, I regret that XSLT and XSchema have
slightly different views on what a key is and what it's good
for. But then, life has always been difficult...
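To illustrate the idea of a key as opposed to an ID, here is a loose
Python analogy: an index from an arbitrary attribute value to the
matching nodes, where values need not be unique and no declaration in a
DTD is required. This is not XSLT semantics; the function build_key,
the element and attribute names, and the choice to build one index over
several documents are assumptions made for the sketch.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

def build_key(documents, match, use):
    """Index elements named `match` by the value of their `use` attribute."""
    index = defaultdict(list)
    for root in documents:
        for node in root.iter(match):
            value = node.get(use)
            if value is not None:
                # Unlike an ID, a key value may occur more than once.
                index[value].append(node)
    return index

doc_a = ET.fromstring('<parts><part name="bolt" qty="4"/></parts>')
doc_b = ET.fromstring(
    '<parts><part name="nut" qty="4"/><part name="bolt" qty="2"/></parts>'
)

by_name = build_key([doc_a, doc_b], "part", "name")
print([p.get("qty") for p in by_name["bolt"]])
```

The same function can index by any other attribute (for example qty)
without touching the documents, which is something a declared ID
attribute cannot offer.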
> So, the news is uniformly good for implementors. Go for it!
Yes! Yes! YES! :-)
J.Pietschmann