- From: keshlam@us.ibm.com
- To: xml-dev@lists.xml.org
- Date: Thu, 10 Aug 2000 09:22:44 -0400
I think I'm going to drop out of this thread again, as it's rehashing
ground that has been VERY thoroughly covered in the past...
Namespaces are NAME spaces. That's all. Folks read all sorts of other
implications into namespaces, and everyone has their own ideas about what
they want to do with namespaces, but namespaces themselves (a) are just
naming and (b) make no particular effort to be compatible with DTD
validation. There was a fairly explicit assumption that if you were using
Namespaces, you would either work with well-formed documents or with a
namespace-aware schema language, _NOT_ with DTDs. If you insist on mixing
the two, and it doesn't work well, the response is going to be "we know."
Namespaces don't do anything to make documents that mix separately defined
tagsets easier to validate. All they do is improve your ability to tell
which set a given tag was intended to belong to. You can use them to tell
whether Gettysburg Address is a speech, a point in memory, or a place you
can send mail to... but determining which of these makes sense at any given
point in your document is not Namespaces' problem.
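
As a minimal sketch of that point, assuming three made-up namespace URIs
(the example.org names below are invented for illustration), Python's
standard ElementTree parser reports each element's expanded name, so the
three "address" elements are easy to tell apart. Deciding which of them is
legal at that spot is still left to someone else:

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, invented purely for this illustration.
DOC = """\
<doc xmlns:sp="http://example.org/speeches"
     xmlns:mem="http://example.org/memory"
     xmlns:post="http://example.org/postal">
  <sp:address>Four score and seven years ago...</sp:address>
  <mem:address>0x7ffe0010</mem:address>
  <post:address>Gettysburg, PA</post:address>
</doc>
"""

def split_expanded_name(tag):
    """Split ElementTree's '{uri}local' form into (uri, local)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return None, tag

root = ET.fromstring(DOC)
names = [split_expanded_name(child.tag) for child in root]
for uri, local in names:
    # Same local name each time; only the namespace URI differs.
    print(local, "belongs to", uri)
```
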
DTDs are not aware of namespaces. They look only at the QName. Thus
namespaces gain you nothing with respect to DTD-based validation, and in
fact their declaration that the QName is not the "real" name of the element
makes DTDs fairly explicitly the wrong tool for Namespaced documents. You
can force-fit these, but in general it Really Isn't worth the effort. (I've
done so as a stopgap, but I expect to discard those inadequate DTDs as
rapidly as possible in favor of namespace-aware schemas.)
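
A rough illustration of the QName problem, using Python's standard expat
binding rather than an actual DTD validator (the helper name and the
example.org URI are assumptions made up for this sketch). A parser that
ignores namespaces sees only the QName, which is the view a DTD has; a
namespace-aware parser sees the expanded name:

```python
import xml.parsers.expat

def start_tag_names(xml_text, namespace_aware):
    """Return the element names expat reports for each start tag."""
    names = []
    if namespace_aware:
        # With a separator set, expat reports names as "uri local".
        parser = xml.parsers.expat.ParserCreate(namespace_separator=" ")
    else:
        # Without it, expat reports the raw QName, prefix and all.
        parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.Parse(xml_text, True)
    return names

# Two documents that differ only in the prefix they chose.
A = '<a:speech xmlns:a="http://example.org/speeches"/>'
B = '<b:speech xmlns:b="http://example.org/speeches"/>'

# QName view: the two documents use different element names.
print(start_tag_names(A, False), start_tag_names(B, False))
# Namespace view: the two documents use the same element name.
print(start_tag_names(A, True), start_tag_names(B, True))
```

A DTD written for a:speech would reject the second document even though,
as far as Namespaces are concerned, the two are the same element.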
Remember that DTD/Schema validation is OPTIONAL, and was never intended to
be a complete solution. If you really intend to arbitrarily intermix
elements from multiple tagsets, the answer may be that none of these
content modelling languages is adequate to capture all the logic describing
what's permitted where and when. In that case the right answer is to stick
with well-formed documents, and move the "validation" logic back into your
application.
If the tagsets have a clear boundary between them -- in other words, if you
aren't attempting to permit completely arbitrary interleaving of languages
-- there's also the solution of using ANY in selected places and having
your application logic explicitly check those to apply more intelligence at
those specific points.
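
One way the application side of that might look, sketched under some
loud assumptions: a hypothetical host vocabulary whose "extension" element
is the one declared ANY, and a made-up list of guest namespaces the
application is willing to accept there. The DTD itself is not consulted
here; this is only the extra check the application would run afterwards:

```python
import xml.etree.ElementTree as ET

# Hypothetical names, invented for this sketch.
HOST_NS = "http://example.org/host"
ALLOWED_GUEST_NS = {"http://example.org/math"}
EXTENSION_TAG = "{%s}extension" % HOST_NS

def namespace_of(tag):
    """Return the namespace URI from a '{uri}local' tag, or None."""
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else None

def check_extension_points(root):
    """Return the tags of children that are not allowed at extension points."""
    problems = []
    for ext in root.iter(EXTENSION_TAG):
        for child in ext:
            if namespace_of(child.tag) not in ALLOWED_GUEST_NS:
                problems.append(child.tag)
    return problems

GOOD = """\
<h:doc xmlns:h="http://example.org/host" xmlns:m="http://example.org/math">
  <h:extension><m:formula/></h:extension>
</h:doc>
"""
BAD = """\
<h:doc xmlns:h="http://example.org/host" xmlns:x="http://example.org/other">
  <h:extension><x:widget/></h:extension>
</h:doc>
"""

print(check_extension_points(ET.fromstring(GOOD)))  # nothing to report
print(check_extension_points(ET.fromstring(BAD)))   # the unexpected widget
```
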
Other validation schemes are certainly possible; that's a large part of
what the XML Schema effort is all about (and Relax, and others that have
been proposed). In some of those, it _is_ possible to do more specific
things than ANY while still allowing some kinds of language intermixing.
But the more specific your validation constraints are, the less
flexibility you have... and the more flexibility you have, the less
meaningful validation is. That's probably unavoidable, given that these
schema languages validate strictly against the document's syntax and don't
understand its semantics. Again, if you need semantic awareness, that has
to be done in your application.
Basically, the "publisher's gumbo" model of saying "I want to be able to
drop MathML tags anywhere inside an SVG graphic embedded anywhere within
an HTML document -- or vice versa" just doesn't fit the concept of
Validation in the first place. If you really want the parser to provide
some validation assistance, you need to give it a specific grammar for the
document. If you can't describe that grammar precisely, validation can't do
anything for you.
Yes, XML Schema leaves some things out, entities being the most-cited case.
I don't know that I have an opinion on this, though I consider the current
handling of entities a bit inconsistent. The workaround is to use DTD
syntax to declare your entities, then run schema validation against the
resulting Infoset. (Note that you _can_ validate against both DTD and
schema, though I presume that will be rare.) Whether that workaround
_should_ be necessary is something to take up with the Schema working
group, though it's getting rather late to do so.
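
A small sketch of the first half of that workaround, with the caveat that
Python's standard library has no XML Schema validator, so the schema step
is only stood in for by a trivial check. The entity name, the namespace
URI, and the check itself are assumptions invented for the example. The
point is just that an entity declared in DTD syntax is already expanded by
the time anything downstream looks at the parsed document:

```python
import xml.etree.ElementTree as ET

# Internal DTD subset used only to declare an entity; names are hypothetical.
DOC = """\
<!DOCTYPE note [
  <!ENTITY company "Example Corp">
]>
<note xmlns="http://example.org/notes">Sent by &company;.</note>
"""

root = ET.fromstring(DOC)

# The parsed tree holds the replacement text, not the entity reference.
print(root.tag)
print(root.text)

def stand_in_schema_check(element):
    """Placeholder for a real namespace-aware schema validator (assumption)."""
    return (element.tag == "{http://example.org/notes}note"
            and "&" not in (element.text or ""))

print(stand_in_schema_check(root))
```
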
______________________________________
Joe Kesselman / IBM Research