> First of all, the XML specification,
> http://www.w3.org/TR/2000/REC-xml-20001006, not only dictates the XML syntax,
> but also includes the DTD language. Put it in another way; the spec is XML
> syntax plus a schema language. Is that a correct interpretation/perspective?
Yes, it has both.
> For example, a validating XML parser has
> nothing to do with W3C XML Schema or any other schema language; it only
> checks against the DTD.
Right, of course the XML Schema specification defines new terms
"schema-valid" and "schema-validity".
> My conclusion from this reasoning, which surely can be wrong, is that, for
> example, an XHTML document which looks like this:
> <?xml version="1.0" encoding="UTF-8" ?>
> <html xmlns="http://www.w3.org/1999/xhtml">
> is _not_ valid XML, because it doesn't have a DOCTYPE thingy. Even
> says it: http://www.w3.org/TR/xhtml1/#docconf.
Right, there is no DTD so validity is unknown per se. Now, an
application that uses an XML parser may modify the document by adding in
the DOCTYPE (for example, SAX processors allow you to insert an external
> In other words, the XML specification ties the term "validness" to a
> particular syntax, the DTD schema language. It's not an abstract term which
> means "conforms to a model of the XML format", such that any schema language
> can be used to "define" validness.
In the strict technical sense of the term "valid", yes. However,
people also use the term when referring to XML Schema validity,
RELAX NG validity, and so on.
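
To make the well-formed versus valid distinction concrete, here is a minimal Python sketch using the standard library's minidom, which is a non-validating parser. The two sample documents and the describe() helper are made up for illustration. The point is only that a document without a DOCTYPE parses fine as well-formed XML, while nothing in it names a DTD to check validity against.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Hypothetical sample documents; the first one mirrors the DOCTYPE-less
# XHTML document from the question.
BODY = ('<html xmlns="http://www.w3.org/1999/xhtml">'
        '<head><title>t</title></head><body/></html>')
WITHOUT_DOCTYPE = '<?xml version="1.0" encoding="UTF-8"?>' + BODY
WITH_DOCTYPE = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">' + BODY
)


def describe(xml_text):
    """Report well-formedness and whether a DOCTYPE is present.

    minidom does not validate, so this says nothing about validity itself;
    it only shows whether there is a DTD that validity could be judged by.
    """
    try:
        doc = minidom.parseString(xml_text)
    except ExpatError:
        return {"well_formed": False, "has_doctype": False, "public_id": None}
    doctype = doc.doctype  # None when the document has no DOCTYPE
    return {
        "well_formed": True,
        "has_doctype": doctype is not None,
        "public_id": doctype.publicId if doctype is not None else None,
    }


print(describe(WITHOUT_DOCTYPE))
print(describe(WITH_DOCTYPE))
```

Both documents are well-formed; only the second one gives a validating parser something to validate against.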
> But does it matter if a document is Not valid? What if one walks out in this
> world writing/generating documents which are not valid?
This is perfectly legal. However, there are many situations where
the validity of a document is assumed by a business process. For example,
you and a trading partner are passing documents back and forth, and that
trading partner assumes your document is constructed according to an
agreed-upon DTD. Validity ensures this. Likewise, if you construct an
invalid XHTML document, for example by adding a <dog/> element, there is a
good chance that a browser won't understand it. Of course, most
browsers are pretty lenient.
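
As a rough illustration of the <dog/> case, the Python sketch below checks element names against a small, hand-written vocabulary. This is only a stand-in for the idea, not DTD validation: the KNOWN_ELEMENTS set and the undeclared_elements() helper are hypothetical, and a real validating parser would also check nesting and attributes against the actual DTD.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical, heavily abridged vocabulary. The real XHTML DTDs declare
# many more elements and also constrain nesting and attributes.
KNOWN_ELEMENTS = {"html", "head", "title", "body", "p"}

# Well-formed, but it uses an element the vocabulary does not declare.
DOC = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    '<head><title>Pets</title></head>'
    '<body><p>Hi</p><dog/></body></html>'
)


def undeclared_elements(xml_text):
    """Return local names of elements outside the agreed vocabulary."""
    root = ET.fromstring(xml_text)  # raises ParseError if not well-formed
    found = []
    for elem in root.iter():
        # ElementTree spells namespaced tags as "{namespace}localname".
        if elem.tag.startswith("{"):
            ns, local = elem.tag[1:].split("}", 1)
        else:
            ns, local = "", elem.tag
        if ns != XHTML_NS or local not in KNOWN_ELEMENTS:
            found.append(local)
    return found


print(undeclared_elements(DOC))
```

The parser accepts the document without complaint; it is the extra check against the agreed vocabulary that flags dog.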
> The DTD specifies the logical structure, but also identifies /what/ type of
> document it is, e.g. the public identifier. But isn't XML namespaces for
> this? When the XHTML namespace is declared in the xhtml element, wouldn't
> that be enough?
This is not what namespaces are for. They only distinguish
elements from different vocabularies used in the same document, whereas the
public identifier applies to the whole document. Now, both public identifiers
and namespaces have been used to point to a catalog of schemata for
validation (e.g., see RDDL), but that is another matter.
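
A small Python sketch of the per-element nature of namespaces, using the standard library's ElementTree. The compound document and the http://example.org/invoice namespace are made up for illustration. Each element carries its own namespace, so a single document can mix vocabularies, which is why a namespace declaration on the root does not identify the type of the document as a whole.

```python
import xml.etree.ElementTree as ET

# Hypothetical compound document: XHTML with one element from a made-up
# invoice vocabulary embedded in it.
DOC = (
    '<html xmlns="http://www.w3.org/1999/xhtml" '
    'xmlns:inv="http://example.org/invoice">'
    '<body><p>Total:</p><inv:total currency="EUR">42</inv:total></body>'
    '</html>'
)


def namespaces_by_element(xml_text):
    """Return (local name, namespace URI) pairs in document order."""
    root = ET.fromstring(xml_text)
    pairs = []
    for elem in root.iter():
        # ElementTree spells namespaced tags as "{namespace}localname".
        if elem.tag.startswith("{"):
            ns, local = elem.tag[1:].split("}", 1)
        else:
            ns, local = "", elem.tag
        pairs.append((local, ns))
    return pairs


for local, ns in namespaces_by_element(DOC):
    print(local, ns)
```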