OASIS Mailing List Archives


   Re: [xml-dev] Validness, doctype, and Schema Instance

> First of all, the XML specification, 
> http://www.w3.org/TR/2000/REC-xml-20001006, not only dictates the XML syntax, 
> but also includes the DTD language. Put another way, the spec is XML 
> syntax plus a schema language. Is that a correct interpretation/perspective?

Yes, it has both.

> For example, a validating XML parser has 
> nothing to do with W3C XML Schema or any other schema language; it only 
> checks against the DTD.

Right. Of course, the XML Schema specification defines the new terms 
"schema-valid" and "schema-validity".

> My conclusion from this reasoning, which surely can be wrong, is that for 
> example an XHTML document which looks like this:
> <?xml version="1.0" encoding="UTF-8" ?>
> <html xmlns="http://www.w3.org/1999/xhtml">
>    ...
> </html>
> is _not_ valid XML, because it doesn't have a DOCTYPE thingy. Even the spec
> says it: http://www.w3.org/TR/xhtml1/#docconf.

Right, there is no DTD so validity is unknown per se. Now, an 
application that uses an XML parser may modify the document by adding in 
the DOCTYPE (for example, SAX processors allow you to insert an external 
DTD subset).
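
To make the well-formed versus valid distinction concrete, here is a 
minimal Python sketch. It assumes the standard-library xml.dom.minidom 
module, which sits on a non-validating parser (expat), so it can only 
report whether a document is well-formed and whether it carries a DOCTYPE 
at all. It never checks the document against the DTD. The helper name 
check and the sample documents are made up for illustration.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# The document from the question: well-formed, but no DOCTYPE.
NO_DOCTYPE = b"""<?xml version="1.0" encoding="UTF-8" ?>
<html xmlns="http://www.w3.org/1999/xhtml">
   ...
</html>
"""

# Hypothetical counterpart that does declare a DTD.
WITH_DOCTYPE = b"""<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>t</title></head>
  <body/>
</html>
"""

# Mismatched tags: not even well-formed.
BROKEN = b"<html><body></html>"


def check(xml_bytes):
    """Return (well_formed, has_doctype) for a document.

    This is NOT validation: the parser used here does not read
    the DTD, it only records that a DOCTYPE declaration exists.
    """
    try:
        doc = minidom.parseString(xml_bytes)
    except ExpatError:
        return False, False
    return True, doc.doctype is not None


print(check(NO_DOCTYPE))    # (True, False)
print(check(WITH_DOCTYPE))  # (True, True)
print(check(BROKEN))        # (False, False)
```

A result of (True, False) is the case discussed above: the parser is 
happy, but there is no DTD to be valid against.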

> In other words, the XML specification ties the term "validness" to a 
> particular syntax, the DTD schema language. It's not an abstract term which 
> means "conforms to a model of the XML format", such that any schema language 
> can be used to "define" validness.

In a very technical and correct use of the term valid, yes. However, 
people use the term valid when referring to XML Schema validity and 
RELAX NG validity and so on.

> But does it matter if a document is Not valid? What if one walks out in this 
> world writing/generating documents which are not valid?

This is perfectly legal. However, there are a lot of situations where 
the validity of a document is assumed by a business process. For example, 
suppose you and a trading partner are passing documents back and forth. That 
trading partner assumes your document is constructed according to an 
agreed-upon DTD, and validity ensures this. Likewise, if you construct an 
invalid XHTML document, for example by adding a <dog/> element, there is a 
good chance that a browser won't understand it. Of course, most browsers 
are pretty lenient.

> The DTD specifies the logical structure, but also identifies /what/ type of 
> document it is, e.g. the public identifier. But isn't XML namespaces for 
> this? When the XHTML namespace is declared in the xhtml element, wouldn't 
> that be enough?

This is not what namespaces are for. They are only for distinguishing 
elements from different vocabularies used in the same document, whereas the 
public identifier applies to the whole document. Now, both public 
identifiers and namespaces have been used to point to a catalog of schemata 
for validating (e.g., see RDDL), but that is another matter.
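
A small Python sketch of the difference, again assuming the 
standard-library xml.dom.minidom module (non-validating, and it does not 
fetch the external DTD). The sample document and the helper name describe 
are hypothetical. The point is that the public identifier is a single value 
for the whole document, while each element carries its own namespace.

```python
from xml.dom import minidom

# Hypothetical document mixing two vocabularies (XHTML and SVG).
DOC = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Mixed vocabularies</title></head>
  <body>
    <p>A circle:</p>
    <svg xmlns="http://www.w3.org/2000/svg"><circle r="10"/></svg>
  </body>
</html>
"""


def describe(xml_bytes):
    """Return (public_id, {namespace_uri: [local names in document order]})."""
    doc = minidom.parseString(xml_bytes)
    # One public identifier per document (None if there is no DOCTYPE).
    public_id = doc.doctype.publicId if doc.doctype is not None else None
    # One namespace per element; group local names by namespace URI.
    by_namespace = {}
    for element in doc.getElementsByTagName("*"):
        by_namespace.setdefault(element.namespaceURI, []).append(element.localName)
    return public_id, by_namespace


public_id, by_namespace = describe(DOC)
print(public_id)
for uri, names in by_namespace.items():
    print(uri, names)
```

Here the XHTML public identifier describes the document as a whole, even 
though two of its elements belong to the SVG namespace.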

Jeff Rafter



Copyright 2001 XML.org. This site is hosted by OASIS