OASIS Mailing List Archives



   What are Schemas For?


tblanchard@mac.com wrote:
>> This is both human readable and computer processable. There are a 
>> variety of similar techniques. Now what's the equivalent for TSV, 
>> PLists, etc.?
> Yeah, but (almost) nobody can understand them, you still have to write a 
> bunch of custom code to process them.  

It is not true that almost nobody can understand schemas/dtds. Sure, 
there are some complex corners of XSD but you can easily learn the 
_basics_ in a day, and thousands or tens of thousands of people have.

> ... I'd quit bringing up the schema 
> thing.  I don't see it used much at all (reading the list of "types" in 
> schema I can see why).  And what's the benefit of having the schema 
> machine readable?  Validation?  It only provides syntactic validation.  
> You still need to do your own semantic validation.
> Schema value == 0.

First, you just said that the schema does syntactic validation. So that 
means that the schema's value is greater than zero. It does something 
that otherwise _every implementor_ would have to do in code.

Second, the level of validation that is syntactic versus semantic will 
vary widely for the language. Once you say that the first thing in a 
section must be a title, what else is there to say? "A title is a short 
string that summarizes the section for future reference." On the other 
hand, if you put a statically typed programming language in XML, then 
the semantic validation would greatly outweigh the syntactic one (which 
is one of several reasons that I wouldn't encourage that project).
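
To make the "first thing in a section must be a title" rule concrete, here is a minimal XSD sketch. The element names `section`, `title`, and `para` are assumptions chosen for illustration; they do not come from any particular vocabulary.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical vocabulary: a section is a title followed by paragraphs. -->
  <xs:element name="section">
    <xs:complexType>
      <xs:sequence>
        <!-- The title must come first and appear exactly once. -->
        <xs:element name="title" type="xs:string"/>
        <!-- Zero or more paragraphs may follow the title. -->
        <xs:element name="para" type="xs:string"
                    minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

A validating parser given this schema should reject a `section` whose first child is not a `title`, so no implementor has to write that check by hand. What a title _means_ is still left to prose.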

Third, you are wrong that the schema does not enforce any semantics. I 
think that the fact that a datatype must be a floating point number 
between 1 and 100 is a semantic, not syntactic issue. And beyond that, 
there are schema languages _specifically designed_ for handling semantic 
constraints.

Nevertheless, I acknowledge that there will probably always be some set 
of constraints that can only be expressed in prose and that subset 
(small or large, depending on your particular vocabulary) will have to 
be implemented in code. That doesn't remove the value that is provided 
by the subset that you _didn't_ have to implement in code.
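
As an illustration of the range constraint mentioned under the third point, a floating point number between 1 and 100 might be declared in XSD roughly as follows. The type name `boundedFloat` and the element name `score` are hypothetical, and treating both bounds as inclusive is an assumption.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical type: a float restricted to the range 1..100. -->
  <xs:simpleType name="boundedFloat">
    <xs:restriction base="xs:float">
      <!-- Assumed inclusive bounds; use minExclusive/maxExclusive otherwise. -->
      <xs:minInclusive value="1"/>
      <xs:maxInclusive value="100"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Hypothetical element that uses the restricted type. -->
  <xs:element name="score" type="boundedFloat"/>
</xs:schema>
```

With a declaration along these lines, `<score>42.5</score>` would validate while `<score>250</score>` or `<score>high</score>` would not, and that is one less check left for application code.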

Fourth, there are whole classes of products that depend heavily on the 
schema to drive their behaviour. Everything from XML word processors to 
data binding tools to forms tools to parts of MS Office 11.

Fifth, the schema is useful to people who are NOT implementors. When Big 
Vendor A sends syntactically wrong data to the program developed by 
Little Vendor B, there is a tendency for customers to blame Little 
Vendor B. The schema allows the little vendor to _easily demonstrate_ in 
a non-refutable way that they conform to the standard. Not by getting into 
language wars over the meaning of potentially ambiguous prose in the 
specification, but by submitting the document and the schema to a 
neutral third party, the XML validator. Of course this won't always work 
and sometimes the Little Vendor will still lose the business, but there 
is nevertheless a quantitative shift in power when it becomes _easier_ 
to validate conformance to a standard without hunting for potentially 
ambiguous prose.

Sixth, the existence of schema languages (and XML in general) puts 
pressure on vendors to open up and document their file formats. What 
was, historically, a moral obligation becomes also a technical one.

> It's still a dump of internal data.  BTW, I see some ballyhooage about MS 
> using XML for Office.  You know what MSXML looks like?
> <data7>AB373947F879874983792283787AC5E</data7>

Your theory is at odds with the reports of people who have reviewed the 
product and also at odds with the published claims of Microsoft.

> It's just marked up chunks of base 64 binary.  XML isn't magic and 
> doesn't guarantee interop.  In fact, I'm quite sure MSXML will still 
> prevent it somehow.

Just FYI, "MSXML" is the name of an XML toolkit implementation. And as 
far as I know, it is pretty standards compliant.

> Or not.  The preference is not.  And I showed you the pointlessness of 
> XMLizing them.  They did the XML to please the zealots.

Or perhaps for compatibility with hundreds of software tools?

>> I'll ask again: what's the schema language for plists?
> English.  It's all you need.

Maybe it is all _you_ need. But I've been using schemas (was: DTDs) for 
more than eight years so please trust me when I say that _I_ need them.

  Paul Prescod



Copyright 2001 XML.org. This site is hosted by OASIS