OASIS Mailing List Archives



RE: [xml-dev] Wikipedia on XML

On Mon, 10 Aug 2009 03:20:27 +0100, Michael Kay wrote:
> There may be millions of markup languages that allow <a/> as a document, or
> there may be none. It doesn't matter: <a/> is a well-formed XML document
> regardless. I think this stuff about XML being used to define other markup
> languages is a very confusing way of explaining things to newcomers. The
> first thing to get across is that <a/> is (a document allowed by the rules
> of) XML.


It's probably not worthwhile to draw the distinction, in the article, 
between "well-formedness" and "validity", but it's probably important 
to have it in mind while writing the article.

When people describe XML as a meta-language, or a syntax or language 
for defining markup languages, they are, in effect, talking about 
validity: documents are valid or invalid according to some schema (or 
similar mechanism).

XML is still XML without a schema, and can be characterized without a 
schema (or DTD).  XML has rules of well-formedness.  It's true that 
these rules do not establish the tag names, but they do establish:

that the XML declaration must be the first thing in the file (or 
stream) (FSVO "first", but it's a fairly rigorous V)

that, apart from the XML declaration and an optional document type 
declaration, only whitespace, comments, and processing instructions may 
appear in the prolog (and epilogue?  I don't think XML actually defines 
the epilogue, as MIME does)

that there is a single root element for a document (but the restriction 
is relaxed for an XML entity)

that every element must be closed (tags are paired or empty)

that elements may not overlap

that attribute values must be quoted

with regard to namespaces, that every namespace prefix must be declared 
"before" use (FSVO "before")

There are a fair number of XML dialects out there that are not easily 
represented in any common schema or specification (Ant comes to mind, 
particularly, especially since it lends itself to extension).  Even 
though you cannot validate these dialects, they are *clearly* XML.

So ... I'd lean in the direction of *rejecting* the argument that XML 
is a complex beastie that provides tools for defining markup 
languages.  That's certainly true, but it's equally true that there are 
markup languages that are clearly *XML* without much formal 
definition.  In fact, there are probably quite a lot of "little" 
languages (for configuration and the like) that are almost entirely 
undocumented (and which default to "mustignore" semantics, for the most 
part).

You can verify well-formedness without any knowledge of what's in the 
document, without knowing anything about any particular elements or 
attributes.  You have named elements, named attributes; these have 
standard syntax.  The XML spec doesn't specify what any of them are; 
it's extensible that way.  You have comments and processing 
instructions; these have standard syntax.  Again, there's no definition 
of what's *in* them; that's an extension point.
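
To make that concrete, here is a small Python sketch (assuming the 
standard library's xml.dom.minidom; the function name outline and the 
sample document are invented for illustration) that walks a document 
it has never seen before and lists the elements, attributes, comments, 
and processing instructions it finds.  The code relies only on the 
standard syntax, not on any definition of the names.

```python
from xml.dom import Node, minidom


def outline(text):
    """Return (kind, name-or-content) pairs in document order."""
    found = []

    def visit(node):
        if node.nodeType == Node.ELEMENT_NODE:
            found.append(("element", node.tagName))
            # Attributes are reported by name; their meaning is unknown here.
            for i in range(node.attributes.length):
                found.append(("attribute", node.attributes.item(i).name))
        elif node.nodeType == Node.COMMENT_NODE:
            found.append(("comment", node.data))
        elif node.nodeType == Node.PROCESSING_INSTRUCTION_NODE:
            found.append(("pi", node.target))
        for child in node.childNodes:
            visit(child)

    visit(minidom.parseString(text))
    return found


# A made-up document in no particular vocabulary.
DOC = ('<?hypothetical-pi do something?>'
       '<made-up flavour="odd"><!-- note --><whatever/></made-up>')

for kind, value in outline(DOC):
    print(kind, repr(value))
```

The same function works unchanged on any other well-formed document, 
which is the point: the names are the extension mechanism, and the 
parser does not need to be told about them.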

It's an extensible markup language.  Keep it simple.

Amelia A. Lewis                    amyzing {at} talsever.com
How do you make a cat go moo?
Ask it: "Does a dog have the Buddha-nature?"



Copyright 1993-2007 XML.org. This site is hosted by OASIS