RE: [xml-dev] Wikipedia on XML
- From: Amelia A Lewis <amyzing@talsever.com>
- To: Michael Kay <mike@saxonica.com>
- Date: Mon, 10 Aug 2009 00:15:06 -0400
On Mon, 10 Aug 2009 03:20:27 +0100, Michael Kay wrote:
> There may be millions of markup languages that allow <a/> as a document, or
> there may be none. It doesn't matter: <a/> is a well-formed XML document
> regardless. I think this stuff about XML being used to define other markup
> languages is a very confusing way of explaining things to newcomers. The
> first thing to get across is that <a/> is (a document allowed by the rules
> of) XML.
Yes.
It's probably not worthwhile to draw the distinction, in the article,
between "well-formedness" and "validity", but it's probably important
to have it in mind while writing the article.
When people describe XML as a meta-language, or a syntax or language
for defining markup languages, they are, in effect, talking about
validity: documents are valid or invalid according to some schema (or
similar mechanism).
XML is still XML without a schema, and can be characterized without a
schema (or DTD). XML has rules of well-formedness. It's true that
these rules do not establish the tag names, but they do establish:
- that the XML declaration, if present, must be the first thing in the
  file (or stream) (FSVO "first", but it's a fairly rigorous V)
- that, besides the XML declaration and an optional document type
  declaration, only whitespace, comments, and processing instructions
  may appear in the prolog (and epilogue? I don't think XML actually
  defines the epilogue, as MIME does)
- that there is a single root element for a document (but the
  restriction is relaxed for an XML entity)
- that every element must be closed (tags are paired or empty)
- that elements may not overlap
- that attribute values must be quoted
- with regard to namespaces, that every namespace prefix must be
  declared "before" use (FSVO "before")
There are a fair number of XML dialects out there that are not easily
represented in any common schema or specification (Ant comes to mind,
particularly, especially since it lends itself to extension). Even
though you cannot validate these dialects, they are *clearly* XML.
So ... I'd lean in the direction of *rejecting* the argument that XML
is a complex beastie that provides tools for defining markup
languages. That's certainly true, but it's equally true that there are
markup languages that are clearly *XML* without much formal
definition. In fact, there are probably quite a lot of "little"
languages (for configuration and the like) that are almost entirely
undocumented (and which default to "mustignore" semantics, for the most
part).
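A minimal sketch of what "mustignore" handling might look like for such a little configuration language, again with Python's xml.etree.ElementTree. The element names (settings, host, timeout, colour) and the KNOWN table are entirely hypothetical; the point is only that the reader skips whatever it does not recognize instead of rejecting the document.

```python
import xml.etree.ElementTree as ET

# Hypothetical settings this reader understands, with a converter for each.
KNOWN = {"host": str, "timeout": int}

def read_settings(text):
    """Read known child elements of the root; silently skip unknown ones."""
    root = ET.fromstring(text)  # still raises ParseError if not well-formed
    settings = {}
    for child in root:
        convert = KNOWN.get(child.tag)
        if convert is None:
            continue  # "mustignore": unknown elements are not an error
        settings[child.tag] = convert((child.text or "").strip())
    return settings

example = (
    "<settings>"
    "<host>example.org</host>"
    "<timeout>30</timeout>"
    "<colour>teal</colour>"
    "</settings>"
)
print(read_settings(example))
```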
You can verify well-formedness without any knowledge of what's in the
document, without knowing anything about any particular elements or
attributes. You have named elements, named attributes; these have
standard syntax. The XML spec doesn't specify what any of them are;
it's extensible that way. You have comments and processing
instructions; these have standard syntax. Again, there's no definition
of what's *in* them; that's an extension point.
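To make that concrete, here is a small sketch that pulls the element names, attribute names, comments, and processing instructions out of a document whose vocabulary the code knows nothing about. It assumes Python 3.8 or later (for the insert_comments and insert_pis options of xml.etree.ElementTree.TreeBuilder); the sample document is invented.

```python
import xml.etree.ElementTree as ET

# An invented document; the code below has no knowledge of its vocabulary.
doc = """<config mode="fast">
  <!-- tuning knobs -->
  <?reload interval="5"?>
  <cache size="128"><entry key="a"/></cache>
</config>"""

# Ask the tree builder to keep comments and PIs instead of dropping them.
builder = ET.TreeBuilder(insert_comments=True, insert_pis=True)
root = ET.fromstring(doc, parser=ET.XMLParser(target=builder))

elements, attributes, comments, pis = [], [], [], []
for node in root.iter():
    if node.tag is ET.Comment:
        comments.append(node.text)
    elif node.tag is ET.ProcessingInstruction:
        pis.append(node.text)
    else:
        elements.append(node.tag)
        attributes.extend(sorted(node.attrib))

print("elements:  ", elements)
print("attributes:", attributes)
print("comments:  ", comments)
print("PIs:       ", pis)
```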
It's an extensible markup language. Keep it simple.
Amy!
--
Amelia A. Lewis amyzing {at} talsever.com
How do you make a cat go moo?
Ask it: "Does a dog have the Buddha-nature?"