Simon St.Laurent wrote:
>As for InkML, I'm happy to work with systems that don't mark every
>single atom up - that's why I did all that work on Regular
>Fragmentations. I'm not happy to see committees creating opaque new
>syntaxes in the context of what's supposedly an XML project. That seems
>bizarre, whatever committee-think justifications you develop to justify
>such behavior.
>
Rather than saying "InkML has poor markup", perhaps the question we
should be asking is "In what kind of circumstances is terseness
important?" Understand the constraints and we can judge the tradeoffs.
Let us imagine four constraints:
1) InkML must be text
2) InkML must be terse (faster parsing, less space)
3) InkML must be embeddable as part of an XML document
4) InkML objects must be annotatable and extendible using XML (or
XML-ish) mechanisms
Given those four constraints, I don't know what other reasonable choice
exists apart from the kind of design they have. (This is not to dispute
that the markup is not rich; I am suggesting that sometimes
impoverished, minimal markup is appropriate.)
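To put a rough number on the terseness constraint, here is a minimal
Python sketch. The trace syntax (comma-separated points, space-separated
coordinates) and the names in the verbose variant (point, x, y) are
assumptions made up for the comparison, not quotations from the InkML
draft.

```python
import xml.etree.ElementTree as ET

# Assumed terse form: points separated by commas, coordinates by spaces.
terse = "<trace>10 0, 9 14, 8 28</trace>"


def parse_trace(text):
    """Turn 'x y, x y, ...' into a list of (x, y) integer pairs."""
    return [tuple(int(n) for n in point.split()) for point in text.split(",")]


# The XML parser only sees one text node; the little language is ours to parse.
points = parse_trace(ET.fromstring(terse).text)

# The same data with every coordinate marked up (hypothetical element names).
verbose_root = ET.Element("trace")
for x, y in points:
    ET.SubElement(verbose_root, "point", x=str(x), y=str(y))
verbose = ET.tostring(verbose_root, encoding="unicode")

print(points)
print(len(terse), "characters terse versus", len(verbose), "fully marked up")
```

Either form can still sit inside a larger XML document and carry
attributes on the trace element, which is what constraints 3 and 4 ask
for.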
People who think that all structured information in XML documents should
be represented by XML are living in a fantasy world (I know Simon is not
one of them, no flames please). No-one would say that URLs should be
marked up as individual elements. XML markup is metadata usable for
generic kinds of processing: by the time you get to domain-specific
processing by terminal applications (e.g. the use of a URL, a list of
coordinates, a point size spec "3pt") the utility of generic processing
largely disappears.
(In fact, I think that having specialist syntaxes for leaf siblings is
highly idiomatic, e.g.
<div style="background: #ffffff; size:10pt;">...
and should be encouraged and enabled rather than discouraged. The idea
that "attributes are not structured" misses this idiom: attributes often
contain structured information but when their structure is only of
interest to specific end-point applications no-one sheds tears that the
structure is hidden from generic XML processing--in fact this probably
simplifies processing: that some structure can be unavailable as XML
information items imposes a separation of concerns.)
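The separation of concerns can be sketched in a few lines of Python.
This is only an illustration: parse_style is a hypothetical helper that
handles simple "name: value;" pairs, not a real CSS parser.

```python
import xml.etree.ElementTree as ET


def parse_style(value):
    """Split a 'name: value; ...' string into a dict (hypothetical helper)."""
    props = {}
    for decl in value.split(";"):
        if not decl.strip():
            continue  # skip the empty piece after a trailing semicolon
        name, _, val = decl.partition(":")
        props[name.strip()] = val.strip()
    return props


# Generic XML processing: the attribute value is just an opaque string.
div = ET.fromstring('<div style="background: #ffffff; size:10pt;">...</div>')
raw = div.get("style")

# Domain-specific processing: only the end-point application looks inside.
style = parse_style(raw)
print(raw)
print(style)
```

Nothing upstream of parse_style needs to know that the attribute has
internal structure at all.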
Sometime soon we will have standards for parsing data content into
typeable subelements: that will make life more straightforward for this
kind of graphical-object-description language. Until then, we have to
make up our own embedded little languages sometimes.
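In the absence of such a standard, the idea can be approximated by hand.
A rough Python sketch follows, using the "3pt" point size from earlier;
the pattern and the subelement names (value, unit) are my own
assumptions, not part of any specification.

```python
import re
import xml.etree.ElementTree as ET

# Assumed pattern: a number followed by a lowercase unit, e.g. "3pt".
SIZE = re.compile(r"^(?P<value>\d+(?:\.\d+)?)(?P<unit>[a-z]+)$")


def fragment(tag, text):
    """Build <tag><value>..</value><unit>..</unit></tag> from text like '3pt'."""
    m = SIZE.match(text.strip())
    if m is None:
        raise ValueError(f"not a size spec: {text!r}")
    elem = ET.Element(tag)
    # Each named group becomes a subelement, in pattern order.
    for name, part in m.groupdict().items():
        ET.SubElement(elem, name).text = part
    return elem


size = fragment("size", "3pt")
xml_out = ET.tostring(size, encoding="unicode")
print(xml_out)
```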
XML is not the end of markup: the ability to demarcate and name
something (<head>Diego Garcia</head>) naturally leads to the question of
how we can locate that thing ("Fetch me the head of Diego Garcia"), to
questions of ontology ("what is a head?"), modeling ("the content of
the head element is a name") and localization ("the name is in
Christian-name Surname form"), and inexorably we want to be able to
repeat this.
It is a similar issue, in a sense, to the so-called embedded markup that
Norm Walsh and Tim Bray have been excited about (see XML.COM).
Personally, I think it is not as clear-cut as they suggest: why is
<div><![CDATA[
<p>blah</p>
]]></div>
kosher but
<div><![CDATA[
<p>blah
]]></div>
not kosher? (No comments about cutting the ends off ps please.)
And what about
<div><![CDATA[
<p>blah]]><![CDATA[</p>
]]></div>
The answer does not come from science but from craft: XML was designed to
provide relief and certainty from HTML's hacked syntax, not to
perpetuate it.
That people try to do this embedding is a sign that well-formedness may
be a little too high a bar. I would distinguish this kind of embedded
notation from the InkML one because the InkML one is more like the
"coherent leaf siblings in a special notation" pattern I mention above,
while the embedded HTML is more an example of the "what can I hack
together to make this look pretty" anti-pattern.
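For what it is worth, an XML parser has no opinion on the three CDATA
variants above. A small check with Python's standard ElementTree (my
choice of parser, purely for illustration) shows that each div arrives
as plain character data with no child elements, and that the split
variant is indistinguishable from the balanced one.

```python
import xml.etree.ElementTree as ET

balanced = "<div><![CDATA[\n<p>blah</p>\n]]></div>"
unbalanced = "<div><![CDATA[\n<p>blah\n]]></div>"
split = "<div><![CDATA[\n<p>blah]]><![CDATA[</p>\n]]></div>"

docs = [ET.fromstring(doc) for doc in (balanced, unbalanced, split)]

# All three are well-formed; the parser reports text only, never a <p> element.
texts = [div.text for div in docs]
child_counts = [len(div) for div in docs]

print(child_counts)
print(texts[0] == texts[2])  # adjacent CDATA sections merge into one text run
```

Whether the embedded text is itself well-balanced is therefore a
question only the consuming application can ask.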
Cheers
Rick