> On 8/24/05, Bullard, Claude L (Len) <firstname.lastname@example.org> wrote:
>> Abstractly, nodes is nodes and properties is properties,
>> and we understand it, but it seems to confuse people
>> who can't resist seeing element types as class
>> declarations minus the methods. That of course sends
>> the entity/attribute camp members up the tree, and
>> so much for kumbayah.
> My point was more or less "XML is just labeled data". What meaning
> you as a human, your simple code, your bleeding edge OWL-inferencer,
> or your Sci-Fi Strong AI impute to those labels is not XML's business.
> However high you stack these turtles, the only thing at the bottom of
> the stack is the primitive notion of a "label".
And the current XML stack fails in that the label may not even apply to
the data content, for example that of subelements in mixed content!
<p>It was a dark and <!-- JACK: CONTINUE EDITING FROM HERE-->
<?typesetter newpage?><x>stormy night</x><y>fine night</y>.</p>
We are used to the idea that the comment's connection to <p>
is for a human to know, and that the PI's connection to the <p>
is for some processor to know, but we don't have any way of saying
whether the <x>'s data is part of the <p> or not: if <x> were
<html:b> we might say yes, if <x> were <footnote> we might say no.
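To make the two readings concrete, here is a minimal sketch using Python's standard xml.dom.minidom on the example above. The helper names own_text and all_text are made up for illustration; neither reading is sanctioned by XML itself, which is rather the point.

```python
from xml.dom import minidom
from xml.dom.minidom import Node

SAMPLE = (
    "<p>It was a dark and <!-- JACK: CONTINUE EDITING FROM HERE-->\n"
    "<?typesetter newpage?><x>stormy night</x><y>fine night</y>.</p>"
)

def own_text(element):
    """Reading 1: only the text that is a direct child of the element."""
    return "".join(
        child.data
        for child in element.childNodes
        if child.nodeType == Node.TEXT_NODE
    )

def all_text(node):
    """Reading 2: the text of the element and of every descendant element."""
    if node.nodeType == Node.TEXT_NODE:
        return node.data
    if node.nodeType == Node.ELEMENT_NODE:
        return "".join(all_text(child) for child in node.childNodes)
    return ""  # comments and PIs contribute no text under either reading

p = minidom.parseString(SAMPLE).documentElement
child_types = [child.nodeType for child in p.childNodes]
p_own = own_text(p)
p_all = all_text(p)
```

Both strings are defensible answers to "what text does <p> label?", and the parser has no basis for choosing between them: that choice lives in the application's knowledge of what <x> and <y> mean.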
So XML gives labelled ranges of text which nest but, except perhaps
for leaf elements and child data content, it does not give any
warrant that the label applies to all the text between the start
and end tag!
The most obvious symptom of the lack of support for rhetorical
structures is that non-validated XML cannot even tell us which
whitespace nodes are significant (the notorious DOM doubler!).
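A small sketch of that symptom, again assuming Python's xml.dom.minidom as the non-validating parser; the <list>/<item> document and the helper names are invented for illustration. The same two items yield different child counts depending only on indentation, because without a DTD or schema the parser must keep every whitespace-only text node.

```python
from xml.dom import minidom
from xml.dom.minidom import Node

# Same logical content, once pretty-printed and once compact.
PRETTY = "<list>\n  <item>a</item>\n  <item>b</item>\n</list>"
COMPACT = "<list><item>a</item><item>b</item></list>"

def children(xml_text):
    """Return the child nodes of the document element."""
    return minidom.parseString(xml_text).documentElement.childNodes

def whitespace_only(nodes):
    """Select text nodes that contain nothing but whitespace."""
    return [
        node for node in nodes
        if node.nodeType == Node.TEXT_NODE and node.data.strip() == ""
    ]

pretty_count = len(children(PRETTY))
compact_count = len(children(COMPACT))
padding_count = len(whitespace_only(children(PRETTY)))
```

Stripping the whitespace-only nodes is safe here only because we happen to know <list> has element-only content; in mixed content the same nodes could be significant, and nothing in the unvalidated document says which case applies.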
Perhaps allowing xml:space=preserve|default|strip|collapse|trim
would improve that particular problem, but the general one would