John,
John Cowan wrote:
>Elliotte Rusty Harold scripsit:
>
>>Does this have any use for XML? Is there any point to letting the
>>root shift from one node to another while still keeping everything in
>>the tree?
>>
>
>It may be my lack of imagination, but I don't see it. That trick is
>primarily important where the tree is more or less arbitrary, just
>used to provide quick key-based access to what is really a plain old
>sequence. XML trees in almost all cases have semantics of their own.
>
No, it is not lack of imagination but the static tree of SGML/XML that
is causing the "blind spot." ;-)
It is not merely shifting the root from one node to another but
"recognizing" only an asserted tree that brings the benefits.
Briefly, because I have a rather longish report to complete:
<entry><headWord>JITTs</headWord>
(typical OED entry back to early Sumerian usage)
</entry>
Now, you want to build a DOM tree. With the standard XML tree, you get
everything between the <entry></entry> as nodes in the DOM tree. Correct?
As an alternative, for a "lite" searching interface to a dictionary, you
only want: <entry><headWord>JITTs</headWord>(blob of unparsed PCDATA,
which includes all the markup you got as nodes in the DOM tree in the
first one)</entry>
When I find the word I want, in this case JITTs, that block is returned,
but this time a tree is asserted for all the markup in the "blob" and
processed for presentation.
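
A minimal sketch of the two readings described above, in Python rather than the Perl/SAX filter mentioned in this message. The sample entry content, the function names (full_tree, lite_index, lookup), and the wrapper element "body" are all invented for illustration; this is not the JITTs implementation, and it assumes the blob is well-formed once wrapped.

```python
import re
import xml.etree.ElementTree as ET

# Invented sample entry; the real OED-style content is elided in the message.
ENTRY = ("<entry><headWord>JITTs</headWord>"
         "<sense><def>placeholder definition</def></sense>"
         "<etym>placeholder etymology</etym></entry>")

# Recognizes only <entry> and <headWord>; everything else is captured as text.
ENTRY_RE = re.compile(r"<entry><headWord>(.*?)</headWord>(.*)</entry>", re.S)


def full_tree(entry_xml):
    """Standard reading: every element inside <entry> becomes a node."""
    return ET.fromstring(entry_xml)


def lite_index(entries):
    """Lite reading: map each head word to an unparsed blob of markup."""
    index = {}
    for entry_xml in entries:
        match = ENTRY_RE.fullmatch(entry_xml)
        if match:
            index[match.group(1)] = match.group(2)
    return index


def lookup(index, word):
    """On a hit, assert a tree over the blob so it can be presented."""
    # "body" is a hypothetical wrapper so the blob has a single root.
    return ET.fromstring("<body>" + index[word] + "</body>")


index = lite_index([ENTRY])
print([el.tag for el in full_tree(ENTRY).iter()])
print(index["JITTs"])
print(lookup(index, "JITTs").find("sense/def").text)
```

The search step only ever touches the head word and a string; the cost of building nodes for the rest of the entry is paid for the one entry that is actually returned.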
JITTs also handles asserting different trees for other purposes, but this
is the one described in the paper. (http://www.sbl-site2.org/Extreme2002)
It looks like we will be issuing another implementation tomorrow (as a
Perl/SAX filter) and I hope to post a better explanation of JITTs with
examples by later in the month. Jeni and Gavin have been debating JITTs
and LMNL on the LMNL list and both have helped focus my explanations a
little better.
The "tree" you see is the one you assert about the document instance.
The trick is not making markup bear the burden of being a member of a
static tree. It may or may not be a member of the tree you assert. If it
is not a member, with the filter implementations, it is simply
discarded. The tree you assert consists of the markup and PCDATA you
assert that it contains.
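
To make the filter idea concrete, here is a rough sketch using Python's standard SAX classes, not the Perl/SAX filter mentioned in this message. The class name AssertedTreeFilter and the helper assert_tree are hypothetical. Two assumptions are built in: the input is already well-formed XML (a Python SAX parser requires that), and character data inside discarded markup is passed through rather than dropped.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator


class AssertedTreeFilter(XMLFilterBase):
    """Forward start/end events only for element names in the asserted tree."""

    def __init__(self, parent, asserted):
        super().__init__(parent)
        self.asserted = set(asserted)

    def startElement(self, name, attrs):
        # Markup that is not a member of the asserted tree is discarded.
        if name in self.asserted:
            super().startElement(name, attrs)

    def endElement(self, name):
        if name in self.asserted:
            super().endElement(name)

    # characters() is inherited unchanged, so text always passes through
    # (an assumption of this sketch).


def assert_tree(xml_text, asserted):
    """Serialize the document as seen through the asserted set of names."""
    out = io.StringIO()
    filt = AssertedTreeFilter(xml.sax.make_parser(), asserted)
    filt.setContentHandler(XMLGenerator(out, encoding="utf-8"))
    filt.parse(io.BytesIO(xml_text.encode("utf-8")))
    return out.getvalue()


doc = "<entry><headWord>JITTs</headWord><sense><def>text</def></sense></entry>"
print(assert_tree(doc, {"entry", "headWord"}))
print(assert_tree(doc, {"entry", "sense"}))
```

The same document instance yields a different tree for each set of names passed in, which is the point: the markup itself carries no obligation to belong to any one of them.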
We talk about trees a lot with JITTs, but the underlying principle is not
about trees; it is about asserting structures based upon markup in the
text. Since XML uses and expects a tree structure, it would make little
sense to extract some other structure for XML. Another advantage of our
approach is that you don't need a new syntax to make it work: the benefits
are available here, now, today. (Well, as soon as we get some robust
implementations, which we are working on, and hopefully we will see
better ones from others.)
More tomorrow!
Patrick
--
Patrick Durusau
Director of Research and Development
Society of Biblical Literature
pdurusau@emory.edu