Jeni,
Jeni Tennison wrote:
>Hi Patrick,
>
>>Another advantage to our approach is that you don't need a new
>>syntax to make it work, benefits are available here, now, today.
>>
>
>You say that you aren't providing a new syntax, but well-formed XML
>documents can't represent overlapping structures (unless you use
>empty-element or PI milestones of course). So in my view you *are*
>using another syntax -- a semi-formed XML in the examples that you've
>shown -- when you work with documents that you're interpreting as
>holding overlapping trees.
>
Yes, well-formed XML documents cannot contain overlapping structures.
But as my examples show, we are producing well-formed XML that any
conforming parser will accept. The XML markup that is presented to the parser
is well-formed, so what is your complaint? There is no structure in the
document until one is imposed on it. For further XML processing we
impose a well-formed structure.
The syntax for markup that can be chosen to form part of an imposed
structure is that of XML. The syntax of the output, should you desire
XML, is well-formed XML syntax. I am not sure at what point it is not
valid XML.
Oh, you mean prior to processing! Well, it has no structure prior to
processing, does it? Until something determines what is or is not markup,
and what rules that markup must follow, the document is not anything at
all with regard to XML.
>
>As your README says in your example:
>
>---
>TestMilton.xml a sample input file showing overlapping analyze of a section
> of Milton's 'Paradise Lost'. (Note file has .xml suffix even
> though it is 'ill-formed' according to XML 1.0 spec.)
>---
> http://www.sbl-site2.org/Extreme2002/JITTs.zip
>
>My point is, I guess, that different syntaxes support different kinds
>of structures, and XML *doesn't* support overlapping markup. If you
>change XML to make it support overlapping markup, then it isn't XML
>any more -- it's a new syntax that happens to look confusingly similar
>to XML.
>
That note is merely a warning to users who presently have non-JITTs
parsers that adhere to the fixed tree model of XML. XML has not been
changed at all; we have changed the processing of XML (which is where
structures should have been imposed in the first place). One of the
benefits of that change in processing is that one can assert multiple
and varying structures on a single text that has a familiar syntax.
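To make the point concrete, here is a minimal Python sketch of the idea
(not the JITTs code itself). It assumes a hypothetical sample text in
which <line> and <s> elements overlap, and a hypothetical helper
impose() that keeps only the tags named in a chosen set before handing
the result to a stock XML parser. Dropping the unrecognized tags
outright is a simplification made for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical input: <line> and <s> overlap, so the text as a whole
# is not well-formed XML.
TEXT = (
    "<doc>"
    "<line><s>one two</line> "
    "<line>three.</s> <s>four.</s></line>"
    "</doc>"
)

# Matches a start or end tag and captures the element name.
TAG = re.compile(r"</?([A-Za-z_][\w.\-]*)[^<>]*>")


def impose(text, recognized):
    """Keep only tags whose element name is in `recognized`; drop the rest."""
    def keep_or_drop(match):
        return match.group(0) if match.group(1) in recognized else ""
    return TAG.sub(keep_or_drop, text)


def view(text, recognized, element):
    """Parse the imposed structure with a standard parser and
    return the text content of each `element`."""
    root = ET.fromstring(impose(text, recognized))
    return ["".join(e.itertext()) for e in root.iter(element)]


if __name__ == "__main__":
    print(view(TEXT, {"doc", "line"}, "line"))  # ['one two', 'three. four.']
    print(view(TEXT, {"doc", "s"}, "s"))        # ['one two three.', 'four.']
```

The same text yields two different well-formed trees depending on which
element names are recognized, while the raw text is rejected by the
stock parser.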
I belabor this because it is very important: A JITTs parser can use
standard XML syntax and do things that are simply not possible with a
standard XML parser. The example I gave earlier today of the dictionary
entry is only one example. JITTs does not, has not, and will not
require a new syntax to produce benefits that current XML processes
cannot produce.
>
>Of course that doesn't detract from the idea of using configurable
>parsers to interpret a true XML document in different ways, and I
>appreciate that you're just using an existing syntax to try out these
>ideas, but as an XML person I'd feel a lot more comfortable with your
>examples if you'd use well-formed XML, with milestones to represent
>the overlapping structures, in your examples, rather than a
>pseudo-XML.
>
I suspect the discomfort is due in part to the persistence of the idea
that an XML document, or any other document for that matter, has some
inherent structure. There is no structure until something in the
document is interpreted as "markup" and that "markup" is checked against
a set of content models and, with XML, for its adherence to the rules
for well-formedness.
As I pointed out in our paper (and here), JITTs is not limited to
overlapping hierarchies. It addresses a number of issues with current
markup strategies.
We set out to solve one problem (overlap) and eventually arrived at a
solution that appears to have a much broader applicability.
>
>(This thinking is why Wendell and I thought it best to create a non-XML
>syntax for LMNL; I appreciate that a new syntax might be something
>that you want to avoid, but I think the only real alternative is to
>use milestones everywhere, which is very tedious to write and quite
>difficult to read.)
>
A third alternative is to change how one interprets markup for the
purpose of imposing structures on a text. No natural law requires that
markup processing recognize all the markup in a document. Actually, the
XML 1.0 spec specifies a syntax for markup, but it never says that all
markup has to be recognized. It has all the other restrictions that have
been mentioned, but it omits that one. So long as the markup presented
to the parser meets all the stated requirements, it appears to be valid
XML.
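As an illustration only, the following Python sketch shows one way of
"not recognizing" some markup while still presenting well-formed XML to
the parser. It is an assumption on my part, not a description of how
JITTs works internally: tags outside a chosen set are escaped so that
the parser receives them as ordinary character data. The sample text
and the helper name present_to_parser() are hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical input with overlapping <line> and <s> elements.
TEXT = (
    "<doc>"
    "<line><s>one two</line> "
    "<line>three.</s> <s>four.</s></line>"
    "</doc>"
)

# Matches a start or end tag and captures the element name.
TAG = re.compile(r"</?([A-Za-z_][\w.\-]*)[^<>]*>")


def present_to_parser(text, recognized):
    """Leave recognized tags alone; escape all other tags so the
    parser treats them as character data rather than markup."""
    def keep_or_escape(match):
        if match.group(1) in recognized:
            return match.group(0)
        return escape(match.group(0))
    return TAG.sub(keep_or_escape, text)


if __name__ == "__main__":
    prepared = present_to_parser(TEXT, {"doc", "line"})
    root = ET.fromstring(prepared)  # well-formed, so this parses
    for line in root.iter("line"):
        print(repr(line.text))
```

The unrecognized <s> tags survive as text inside each <line>, so a later
pass could recognize them instead and impose a different structure on
the same document.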
Patrick
>
>Cheers,
>
>Jeni
>
>---
>Jeni Tennison
>http://www.jenitennison.com/
>
--
Patrick Durusau
Director of Research and Development
Society of Biblical Literature
pdurusau@emory.edu