


   Re: [xml-dev] heritage (was Re: [xml-dev] SGML on the Web)



Jeni Tennison wrote:

>Hi Patrick,
>>Another advantage to our approach is that you don't need a new
>>syntax to make it work, benefits are available here, now, today.
>You say that you aren't providing a new syntax, but well-formed XML
>documents can't represent overlapping structures (unless you use
>empty-element or PI milestones of course). So in my view you *are*
>using another syntax -- a semi-formed XML in the examples that you've
>shown -- when you work with documents that you're interpreting as
>holding overlapping trees.

Yes, well-formed XML documents cannot contain overlapping structures.
But as my examples show, we are producing well-formed XML that will 
validate with any parser. The XML markup that is presented to the parser 
is well-formed, so what is your complaint? There is no structure in the 
document until one is imposed on it. For further XML processing we 
impose a well-formed structure.

The syntax for markup that can be chosen to form part of an imposed 
structure is that of XML. The syntax of the output, should you desire 
XML, is well-formed XML syntax. Not sure when it is not valid XML?

Oh, you mean prior to processing! Well, it has no structure prior to
processing, does it? Until something determines what is or is not markup,
and what rules that markup must follow, the document is not anything at
all with regard to XML.

>As your README says in your example:
>TestMilton.xml  a sample input file showing overlapping analyze of a section
>                of Milton's 'Paradise Lost'. (Note file has .xml suffix even
>                though it is 'ill-formed' according to XML 1.0 spec.)
>                        http://www.sbl-site2.org/Extreme2002/JITTs.zip
>My point is, I guess, that different syntaxes support different kinds
>of structures, and XML *doesn't* support overlapping markup. If you
>change XML to make it support overlapping markup, then it isn't XML
>any more -- it's a new syntax that happens to look confusingly similar
>to XML.

The note is merely a warning to users who presently have non-JITTs
parsers that adhere to the fixed tree model of XML. XML has not been
changed at all; we have changed the processing of XML (which is where
structures should have been imposed in the first place). One of the
benefits of that change in processing is that one can assert multiple
and varying structures on a single text that has a familiar syntax.
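
As a rough illustration of asserting different structures on one text, here is a minimal Python sketch. It is not the JITTs implementation: the sample markup, the names `impose_structure` and `is_well_formed`, and the choice to simply drop unrecognized tags are all assumptions made for this example. The sketch also assumes tags contain no ">" inside attribute values.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample (not TestMilton.xml): <phrase> overlaps the <line>
# boundary, so the raw text is not well-formed XML.
SAMPLE = (
    "<poem><line>Of Man's first disobedience, <phrase>and the fruit</line>\n"
    "<line>Of that forbidden tree,</phrase> whose mortal taste</line></poem>"
)

# Matches start and end tags; group 1 is the element name.
TAG = re.compile(r"</?([A-Za-z_][\w.-]*)[^>]*>")


def impose_structure(text, recognized):
    """Keep only tags whose element name is in `recognized`; drop the rest.

    Assumption for this sketch: unrecognized tags are removed and their
    content is kept as plain character data.
    """
    return TAG.sub(
        lambda m: m.group(0) if m.group(1) in recognized else "", text
    )


def is_well_formed(text):
    """Return True if a standard XML parser accepts `text`."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print(is_well_formed(SAMPLE))                       # raw text, all tags
    line_view = impose_structure(SAMPLE, {"poem", "line"})
    phrase_view = impose_structure(SAMPLE, {"poem", "phrase"})
    print(is_well_formed(line_view), is_well_formed(phrase_view))
```

Each call selects a different set of element names from the same source text, so a standard parser only ever sees one hierarchy at a time.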

I belabor this because it is very important: A JITTs parser can use 
standard XML syntax and do things that are simply not possible with a 
standard XML parser. The example I gave earlier today of the dictionary 
entry is only one example. JITTs does not, has not and will not
require a new syntax to produce benefits that current XML processes 
cannot produce.

>Of course that doesn't detract from the idea of using configurable
>parsers to interpret a true XML document in different ways, and I
>appreciate that you're just using an existing syntax to try out these
>ideas, but as an XML person I'd feel a lot more comfortable with your
>examples if you'd use well-formed XML, with milestones to represent
>the overlapping structures, in your examples, rather than a

I suspect the discomfort is due in part to the persistence of the idea
that an XML document, or any other document for that matter, has some 
inherent structure. There is no structure until something in the 
document is interpreted as "markup" and that "markup" is subjected to a 
set of content models, and with XML, for its adherence to the rules for 

As I pointed out in our paper (and here) JITTs is not limited to 
overlapping hierarchies. It addresses a number of issues with current 
markup strategies.

We set out to solve one problem (overlap) and eventually arrived at a 
solution that appears to have a much broader applicability.

>(This thinking is why Wendell and I thought it best to create a non-XML
>syntax for LMNL; I appreciate that a new syntax might be something
>that you want to avoid, but I think the only real alternative is to
>use milestones everywhere, which is very tedious to write and quite
>difficult to read.)

A third alternative is to change how one interprets markup for the
purpose of imposing structures on a text. There is no natural-law
requirement that markup processing recognize all the markup in a
document. Actually the XML 1.0 spec specifies a syntax for markup but it 
never says that all markup has to be recognized. It does have all the 
other restrictions that have been mentioned but it omits that one. So 
long as the markup presented to the parser meets all the stated 
requirements, it appears to be valid XML.
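
For comparison, the milestone alternative raised in the quoted text can be sketched in the same terms. This is only an illustration under stated assumptions: the `to_milestones` name and the `<milestone unit="..." type="..."/>` convention are invented for this example, and attributes on converted tags are discarded.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample: <phrase> overlaps the <line> boundary.
SAMPLE = (
    "<poem><line>Of Man's first disobedience, <phrase>and the fruit</line>\n"
    "<line>Of that forbidden tree,</phrase> whose mortal taste</line></poem>"
)

# Group 1 is "/" for an end tag (else empty); group 2 is the element name.
TAG = re.compile(r"<(/?)([A-Za-z_][\w.-]*)[^>]*>")


def to_milestones(text, recognized):
    """Keep recognized tags; rewrite every other tag as an empty milestone.

    The milestone element and its attribute names are an assumed
    convention for this sketch, not something defined by XML 1.0.
    """
    def convert(match):
        closing, name = match.group(1), match.group(2)
        if name in recognized:
            return match.group(0)
        kind = "end" if closing else "start"
        return f'<milestone unit="{name}" type="{kind}"/>'

    return TAG.sub(convert, text)


if __name__ == "__main__":
    converted = to_milestones(SAMPLE, {"poem", "line"})
    root = ET.fromstring(converted)  # parses: the overlap is now milestones
    for m in root.iter("milestone"):
        print(m.get("unit"), m.get("type"))
```

Here the unrecognized markup survives in the parser's input as empty elements rather than being left out, at the cost of the harder-to-read form that the quoted message describes.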


>Jeni Tennison

Patrick Durusau
Director of Research and Development
Society of Biblical Literature

