   Re: [xml-dev] XML's Scylla and Charybdis - parse and regexp

> dareo@microsoft.com (Dare Obasanjo) writes:
> >In fact, i'd go as far as to say that applications that 
> >depend on full lexical round tripping (i.e. preserving all whitespace, 
> >treating attribute order as significant, etc) are in violation of the 
> >spirit if not the letter of the XML 1.0 recommendation. 
> Given that you have divined a data model for XML that isn't actually
> specified as the XML data model anywhere, I'm pretty wary of your
> interpretations of the spirit of XML 1.0.
> The XML 1.0 recommendation does say: "Note that the order of attribute
> specifications in a start-tag or empty-element tag is not significant."
> It does not say "parsers must discard the order and scramble the
> attributes."

Hear hear.

> >However I agree there are edge cases such as XML editors where such 
> >behavior is not just desirable but required. 
> There seems to have been a movement early on, especially with the DOM,
> to chop out "editor-only" functionality.  I'm not sure that was such a
> brilliant move in retrospect.

Ha.  Who needs retrospect.  As I recall, the danger here was clearly discussed 
even while DOM Level 1 was in progress.  The argument that allowed these 
issues to be set aside was always "we'll deal with it in another DOM level".  
Of course that never happened.

This whole thread has been a sort of möbius strip for me.  I find the idea of 
regexen for general and primary XML processing appalling, yet I agree with many 
regexen boosters that using XML APIs such as SAX and DOM too often mangles XML 
documents at the lexical level.  I find the idea of "The XML Data Model"(TM) a 
laughable fiction and yet I feel it is very important for the most common tool 
sets to operate on XML models such as SAX and XPath which do omit lexical 
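
To make the lexical-mangling point concrete, here is a minimal sketch in Python. It uses the standard library's xml.dom.minidom as a stand-in for "a DOM"; the sample document and the round_trip helper are made up for illustration, and other toolkits will differ in the details of what they change.

```python
from xml.dom import minidom

# Hypothetical sample: single-quoted attributes, extra whitespace inside
# the start tag, a character reference, and an explicit empty-element pair.
SOURCE = "<doc  b='2'   a='1'><item>&#65;BC</item><empty></empty></doc>"


def round_trip(text):
    """Parse text into a DOM tree and serialize the root element again."""
    return minidom.parseString(text).documentElement.toxml()


def root_attributes(text):
    """Return the root element's attributes as an unordered name/value dict."""
    root = minidom.parseString(text).documentElement
    return dict(root.attributes.items())


if __name__ == "__main__":
    result = round_trip(SOURCE)
    print("before:", SOURCE)
    print("after: ", result)
    # The attribute names and values survive the trip even though the
    # quoting, spacing, and character reference do not.
    print("same attributes:", root_attributes(SOURCE) == root_attributes(result))
```

The serialized text is not the text that went in, yet a consumer that only looks at names and values cannot tell the difference. Whether that loss matters depends on whether the consumer is an editor or a data processor, which is more or less the whole argument.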

I must say that this thread does give me very interesting ideas about 
next-generation XML processing models in Python, so I guess I'm grateful.

Uche Ogbuji                                    Fourthought, Inc.
http://uche.ogbuji.net    http://4Suite.org    http://fourthought.com
Use internal references in XML vocabularies - http://www-106.ibm.com/developerworks/xml/library/x-tipvocab.html
Universal Business Language (UBL) - http://www-106.ibm.com/developerworks/xml/library/x-think16.html
EXSLT by example - http://www-106.ibm.com/developerworks/library/x-exslt.html
The worry about program wizards - http://www.adtmag.com/article.asp?id=7238
Use rdf:about and rdf:ID effectively in RDF/XML - http://www-106.ibm.com/developerworks/xml/library/x-tiprdfai.html
Keep context straight in XSLT - http://www-106.ibm.com/developerworks/xml/library/x-tipcurrent.html
Using SAX for Proper XML Output - http://www.xml.com/pub/a/2003/03/12/py-xml.html
SAX filters for flexible processing - http://www-106.ibm.com/developerworks/xml/library/x-tipsaxflex.html



Copyright 2001 XML.org. This site is hosted by OASIS