xml-dev - Re: [xml-dev] XML's Scylla and Charybdis

To: "Simon St.Laurent" <simonstl@simonstl.com>
Subject: Re: [xml-dev] XML's Scylla and Charybdis - parse and regexp
From: Uche Ogbuji <uche.ogbuji@fourthought.com>
Date: Tue, 01 Apr 2003 14:31:21 -0700
Cc: xml-dev@lists.xml.org
In-reply-to: Message from "Simon St.Laurent" <simonstl@simonstl.com> of "Tue, 01 Apr 2003 12:21:51 EST." <r01050400-1024-6D677DC4646611D7B1500003937A08C2@[192.168.124.11]>
Sender: Uche Ogbuji <uche.ogbuji@fourthought.com>

> dareo@microsoft.com (Dare Obasanjo) writes:
> >In fact, i'd go as far as to say that applications that 
> >depend on full lexical round tripping (i.e. preserving all whitespace, 
> >treating attribute order as significant, etc) are in violation of the 
> >spirit if not the letter of the XML 1.0 recommendation. 
> 
> Given that you have divined a data model for XML that isn't actually
> specified as the XML data model anywhere, I'm pretty wary of your
> interpretations of the spirit of XML 1.0.
> 
> The XML 1.0 recommendation does say: "Note that the order of attribute
> specifications in a start-tag or empty-element tag is not significant."
> It does not say "parsers must discard the order and scramble the
> attributes."

Hear hear.

> >However I agree there are edge cases such as XML editors where such 
> >behavior is not just desirable but required. 
> 
> There seems to have been a movement early on, especially with the DOM,
> to chop out "editor-only" functionality.  I'm not sure that was such a
> brilliant move in retrospect.

Ha.  Who needs retrospect?  As I recall, the danger here was clearly discussed 
even while DOM Level 1 was in progress.  The argument that allowed these 
issues to be set aside was always "we'll deal with it in another DOM level".  
Of course that never happened.

This whole thread has been a sort of Möbius strip for me.  I find the idea of 
regexen for general and primary XML processing appalling, yet I agree with many 
regexen boosters that using XML APIs such as SAX and DOM too often mangles XML 
documents at the lexical level.  I find the idea of "The XML Data Model"(TM) a 
laughable fiction, and yet I feel it is very important for the most common tool 
sets to operate on XML models such as SAX and XPath, which do omit lexical 
fidelity.
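
A minimal sketch of that kind of lexical mangling, assuming Python's standard 
xml.dom.minidom as the DOM implementation (an illustrative choice, not a tool 
named in this thread).  The sample document and the helper names round_trip 
and summary are made up for the example.  It parses a small document, 
serializes it again, and checks that the text changes while the parsed 
content does not:

```python
from xml.dom import minidom

# Hypothetical sample: odd spacing inside the tags, mixed quote styles,
# and a numeric character reference in the text.
SOURCE = "<doc  b = '2'   a=\"1\" >caf&#233;</doc >"


def round_trip(text):
    """Parse with minidom and serialize the root element again."""
    doc = minidom.parseString(text)
    return doc.documentElement.toxml()


def summary(text):
    """Return what the parser reports: tag name, attributes, text content."""
    root = minidom.parseString(text).documentElement
    attrs = dict(root.attributes.items())  # a dict, so order is ignored
    return root.tagName, attrs, root.firstChild.data


result = round_trip(SOURCE)

print("before:", SOURCE)
print("after: ", result)
print("same parsed content:", summary(SOURCE) == summary(result))
print("same text:", SOURCE == result)
```

Run as a script, this prints both strings.  The quoting style, the spacing 
inside the tags, and the character reference all differ after the round trip, 
while the parsed content compares equal.  Whether attribute order survives 
depends on the Python version, so the sketch deliberately does not rely on it.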

I must say that this thread does give me very interesting ideas about 
next-generation XML processing models in Python, so I guess I'm grateful.

-- 
Uche Ogbuji                                    Fourthought, Inc.
http://uche.ogbuji.net    http://4Suite.org    http://fourthought.com
Use internal references in XML vocabularies - http://www-106.ibm.com/developerworks/xml/library/x-tipvocab.html
Universal Business Language (UBL) - http://www-106.ibm.com/developerworks/xml/library/x-think16.html
EXSLT by example - http://www-106.ibm.com/developerworks/library/x-exslt.html
The worry about program wizards - http://www.adtmag.com/article.asp?id=7238
Use rdf:about and rdf:ID effectively in RDF/XML - http://www-106.ibm.com/developerworks/xml/library/x-tiprdfai.html
Keep context straight in XSLT - http://www-106.ibm.com/developerworks/xml/library/x-tipcurrent.html
Using SAX for Proper XML Output - http://www.xml.com/pub/a/2003/03/12/py-xml.html
SAX filters for flexible processing - http://www-106.ibm.com/developerworks/xml/library/x-tipsaxflex.html
