OASIS Mailing List Archives

   RE: [xml-dev] Partial documents in tree-based APIs

I would be much more interested in any innovation that allowed a parser
to report more than one well-formedness error in a single run.

Michael Kay
Software AG
home: Michael.H.Kay@ntlworld.com
work: Michael.Kay@softwareag.com 

> -----Original Message-----
> From: Elliotte Rusty Harold [mailto:elharo@metalab.unc.edu] 
> Sent: 05 April 2003 15:24
> To: xml-dev@lists.xml.org
> Cc: Laurent Bihanic
> Subject: [xml-dev] Partial documents in tree-based APIs
>
> Streaming APIs like SAX and XMLPULL by their nature provide some of 
> the content of a malformed document to the client application before 
> the first well-formedness error is detected. The XML specification 
> implicitly says this is OK, though in some use-cases roll-back or 
> failure to commit may be desirable.
>
> Now consider the case of a tree-based API such as DOM, JDOM, or XOM 
> which encounters a malformedness error. Traditionally, these APIs 
> have reported no information from a malformed document to the client 
> application. However, Laurent Bihanic recently submitted a patch to 
> JDOM in which as much of the document as could be parsed 
> successfully is made available through the exception that is 
> thrown to indicate the malformedness error. This was quite clever. It 
> had never occurred to me, and I had never noticed any other API do 
> anything similar.
>
> What I'd like to get broader discussion of is whether this is a good 
> idea. There are certainly use cases for it. Bihanic wanted to read 
> the envelope of an XML message even if the data was malformed. 
> However, there are also problems. For instance, if the 
> well-formedness error is a missing end-tag, then the element with the 
> missing end-tag will still appear in the partial tree. And if the 
> problem is a missing root element, then this may produce a Document 
> object with no root element. On the other hand, rollback, failure to 
> commit, or simply ignoring the malformed document is much easier than 
> with a streaming API since you know in advance that the document is 
> malformed.
>
> Is this approach something to be encouraged? Should other tree-based 
> APIs like XOM and DOM copy this innovation? What advantages and 
> disadvantages have I not thought of?
> -- 
> +-----------------------+------------------------+-------------------+
> | Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer |
> +-----------------------+------------------------+-------------------+
> |           Processing XML with Java (Addison-Wesley, 2002)          |
> |              http://www.cafeconleche.org/books/xmljava             |
> | http://www.amazon.com/exec/obidos/ISBN%3D0201771861/cafeaulaitA  |
> +----------------------------------+---------------------------------+
> |  Read Cafe au Lait for Java News:  http://www.cafeaulait.org/      |
> |  Read Cafe con Leche for XML News: http://www.cafeconleche.org/    |
> +----------------------------------+---------------------------------+
> -----------------------------------------------------------------
> The xml-dev list is sponsored by XML.org 
> <http://www.xml.org>, an initiative of OASIS 
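
For readers who want to experiment with the idea in the quoted message, here is a minimal sketch of a tree builder that hands back the partial tree through the exception it raises. It is not Bihanic's JDOM patch and not any library's API. It uses Python's standard xml.etree.ElementTree in place of the Java APIs discussed, and the names PartialParseError, RecordingTarget and parse_keeping_partial are hypothetical, introduced only for illustration.

```python
import xml.etree.ElementTree as ET


class PartialParseError(Exception):
    """Hypothetical exception that carries the tree built before the error."""

    def __init__(self, cause, partial_root):
        super().__init__(str(cause))
        self.cause = cause
        # None if the error occurred before any root element was seen.
        self.partial_root = partial_root


class RecordingTarget:
    """Parser target that builds a tree incrementally from parse events."""

    def __init__(self):
        self.root = None
        self.stack = []  # elements that are currently open

    def start(self, tag, attrib):
        elem = ET.Element(tag, attrib)
        if self.stack:
            self.stack[-1].append(elem)
        elif self.root is None:
            self.root = elem
        self.stack.append(elem)

    def end(self, tag):
        self.stack.pop()

    def data(self, text):
        if not self.stack:
            return
        current = self.stack[-1]
        if len(current):
            # Text after a child element belongs to that child's tail.
            last = current[-1]
            last.tail = (last.tail or "") + text
        else:
            current.text = (current.text or "") + text

    def close(self):
        return self.root


def parse_keeping_partial(text):
    """Return the root element, or raise PartialParseError with the partial tree."""
    target = RecordingTarget()
    parser = ET.XMLParser(target=target)
    try:
        parser.feed(text)
        return parser.close()
    except ET.ParseError as err:
        raise PartialParseError(err, target.root) from err


if __name__ == "__main__":
    # The envelope is well-formed; the body has an unclosed <item>.
    bad = "<message><envelope><to>alice</to></envelope><body><item></body></message>"
    try:
        parse_keeping_partial(bad)
    except PartialParseError as err:
        print("error:", err)
        print("recipient:", err.partial_root.findtext("envelope/to"))
```

Note that the unclosed element is still present in the partial tree, which is exactly the hazard raised in the quoted message: nothing in the tree itself marks it as incomplete, so a caller has to treat everything reachable from the exception as suspect.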

The list archives are at http://lists.xml.org/archives/xml-dev/

To subscribe or unsubscribe from this list use the subscription
manager: <http://lists.xml.org/ob/adm.pl>



Copyright 2001 XML.org. This site is hosted by OASIS