
   Partial documents in tree-based APIs


Streaming APIs like SAX and XMLPULL by their nature provide some of 
the content of a malformed document to the client application before 
the first well-formedness error is detected. The XML specification 
implicitly says this is OK, though in some use cases rollback or 
failure to commit may be desirable.
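To make the streaming behavior concrete, here is a minimal sketch using 
the SAX parser bundled with the JDK. The class name, method name, and 
sample message are made up for illustration. The handler records each 
start-tag as it arrives, so the list shows what the client had already 
seen when the parser gave up.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class StreamingLeak {

    // Returns the names of the elements whose start-tags were reported
    // to the handler before parsing stopped (normally or with an error).
    static List<String> elementsSeen(String xml) throws Exception {
        List<String> seen = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                seen.add(qName); // the client has already received this content
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (SAXParseException e) {
            // Well-formedness error: the earlier callbacks cannot be undone.
            System.err.println("Not well-formed: " + e.getMessage());
        }
        return seen;
    }

    public static void main(String[] args) throws Exception {
        // <payload> is never closed, so the document is malformed.
        System.out.println(elementsSeen(
            "<message><envelope/><payload></message>"));
    }
}
```

Any rollback has to be arranged by the application itself, for example 
by buffering results until endDocument is reached.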

Now consider a tree-based API such as DOM, JDOM, or XOM that 
encounters a well-formedness error. Traditionally, these APIs have 
reported nothing from a malformed document to the client 
application. Recently, however, Laurent Bihanic submitted a patch to 
JDOM that makes as much of the document as could be parsed 
successfully available through the exception thrown to report the 
well-formedness error. This was quite clever. It had never occurred 
to me, and I had never noticed any other API doing anything similar.
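The idea can be sketched without JDOM at all. The following is not 
Bihanic's patch and does not reproduce JDOM's API. It is a rough 
illustration built on the JDK's SAX parser and W3C DOM classes, and 
PartialTreeDemo, PartialDocumentException, and getPartialDocument are 
hypothetical names chosen for this example. A SAX handler builds the 
tree, and when the parser reports a well-formedness error, the 
half-built Document leaves the builder inside the exception.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class PartialTreeDemo {

    // Hypothetical exception type: carries whatever was built before the error.
    static class PartialDocumentException extends Exception {
        private final Document partial;
        PartialDocumentException(SAXParseException cause, Document partial) {
            super(cause.getMessage(), cause);
            this.partial = partial;
        }
        Document getPartialDocument() { return partial; }
    }

    // Builds a DOM tree from SAX events. On a well-formedness error the
    // half-built tree is attached to the exception instead of being dropped.
    static Document build(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Deque<Node> open = new ArrayDeque<>(); // stack of currently open nodes
        open.push(doc);
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                Element e = doc.createElement(qName);
                for (int i = 0; i < atts.getLength(); i++) {
                    e.setAttribute(atts.getQName(i), atts.getValue(i));
                }
                open.peek().appendChild(e);
                open.push(e);
            }
            @Override
            public void endElement(String uri, String localName, String qName) {
                open.pop();
            }
            @Override
            public void characters(char[] ch, int start, int length) {
                if (open.size() > 1) { // only add text inside an element
                    open.peek().appendChild(
                        doc.createTextNode(new String(ch, start, length)));
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (SAXParseException e) {
            throw new PartialDocumentException(e, doc);
        }
        return doc;
    }

    public static void main(String[] args) throws Exception {
        // The unescaped '<' in the payload makes the document malformed.
        String xml = "<message><envelope to='bob'/>"
                   + "<payload>5 < 6</payload></message>";
        try {
            build(xml);
        } catch (PartialDocumentException e) {
            // The envelope was parsed before the error, so it is still readable.
            Element env = (Element) e.getPartialDocument()
                    .getElementsByTagName("envelope").item(0);
            System.out.println("to = " + env.getAttribute("to"));
        }
    }
}
```

A caller that does not care about the partial tree simply never calls 
getPartialDocument and sees an ordinary parse failure.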

I'd like to get broader discussion of whether this is a good idea. 
There are certainly use cases for it. Bihanic wanted to read the 
envelope of an XML message even if the data was malformed. However, 
there are also problems. For instance, if the well-formedness error 
is a missing end-tag, then the element with the missing end-tag will 
still appear in the partial tree, indistinguishable from a complete 
element. And if the problem is a missing root element, then this may 
produce a Document object with no root element. On the other hand, 
rollback, failure to commit, or simply ignoring the malformed 
document is much easier than with a streaming API, because the 
application learns that the document is malformed before it has 
processed any of the content.
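The rootless case is easy to reproduce with the W3C DOM classes in the 
JDK. In this small sketch an empty Document stands in for the partial 
tree of an input that failed before any root start-tag was read. 
RootlessCheck and hasRoot are hypothetical names, and the check is the 
kind of guard that code receiving a partial tree would presumably need.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class RootlessCheck {

    // Guard for code that receives a possibly incomplete tree.
    static boolean hasRoot(Document doc) {
        return doc != null && doc.getDocumentElement() != null;
    }

    public static void main(String[] args) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        // No root element yet: getDocumentElement() returns null.
        System.out.println(hasRoot(doc));
        doc.appendChild(doc.createElement("message"));
        System.out.println(hasRoot(doc));
    }
}
```

Code written against complete documents often calls 
getDocumentElement() and dereferences the result without a check, 
which is one way a partial tree could surprise existing clients.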

Is this approach something to be encouraged? Should other tree-based 
APIs like XOM and DOM copy this innovation? What advantages and 
disadvantages have I not thought of?
-- 

+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
|           Processing XML with Java (Addison-Wesley, 2002)          |
|              http://www.cafeconleche.org/books/xmljava             |
| http://www.amazon.com/exec/obidos/ISBN%3D0201771861/cafeaulaitA  |
+----------------------------------+---------------------------------+
|  Read Cafe au Lait for Java News:  http://www.cafeaulait.org/      |
|  Read Cafe con Leche for XML News: http://www.cafeconleche.org/    |
+----------------------------------+---------------------------------+




 


Copyright 2001 XML.org. This site is hosted by OASIS