OASIS Mailing List Archives

RE: [xml-dev] To continue parsing after a fatal error.

At 12:39 PM 23/10/01 -0700, Joshua Allen wrote:
>This error should occur with any conforming XML processor.  

Indeed.  In fact the whole notion of trying to parse an 800M
doc in one pass seems questionable to me, for exactly the 
problem you're having; in the real world errors will sometimes
happen and you'd like to lose the minimum data.  And Joshua's
right, there's no good way to make an XML parser ignore errors.

In particular, you should challenge whoever's sending you
this stuff because if they claim they're sending you XML
they're incorrect.  There is a very formal and precise
definition of what XML is, and your parser is correctly
claiming that what they're sending you isn't XML; that's why
it's refusing to process it.

On the other hand, if the situation is that the document
looks like

 <file>
  <record> ... </record>
  .... repeat many times ...
 </file>

Then a very simple pass-through filter can partition this as it 
goes by and make the downstream code see it as a large number of
more reasonably sized <record> docs rather than as one big <file>
doc.  And if one of the <record> docs contains garbage and blows
up, that's probably a little less painful than losing the whole
800M.   -Tim
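
A minimal sketch of such a pass-through filter, in Python, might look like the following. It is only an illustration under some assumptions that are not in the original message: the records are not nested, the text "<record" and "</record>" never appears inside comments or CDATA, and the records do not depend on namespaces or entities declared on the enclosing <file> element. The names split_records and parse_records are made up for this example. The splitting is purely textual, so a record whose own end tag is damaged will swallow the text up to the next "</record>".

```python
import io
import re
import xml.etree.ElementTree as ET


def split_records(stream, tag="record", chunk_size=64 * 1024):
    """Yield the raw text of each <tag>...</tag> span in a text stream.

    The stream is scanned as plain text and never parsed as a whole,
    so garbage in one record cannot abort the entire pass.
    """
    start_re = re.compile(r"<%s[\s>]" % re.escape(tag))
    end_tag = "</%s>" % tag
    buf = ""
    eof = False
    while not eof:
        chunk = stream.read(chunk_size)
        if not chunk:
            eof = True
        buf += chunk
        while True:
            m = start_re.search(buf)
            if m is None:
                # Keep a short tail in case a start tag straddles two chunks.
                buf = buf[-(len(tag) + 1):]
                break
            end = buf.find(end_tag, m.start())
            if end == -1:
                # Record is incomplete: drop what precedes it and read more.
                buf = buf[m.start():]
                break
            end += len(end_tag)
            yield buf[m.start():end]
            buf = buf[end:]


def parse_records(stream, tag="record", chunk_size=64 * 1024):
    """Parse each record as its own small document.

    Yields (raw_text, element, error); exactly one of element/error is None.
    """
    for raw in split_records(stream, tag, chunk_size):
        try:
            yield raw, ET.fromstring(raw), None
        except ET.ParseError as err:
            yield raw, None, err


# Hypothetical sample: the second record is not well-formed.
SAMPLE = (
    "<file>"
    "<record><a>1</a></record>"
    "<record><a>2</b></record>"
    "<record><a>3</a></record>"
    "</file>"
)

for raw, elem, err in parse_records(io.StringIO(SAMPLE)):
    if err is None:
        print("ok:", elem.find("a").text)
    else:
        print("skipped bad record:", err)
```

Each record goes through a conforming parser on its own, so the parser still refuses the broken one; the difference is that the loss is limited to that record instead of the whole file.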