Re: [xml-dev] To continue parsing after a fatal error.




> This is a good and necessary reason, although it is not much consolation
> when you are stuck trying to clean up bad XML.  We should have some
> tools to make this easier..

I, for one, am excited to see this. Out of curiosity, if there were a
"scrubber" that weeded out bogus characters before passing on the stream,
would it do anything to handle bogus char refs as well, e.g. &#????; where
the referenced char is not a legal XML character? This would produce the same
error but has a different cause. I bring this up only because it may be an
important distinction down the road for someone like Anoop: the scrubber may
not solve his problem if the cause is a bogus char ref.
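
To make the distinction concrete, here is a rough sketch of such a scrubber
in Python. It is only an illustration under assumptions, not a tested tool:
the names scrub and scrub_chunks are made up for this example, the input is
assumed to be already decoded text, and "legal" means the XML 1.0 Char
production. It handles both cases: raw characters outside that production,
and numeric char refs (&#N; or &#xN;) that point outside it.

```python
import re

# Raw characters that fall outside the XML 1.0 Char production.
_INVALID_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)
# Numeric character references: &#x<hex>; or &#<decimal>;
_CHAR_REF = re.compile(r"&#(?:x([0-9A-Fa-f]+)|([0-9]+));")


def _is_xml_char(cp):
    """True if the code point is allowed by the XML 1.0 Char production."""
    return (cp in (0x9, 0xA, 0xD) or 0x20 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD or 0x10000 <= cp <= 0x10FFFF)


def _fix_ref(match, replacement):
    hex_digits, dec_digits = match.group(1), match.group(2)
    cp = int(hex_digits, 16) if hex_digits is not None else int(dec_digits)
    # Keep references to legal characters untouched; replace the rest.
    return match.group(0) if _is_xml_char(cp) else replacement


def scrub(text, replacement=""):
    """Replace illegal raw characters and illegal numeric char refs."""
    text = _INVALID_CHARS.sub(lambda m: replacement, text)
    return _CHAR_REF.sub(lambda m: _fix_ref(m, replacement), text)


def scrub_chunks(chunks, replacement=""):
    """Scrub an iterable of text chunks without loading the whole file.

    A reference may straddle a chunk boundary, so a short trailing piece
    starting at '&' with no ';' yet is held back for the next chunk. The
    16-character limit is an arbitrary heuristic chosen for this sketch.
    """
    tail = ""
    for chunk in chunks:
        text = tail + chunk
        amp = text.rfind("&")
        if amp != -1 and ";" not in text[amp:] and len(text) - amp < 16:
            text, tail = text[:amp], text[amp:]
        else:
            tail = ""
        yield scrub(text, replacement)
    if tail:
        yield scrub(tail, replacement)
```

The scrubbed chunks could then be fed to an incremental parser instead of the
raw stream. One caveat with this approach: inside a CDATA section, "&#0;" is
literal text rather than a reference, and this sketch would still rewrite it.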

Thanks!
Jeff Rafter
Defined Systems
http://www.defined.net
XML Development and Developer Web Hosting



> -----Original Message-----
> From: Jeff Greif [mailto:jgreif@alumni.princeton.edu]
> Sent: Tuesday, October 23, 2001 12:35 PM
> To: Anoop A V; xml-dev@lists.xml.org
> Subject: Re: [xml-dev] To continue parsing after a fatal error.
>
> Normally you might attempt to deal with this kind of problem using a custom
> SAX error handler.  In MSXML3, however, you may not be able to do this,
> because the underlying parsing code makes all errors fatal (it always calls
> the error handler's fatalError() method, never its error() or
> ignorableWarning() methods). It appears that treating all errors as fatal
> limits recovery options.
> Details (not very many) are here:
> http://msdn.microsoft.com/library/en-us/xmlsdk30/htm/isaxerrorhandler_interface.asp?frame=true
>
> I only looked this up out of curiosity.  I have not tried it myself and am
> not pretending to be authoritative.
>
> Jeff
> ----- Original Message -----
> From: "Anoop A V" <anoop_scorpio@hotmail.com>
> To: <xml-dev@lists.xml.org>
> Sent: Tuesday, October 23, 2001 10:51 AM
> Subject: [xml-dev] To continue parsing after a fatal error.
>
>
> > Hi,
> > I have an 800 MB file which I need to parse. When I do this using the
> > MSXML SAX parser, I get a fatal error with the message "Invalid character
> > found in text content", and parsing stops. But I need to continue parsing
> > the file even if an invalid character is met. I don't mind if that
> > particular node(s) is skipped. But I need to parse the whole file. This
> > file is not under my control, so there is no question of my being able to
> > edit this file and remove the invalid characters. Can anybody help?
> >
> > Thanks.
> > Anoop.
> >
> > -----------------------------------------------------------------
> > The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
> > initiative of OASIS <http://www.oasis-open.org>
> >
> > The list archives are at http://lists.xml.org/archives/xml-dev/
> >
> > To subscribe or unsubscribe from this elist use the subscription
> > manager: <http://lists.xml.org/ob/adm.pl>
> >