RE: [xml-dev] To continue parsing after a fatal error.
- From: Joshua Allen <joshuaa@microsoft.com>
- To: Anoop A V <anoop_scorpio@hotmail.com>, xml-dev@lists.xml.org
- Date: Tue, 23 Oct 2001 12:39:59 -0700
This error should occur with any conforming XML processor. It is quite
likely caused by a control character in the low ASCII range (below
0x20); XML 1.0 allows only tab, line feed, and carriage return there.
The only way to avoid the problem is to clean up the XML on the way in,
before it is processed by MSXML. Unfortunately, I am not aware of a way
to do this without writing code to pipe the input stream through a
scrubber before passing it to MSXML. Julia will know whether any code
samples exist today (I doubt it).
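
For illustration only, here is a minimal sketch of such a scrubber in
Python. It is not an MSXML sample. The name scrub_stream and the chunk
size are hypothetical, and it assumes the file is UTF-8 or another
ASCII-compatible encoding, where bytes below 0x20 never occur inside a
multi-byte sequence. It would corrupt UTF-16 input, and it does not
catch other problems such as malformed byte sequences.

```python
# Bytes below 0x20 that XML 1.0 forbids: every control character
# except TAB (0x09), LF (0x0A) and CR (0x0D).
_ILLEGAL = bytes(b for b in range(0x20) if b not in (0x09, 0x0A, 0x0D))


def scrub_stream(src, dst, chunk_size=1 << 20):
    """Copy binary stream src to dst, dropping forbidden control bytes.

    Works chunk by chunk so a very large file never has to fit in
    memory. Returns the number of bytes removed.
    Assumption: the input is UTF-8 or another ASCII-compatible encoding.
    """
    removed = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        # translate(None, delete) keeps every byte not listed in delete.
        cleaned = chunk.translate(None, _ILLEGAL)
        removed += len(chunk) - len(cleaned)
        dst.write(cleaned)
    return removed

# Possible usage (file names are placeholders):
#   with open("input.xml", "rb") as src, open("clean.xml", "wb") as dst:
#       scrub_stream(src, dst)
```

The cleaned copy can then be handed to the parser in place of the
original. Note that this drops the offending characters rather than
skipping the nodes that contain them.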
Thanks,
Joshua
> -----Original Message-----
> From: Anoop A V [mailto:anoop_scorpio@hotmail.com]
> Sent: Tuesday, October 23, 2001 10:51 AM
> To: xml-dev@lists.xml.org
> Subject: [xml-dev] To continue parsing after a fatal error.
>
> Hi,
> I have an 800 MB file which I need to parse. When I do this using MSXML
> SAX parser, I get a fatal error with the message "Invalid character
> found in text content". And the parsing will be stopped. But I need to
> continue parsing the file even if an invalid character is met. I don't
> mind if that particular node(s) is skipped. But I need to parse the
> whole file. This file is not under my control, so there is no question
> of my being able to edit this file and remove the invalid characters.
> Can anybody help?
>
> Thanks.
> Anoop.