RE: [xml-dev] entity references for special characters and the Sax parser


There is no reason why any text should be getting cut off. However, it is
important to remember that the parser is free to break up the text content
into chunks, and it may well be that, for implementation reasons, the
parser is delivering the text before the entity reference, the replacement
character itself, and the text after it in separate calls to your
ContentHandler's "characters" method. So make sure you have not coded with
the assumption that a single call to "characters" gives you the entire
content of that element.

The typical pattern here is to maintain an internal StringBuffer and keep
appending the characters received in each call to "characters" until you
get an "endElement" call. Only when you get the "endElement" call should
you process the string and clear the buffer.
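
A minimal sketch of that pattern follows, for illustration only. The class
name TextCollector, the collect() helper and the "name=value" output format
are made up for this example and are not part of SAX or Xerces. It uses
StringBuilder where the text above says StringBuffer (either works here),
and it assumes elements that contain only text; mixed or nested content
would need a stack of buffers instead of a single one.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: accumulates text across "characters" calls and
// only looks at it once "endElement" arrives.
class TextCollector extends DefaultHandler {
    private final StringBuilder buffer = new StringBuilder();
    private final List<String> values = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) {
        // Start each element with an empty buffer.
        buffer.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times per element; never assume one call
        // carries the whole content.
        buffer.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // The element's text is complete only at this point.
        values.add(qName + "=" + buffer);
        buffer.setLength(0);
    }

    List<String> getValues() {
        return values;
    }

    // Hypothetical convenience method: parse an XML string with the
    // default JAXP SAX parser and return the collected values.
    static List<String> collect(String xml) throws Exception {
        TextCollector handler = new TextCollector();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.getValues();
    }
}
```

With this handler, TextCollector.collect("<tag1>abcd&amp;defg</tag1>")
yields the single entry "tag1=abcd&defg", regardless of how many
"characters" calls the parser chose to make.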

> -----Original Message-----
> From: Risman, Mark [mailto:mark.risman@ubsw.com]
> Sent: Wednesday, January 16, 2002 3:39 PM
> To: xml-dev@lists.xml.org
> Subject: [xml-dev] entity references for special characters and the Sax parser
>
> Hi,
>
> Has anyone else used the Sax parser to parse a given XML file with the
> Java method <saxParser>.parse(<filename>)?  If I call this with a file
> that has an ampersand in it (e.g. &amp;), the rest of the text within
> that value on either side of the special character will be cut off (I
> verified this by seeing what appears in the "characters" method).  Has
> anyone else observed this behavior?
>
> For example, if I have <tag1>abcd&amp;defg</tag1>, the resulting value
> for tag1 will be "abcd" or "defg".  If I have <tag2>&amp;&gt;</tag2>,
> the resulting value would be ">".
>
> I am using Xerces Java 1, version 1.3.0 (although version 1.4.4 seems
> to behave the same as 1.3.0 in this case).
>
> Any assistance would be greatly appreciated.
>
> - Mark