Re: [xml-dev] SAX Parsing of an XML file



Very briefly, your application would implement the ContentHandler
interface with something like:

boolean processingInfoContent=false;
StringBuffer infoText = new StringBuffer();

public void startElement(String uri, String localName, String qName,
Attributes atts)
{
   // SAX2 reports "no namespace" as an empty string, not null, and only
   // guarantees localName when the parser is namespace-aware.
   if ("".equals(uri) && localName.equals("info"))
      processingInfoContent = true;
}

public void characters(char[] ch, int start, int length)
{
   // May be called several times for one run of text, so append each chunk.
   if (processingInfoContent)
      infoText.append(ch, start, length);
}

public void endElement(String uri, String localName, String qName)
{
   if ("".equals(uri) && localName.equals("info"))
      processingInfoContent = false;
}

One thing to remember here is that the whitespace contained in infoText
depends on whether you are using a validating parser. If you are, and the
content models of <info> and <doc> are element-only (as opposed to mixed),
then the "ignorable" whitespace is reported through
ContentHandler.ignorableWhitespace rather than characters, and infoText
will be:

"Metoday"

If you are not, and the parser chooses to ignore the DTD (if there is one),
infoText will include all whitespace. That is, it will be:

"
      
        Me
        today
      
    "
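
For reference, here is a minimal sketch of how the handler above might be
wired into a complete JAXP program. The class name InfoTextExtractor and the
helper method extract are hypothetical names chosen for this illustration.
It assumes a non-validating, namespace-aware parser and a document with no
DTD, so all whitespace inside <info> arrives through characters():

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class: collects all character data inside <info>.
public class InfoTextExtractor extends DefaultHandler {
    private boolean processingInfoContent = false;
    private final StringBuilder infoText = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) {
        // An empty uri means the element is in no namespace.
        if (uri.isEmpty() && localName.equals("info")) {
            processingInfoContent = true;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // The parser may deliver text in several chunks; append them all.
        if (processingInfoContent) {
            infoText.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (uri.isEmpty() && localName.equals("info")) {
            processingInfoContent = false;
        }
    }

    // Hypothetical helper: parses the XML string and returns the text
    // collected from inside <info>.
    public static String extract(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        // JAXP factories are not namespace-aware by default; turn it on so
        // that localName is populated.
        factory.setNamespaceAware(true);
        SAXParser parser = factory.newSAXParser();
        InfoTextExtractor handler = new InfoTextExtractor();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.infoText.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml =
            "<xml><mytag><info><doc><author>Me</author>"
            + "<date>today</date></doc></info>"
            + "<importantbit>ignored</importantbit></mytag></xml>";
        System.out.println(extract(xml)); // prints "Metoday"
    }
}
```

The input in main has no whitespace between tags, which is why the result is
"Metoday" even without validation. Feed it the indented document from the
original question and the line breaks and indentation inside <info> will
appear in the result as well.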

-- Ron

Paul Johnston wrote:
> 
> Please excuse my ignorance of XML.
> 
> I've been using "XML" in various forms for a while now, but have been using DOM most of the time to do the work.
> 
> I've just picked up SAX and it's brilliant for the right purpose, however...
> 
> I want to be able to grab all of the character data for a specific tag (this is dummy XML and not very good dummy XML at that):
> 
> <xml>
>   <mytag>
>     <info>
>       <doc>
>         <author>Me</author>
>         <date>today</date>
>       </doc>
>     </info>
>     <importantbit>
>        .... doesn't matter what's in here ....
>     </importantbit>
>   </mytag>
> </xml>
> 
> My question is (using JAXP) how can I retrieve all the text in the <info></info> tags?  There isn't a simple way of doing it straight from the API.
> 
> Just wondering...
> 
> Paul
> 
> -----------------------------------------------------------------
> The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
> initiative of OASIS <http://www.oasis-open.org>
> 
> The list archives are at http://lists.xml.org/archives/xml-dev/
> 
> To subscribe or unsubscribe from this list use the subscription
> manager: <http://lists.xml.org/ob/adm.pl>

-- 
Ronald Bourret
Programming, Writing, and Training
XML, Databases, and Schemas
http://www.rpbourret.com