OASIS Mailing List Archives

   RE: [xml-dev] Problem parsing XML file with Xerces-J

I'm glad you've got it working. Looks good.

Michael Kay
http://www.saxonica.com/ 

> -----Original Message-----
> From: Midsummer Sun [mailto:midsummer.sun@gmail.com] 
> Sent: 01 April 2005 08:35
> To: Michael Kay
> Cc: xml-dev@lists.xml.org
> Subject: Re: [xml-dev] Problem parsing XML file with Xerces-J
> 
> > I think pre-editing the response XML (i.e. stripping the DTD
> > declaration) is better "for me". For my requirement, the DTD in the
> > XML is useless to me. Implementing EntityResolver imposes significant
> > performance overhead on my program. The parser is always polling for
> > callback events. So I think pre-editing with a simple string method is
> > far more efficient.
> 
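
For readers weighing the "simple string method" mentioned above, here is a minimal sketch of what such pre-editing could look like. The class and method names (DoctypeStripper, strip) are hypothetical and not from the original message. It assumes the first "<!DOCTYPE" in the string is the real declaration, and that any internal subset contains no comments with stray quotes or brackets; a real response may need more care.

```java
/** Hypothetical helper: removes the DOCTYPE declaration from an XML string. */
final class DoctypeStripper {
    private DoctypeStripper() {}

    /** Returns xml without its DOCTYPE declaration, or xml unchanged if none is found. */
    static String strip(String xml) {
        int start = xml.indexOf("<!DOCTYPE");
        if (start < 0) {
            return xml; // no DOCTYPE present
        }
        boolean inSubset = false; // true while inside the [ ... ] internal subset
        char quote = 0;           // the open quote character, or 0 when outside a literal
        for (int i = start; i < xml.length(); i++) {
            char c = xml.charAt(i);
            if (quote != 0) {
                if (c == quote) {
                    quote = 0; // closing quote of a literal
                }
            } else if (c == '"' || c == '\'') {
                quote = c;
            } else if (c == '[') {
                inSubset = true;
            } else if (c == ']') {
                inSubset = false;
            } else if (c == '>' && !inSubset) {
                // First '>' outside quotes and outside the internal subset ends the declaration.
                return xml.substring(0, start) + xml.substring(i + 1);
            }
        }
        return xml; // unterminated declaration: leave the input alone
    }
}
```

Calling DoctypeStripper.strip(rsp) before handing rsp to the parser would avoid any DTD lookup, at the cost of also discarding entity declarations and attribute defaults that the DTD might have supplied.
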
> I amend my above observation slightly..
> 
> My program is doing:
> DocumentBuilderFactoryImpl factory = new DocumentBuilderFactoryImpl();
> DocumentBuilder builder = factory.newDocumentBuilder();
> Document document = builder.parse(new InputSource(new StringReader(rsp)));
> 
> So I am using a DOM parser! But a DOM parser is probably using a SAX
> handler underneath (to build the DOM), i.e. a SAX handler dispatches
> events to the DOM parser as it reads the XML document, and the DOM
> implementation constructs a DOM object by "assembling input from the
> SAX implementation". I read this in a nice article somewhere.
> 
> My class implements the EntityResolver interface and calls
> builder.setEntityResolver(obj), i.e. it registers the class object
> itself (obj) as the EntityResolver. This is probably a very
> lightweight reference within the JVM, and nothing expensive worth
> worrying about.
> 
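
The registration described above might look roughly like the sketch below. It uses the standard JAXP DocumentBuilderFactory.newInstance() rather than the Xerces-specific DocumentBuilderFactoryImpl from the message, which is an assumption on my part. SkipDtdResolver, its calls counter, and the parse helper are hypothetical names. The resolver answers every external entity request with an empty stream, so no DTD is fetched.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

/** Hypothetical resolver that substitutes an empty document for any external entity. */
final class SkipDtdResolver implements EntityResolver {
    int calls = 0; // how many times the parser asked us to resolve something

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        calls++;
        // An empty character stream stands in for the DTD, so nothing is downloaded.
        return new InputSource(new StringReader(""));
    }

    /** Parses rsp into a DOM tree with the given resolver registered on the builder. */
    static Document parse(String rsp, EntityResolver resolver) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setEntityResolver(resolver);
            return builder.parse(new InputSource(new StringReader(rsp)));
        } catch (Exception e) {
            // Wrapped only to keep the sketch short; real code would handle these separately.
            throw new IllegalStateException(e);
        }
    }
}
```

Reading the calls field after a parse gives a direct way to check how often the callback actually fires for a given response, instead of guessing at the overhead.
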
> So the DOM parser starts to parse the document. If it encounters a DTD
> reference it will call the resolveEntity method. It will probably call
> this method after a full DOM tree is constructed (so that all entity
> references can be resolved). The resolveEntity method will be called
> only once. So there is no expensive processing going on, as I
> thought before ;)
> 
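
One way to test the guess above about when resolveEntity is called is to record the callback order with a plain SAX handler. The sketch below is only a probe, not the poster's code: EventOrderProbe and trace are hypothetical names, and it uses whatever SAX parser SAXParserFactory.newInstance() returns. I would expect the resolveEntity entry to appear before the root element's startElement entry, since the external DTD subset is read as part of the DOCTYPE declaration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical probe that logs the order of resolveEntity and startElement callbacks. */
final class EventOrderProbe extends DefaultHandler {
    final List<String> events = new ArrayList<>();

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        events.add("resolveEntity " + systemId);
        // Empty replacement for the DTD, so the probe needs no network access.
        return new InputSource(new StringReader(""));
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        events.add("startElement " + qName);
    }

    /** Parses xml and returns the recorded callbacks in the order they arrived. */
    static List<String> trace(String xml) {
        EventOrderProbe probe = new EventOrderProbe();
        try {
            // SAXParser.parse registers the DefaultHandler as entity resolver too.
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), probe);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return probe.events;
    }
}
```

Printing EventOrderProbe.trace(rsp) for a real response would show whether the resolver is consulted once and at which point relative to the element events.
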
> Please do correct me if I am wrong.
> 
> If the resource consumption of implementing EntityResolver is the same as
> that of the pre-editing solution (or there is only a very marginal
> difference), I'll prefer implementing the EntityResolver interface! It
> could be a USP in my application!
> 
> I am eagerly waiting for your opinion.
> 
> Best regards,
> 
Copyright 2001 XML.org. This site is hosted by OASIS