OASIS Mailing List Archives


Re: [xml-dev] best xml parser to use

I agree with Justin on these points: there is excess that can be avoided. The unused classes generated by xmlbeans would simply sit idle while the handful of classes that are actually needed do all the heavy lifting.
The finer point I would add, however, is that generating the code is a one-time effort, although that only holds as long as the schema, or structure, of the XML you're processing never changes.
One further point for discussion: I believe xmlbeans would be the most efficient choice in terms of coding effort, because the xmlbeans implementation does a lot of the heavy lifting for you, without your having to code many details explicitly, as I believe you would need to do with a strictly DOM-focused approach.
A big caveat, however, is that I haven't used XOM, and I have used JDOM and dom4j only minimally compared to xmlbeans. I'm sure others out there have more experience and can weigh in on that thought.
Overall, however, I believe the leanest and meanest approach for this, in terms of performance and resource consumption, is a StAX implementation.
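
For illustration only, here is a minimal sketch of what a StAX-based splitter might look like. It is not taken from any posted code: the class name StaxInvoiceSplitter, the element name passed in as invoiceName, and the Consumer<String> standing in for the message queue are all assumptions, and it presumes a JDK that bundles javax.xml.stream and java.util.function. It copies events between each invoice start and end tag into a String and hands that String to the sink.

```java
import java.io.Reader;
import java.io.StringWriter;
import java.util.function.Consumer;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper: streams a batch file and emits each invoice as a String.
final class StaxInvoiceSplitter {

    // invoiceName is an assumed local element name; sink stands in for the queue.
    static void split(Reader batch, String invoiceName, Consumer<String> sink)
            throws Exception {
        XMLEventReader in = XMLInputFactory.newInstance().createXMLEventReader(batch);
        XMLOutputFactory outFactory = XMLOutputFactory.newInstance();
        StringWriter buffer = null;
        XMLEventWriter out = null; // non-null only while inside an invoice
        int depth = 0;             // element nesting depth inside the current invoice

        while (in.hasNext()) {
            XMLEvent ev = in.nextEvent();
            if (out == null) {
                // Outside an invoice: wait for the next invoice start tag.
                if (ev.isStartElement()
                        && ev.asStartElement().getName().getLocalPart().equals(invoiceName)) {
                    buffer = new StringWriter();
                    out = outFactory.createXMLEventWriter(buffer);
                    out.add(ev);
                    depth = 1;
                }
                continue;
            }
            // Inside an invoice: copy every event through unchanged.
            out.add(ev);
            if (ev.isStartElement()) {
                depth++;
            } else if (ev.isEndElement() && --depth == 0) {
                // Matching end tag reached: hand the finished invoice to the sink.
                out.flush();
                out.close();
                sink.accept(buffer.toString());
                out = null;
            }
        }
        in.close();
    }
}
```

Only one invoice is held in memory at a time, which is where the resource savings over a tree-based approach would come from. The serialized text may differ cosmetically from the input (for example, an empty element may be written with separate start and end tags).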

On 9/6/06, Justin Edelson <justinedelson@gmail.com> wrote:
I'd go with a DOM-style API (either DOM, JDOM, dom4j, or XOM). The files aren't that big. Using SAX or StAX will require that you build up the extracted string manually, whereas with DOM the process looks like:

1) parse document
2) loop through invoice child elements
3) serialize each child element to a String and post
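
The three steps above can be sketched roughly as follows with the plain W3C DOM and the JAXP Transformer API. This is an illustrative sketch rather than tested production code: the class name DomInvoiceSplitter is hypothetical, it assumes every child element of the batch root is an invoice, and the returned list stands in for posting to the queue.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper: splits a batch document into one String per child element.
final class DomInvoiceSplitter {

    static List<String> split(String batchXml) throws Exception {
        // 1) parse document
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(batchXml)));

        // Identity transformer used purely as a serializer; no XML declaration
        // is written, so each String holds only the invoice element itself.
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        List<String> invoices = new ArrayList<>();
        // 2) loop through invoice child elements (assumed: all children of the root)
        for (Node n = doc.getDocumentElement().getFirstChild();
                n != null; n = n.getNextSibling()) {
            if (n.getNodeType() != Node.ELEMENT_NODE) {
                continue; // skip whitespace text nodes and comments between invoices
            }
            // 3) serialize each child element to a String (posting is left to the caller)
            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(n), new StreamResult(out));
            invoices.add(out.toString());
        }
        return invoices;
    }
}
```

JDOM, dom4j, and XOM each have their own parse and serialize calls, but the overall shape would be the same.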

You could use xmlbeans or another data binding framework, but then you're just substituting a schema-specific data model for the generic DOM data model. It doesn't sound like you care about the internal structure and content of an invoice, so generating the new Java classes required for data binding is unnecessary overhead.

On 9/6/06, K. W. Landry <kwlandry@gmail.com> wrote:
If you're coding in Java I'd suggest xmlbeans. I've found xmlbeans fast, easy, and quick to employ; it has been very handy in about half a dozen projects now.
You need to compile the schema, which produces Java code that then lets you reference any element directly. Then simply reference the invoice structure's topmost element and do as you wish: write the XML to the queue as simple text; or create a new XML document (this just provides the XML header at the start of the file), add only the copied element to it, and write that to the queue; or strip all or selected parts, and so on, before writing to the queue. Then iterate to the next invoice or batch file. It could be 20 lines of code, tops.
If you don't have a schema to feed into the schema compiler, there are a couple of tools with which you can build a schema, and a couple that infer a schema from sample XML.

On 9/6/06, petera <peter.anderson@egsgroup.com> wrote:


I have a particular problem to solve:

I have an XML batch file that contains individual XML invoices. I need to
extract these invoices one at a time and place them on a message queue,
i.e. I just need to get all the data between the invoice start and end tags,
put it in a string, and place it on a message queue (validation occurs on
the invoice itself on the receiver side).

What is likely to be my best approach in terms of performance: DOM (unlikely,
I guess), SAX, StAX, or simply writing a Java program using indexOf?

TIA Peter

View this message in context: http://www.nabble.com/best-xml-parser-to-use-tf2226882.html#a6171113
Sent from the Xml.org Dev forum at Nabble.com.


Copyright 1993-2007 XML.org. This site is hosted by OASIS