I haven't used Oracle's parser, and the last time I used the Forte 4GL parser
was about 2 years ago, so I can't comment specifically on either of these.
However, one thing you can investigate is whether either of them supports the
LexicalHandler interface (which is a SAX2 standard extension). If so, you can
use SAX to parse the document and build the DOM yourself in response to SAX
events. This is more work, but the LexicalHandler interface permits your
application to be notified of CDATA sections. If neither of them supports the
LexicalHandler interface, then you may want to explore using another parser.
Sun's Crimson (included in their JAXP distribution), Apache Xerces, and Aelfred
all support this interface. Microsoft's XML SDK versions 3.0 and higher also
support this interface; if you are running your code on the Microsoft platform,
this may be an option.
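
To make the SAX-to-DOM idea concrete, here is a rough sketch in Java using the
JAXP and SAX2 classes that ship with the JDK. The class name CDataDomBuilder
and its parse() helper are my own invention for illustration, not part of any
parser. It assumes the parser accepts the standard SAX2 lexical-handler
property, and it ignores namespaces, comments, and DTDs to stay short, so treat
it as a starting point rather than a finished implementation.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.LexicalHandler;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: builds a DOM from SAX events, keeping CDATA sections
// as CDATASection nodes instead of merging them into ordinary text.
class CDataDomBuilder extends DefaultHandler implements LexicalHandler {
    private final Document doc;
    private Node current;                 // node that receives new children
    private boolean inCData = false;      // true between startCDATA/endCDATA
    private final StringBuilder text = new StringBuilder();

    CDataDomBuilder() throws Exception {
        doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        current = doc;
    }

    // Turn buffered characters into a Text or CDATASection node.
    // Note: an empty CDATA section produces no node in this sketch.
    private void flushText() {
        if (text.length() == 0) return;
        String s = text.toString();
        current.appendChild(inCData ? doc.createCDATASection(s) : doc.createTextNode(s));
        text.setLength(0);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        flushText();
        Element e = doc.createElement(qName);
        for (int i = 0; i < atts.getLength(); i++) {
            e.setAttribute(atts.getQName(i), atts.getValue(i));
        }
        current.appendChild(e);
        current = e;
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        flushText();
        current = current.getParentNode();
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    // LexicalHandler callbacks: only the CDATA boundaries matter here.
    public void startCDATA() { flushText(); inCData = true; }
    public void endCDATA() { flushText(); inCData = false; }
    public void startDTD(String name, String publicId, String systemId) { }
    public void endDTD() { }
    public void startEntity(String name) { }
    public void endEntity(String name) { }
    public void comment(char[] ch, int start, int length) { }

    static Document parse(String xml) throws Exception {
        CDataDomBuilder handler = new CDataDomBuilder();
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(handler);
        // Standard SAX2 property id; a parser without LexicalHandler support
        // throws SAXNotRecognizedException or SAXNotSupportedException here.
        reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return handler.doc;
    }
}
```

Calling CDataDomBuilder.parse(xml) on a document containing a CDATA section
should give you a DOM in which that section is a separate CDATASection node.
The setProperty call is also a quick way to test whether a given parser
supports the interface at all: if it throws, the parser does not.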

I would advise against trying to write your own XML parser. There are a number
of hidden nuances that are not evident up front; writing an XML parser is not
as trivial as it may at first appear.