
   SAX compatibility & WF parsing questions

  • From: Juergen Modre <jmodre@edu.uni-klu.ac.at>
  • To: xml-dev@ic.ac.uk
  • Date: Sun, 07 Jun 1998 21:09:30 +0000

Hello,

Here are a few questions about the SAX interface and XML parsing that
arose while I implemented the SAX interface and compared the existing
implementations.

Thanks for any hints.
And forgive me if something is crystal clear to everybody except me.

1.) Parser.java
Parser.java has the following javadoc header:
  * <p>All SAX parsers must also implement a zero-argument constructor
  * (though other constructors are also allowed).</p>
What exactly does this requirement mean for an implementation?
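
One plausible reading, offered only as a sketch: the requirement exists so
that a helper can create a parser from nothing but its class name (for
example the name passed via the org.xml.sax.parser property), which works
through reflection only if a zero-argument constructor is present. The
class HypotheticalParser below is an invented stand-in, not part of SAX.

```java
final class ZeroArgDemo {
    /** Hypothetical stand-in for a SAX parser class; not part of SAX. */
    public static final class HypotheticalParser {
        final boolean validating;

        /** The zero-argument constructor the javadoc asks for. */
        public HypotheticalParser() { this(false); }

        /** An additional constructor, which the javadoc explicitly allows. */
        public HypotheticalParser(boolean validating) { this.validating = validating; }
    }

    /** Creates an instance knowing only the class name, as a factory would. */
    static Object instantiate(String className) throws ReflectiveOperationException {
        Class<?> cls = Class.forName(className);
        // Fails with NoSuchMethodException if no zero-argument constructor exists.
        return cls.getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        Object parser = instantiate(HypotheticalParser.class.getName());
        System.out.println(parser.getClass().getName());
    }
}
```

Without the zero-argument constructor, the reflective call has no way to
know which arguments to supply, so loading by name alone would fail.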

2.) SAX callback events
For which parts of an XML document should a SAX-compatible
parser issue SAX callbacks?

Looking at the XML start production
[1] document ::= prolog element Misc*

a) only from root-element to end of root element (= element production)
b) from root-element to end of XML file (= element and Misc* production)
c) the whole XML file (whole document production)

The DTDHandler.notationDecl() and DTDHandler.unparsedEntityDecl()
methods, at least, must always be called outside the element
production. But which of the options above is correct?
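
One way to explore the question empirically is to log the callbacks that an
existing parser issues for a document with a declaration in the prolog and
a processing instruction after the root element. The sketch below assumes
the parser bundled with a current JDK (a SAX2 parser reached through
javax.xml.parsers), so it shows what that one implementation does, not what
SAX 1.0 requires. The document text and the class name are made up for
illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class CallbackScopeDemo {
    // Prolog with a NOTATION declaration, a root element, and a trailing PI (Misc*).
    static final String DOC =
        "<?xml version=\"1.0\"?>\n"
      + "<!DOCTYPE doc [\n"
      + "  <!ELEMENT doc (#PCDATA)>\n"
      + "  <!NOTATION BMP SYSTEM \"abc.exe\">\n"
      + "]>\n"
      + "<doc>text</doc>\n"
      + "<?after done?>\n";

    /** Parses the document and returns the callbacks in the order received. */
    static List<String> events(String xml) throws Exception {
        final List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override public void startDocument() { log.add("startDocument"); }
            @Override public void endDocument() { log.add("endDocument"); }
            @Override public void notationDecl(String name, String publicId, String systemId) {
                log.add("notationDecl " + name);
            }
            @Override public void startElement(String uri, String local, String qName, Attributes atts) {
                log.add("startElement " + qName);
            }
            @Override public void endElement(String uri, String local, String qName) {
                log.add("endElement " + qName);
            }
            @Override public void processingInstruction(String target, String data) {
                log.add("processingInstruction " + target);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return log;
    }

    public static void main(String[] args) throws Exception {
        for (String event : events(DOC)) {
            System.out.println(event);
        }
    }
}
```

If the log shows the notation declaration before the root element and the
processing instruction after it, that parser reports events for the whole
document production, which corresponds to option c).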

3.) Return value of systemId and publicId
The SAX documentation often contains the following paragraph:
  * <p>If the system identifier is a URL, the SAX parser must
  * resolve it fully before reporting it to the application.</p>

Does a SAX-conformant parser therefore always have to report the
absolute URI for the systemId and publicId parameters?
For example, if the document declares:
  <!NOTATION BMP SYSTEM "abc.exe">
must the SAX parser then report, for instance:
  <!NOTATION BMP SYSTEM "file:/C:/Files/XML-Files/abc.exe">

Is this what the paragraph means?
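
As a sketch of what "resolve it fully" could mean for the system
identifier, the relative reference can be resolved against the URI of the
entity that contains the declaration. The base URI
file:/C:/Files/XML-Files/doc.xml is an assumption made up to match the
example above, and java.net.URI stands in for whatever resolution code a
parser would actually use. A public identifier is a name rather than a
URL, so this sketch leaves it alone.

```java
import java.net.URI;

final class ResolveSystemIdSketch {
    /**
     * Resolves a system identifier against the URI of the document that
     * declared it. Identifiers that are already absolute come back unchanged.
     */
    static String resolve(String baseUri, String systemId) {
        return URI.create(baseUri).resolve(systemId).toString();
    }

    public static void main(String[] args) {
        // Hypothetical location of the document containing the NOTATION declaration.
        String base = "file:/C:/Files/XML-Files/doc.xml";
        System.out.println(resolve(base, "abc.exe"));
    }
}
```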

4.) WF parsing and: characters vs. ignorableWhitespace
Looking at the XML start production
[1] document ::= prolog element Misc*

For whitespace in the prolog and in Misc, a parser should always report
ignorableWhitespace.
For whitespace inside the element production, what should a parser
report when it only checks well-formedness (WF)?

a.) always characters (character data)
b.) always ignorableWhitespace
c.) it must be DTD-aware, i.e. report characters or
ignorableWhitespace according to the DTD
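
The options above can be probed with a small experiment: count how many
whitespace characters arrive through characters() and how many through
ignorableWhitespace(), once for a document without a DTD and once for the
same content with a DTD that declares element content. This is only a
sketch against the non-validating parser bundled with a current JDK (an
assumption, not one of the SAX 1.0 parsers under discussion), and the
sample documents are invented. A parser that sees no DTD cannot know which
whitespace is ignorable, so for the first document everything has to
arrive through characters().

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class WhitespaceDemo {
    static final String BODY = "<doc>\n  <a>x</a>\n</doc>";
    static final String NO_DTD = BODY;
    static final String WITH_DTD =
        "<!DOCTYPE doc [<!ELEMENT doc (a)><!ELEMENT a (#PCDATA)>]>\n" + BODY;

    /**
     * Returns {whitespace chars seen in characters(),
     *          chars seen in ignorableWhitespace()}.
     */
    static int[] count(String xml) throws Exception {
        final int[] n = new int[2];
        DefaultHandler handler = new DefaultHandler() {
            @Override public void characters(char[] ch, int start, int length) {
                for (int i = start; i < start + length; i++) {
                    if (Character.isWhitespace(ch[i])) n[0]++;
                }
            }
            @Override public void ignorableWhitespace(char[] ch, int start, int length) {
                n[1] += length;
            }
        };
        // Default factory settings: non-validating, but an internal DTD subset is still read.
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return n;
    }

    public static void main(String[] args) throws Exception {
        int[] a = count(NO_DTD);
        int[] b = count(WITH_DTD);
        System.out.println("no DTD:   characters=" + a[0] + " ignorable=" + a[1]);
        System.out.println("with DTD: characters=" + b[0] + " ignorable=" + b[1]);
    }
}
```

How the second line splits between the two counters shows whether that
particular parser behaves like option a) or option c) when it is not
validating.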

5.) ByteStreamDemo.java
When launched, this example prints an incorrect usage message:
Usage: java -Dorg.xml.sax.parser=<classname> SystemIdDemo <document>
should be
Usage: java -Dorg.xml.sax.parser=<classname> ByteStreamDemo <document>

6.) EntityResolver.java
I know that SAX 1.0 is finalized now, but I think the name "resolveExternalEntity"
would be better here than "resolveEntity" :-).

-----------------------------------------------
 JUERGEN MODRE
 Reisdorf 6
 A-9371 Brueckl
 Austria (Europe)

 Phone:   ++43 4214 2320
 Mobile:  ++43 664 233 22 22
 E-mail:  jmodre@edu.uni-klu.ac.at
 WWW:     http://www.edu.uni-klu.ac.at/~jmodre
-----------------------------------------------

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)





 
