- From: Chris Hubick <hubick@medlib.com>
- To: Xml-Dev <xml-dev@ic.ac.uk>
- Date: Wed, 03 Dec 1997 18:11:19 -0700
I am writing a recursive descent XML parser in Java and have
a couple of questions.
The XML Working Draft dated 17-November-1997 states:
[24] prolog ::= XMLDecl? Misc* (doctypedecl Misc*)?
[28] Misc ::= Comment | PI | S
[19] PI ::= '<?' Name (S (Char* - (Char* '?>' Char*)))? '?>'
[25] XMLDecl ::= '<?xml' VersionInfo EncodingDecl? SDDecl? S? '?>'
[79] EncodingPI ::= '<?xml' S 'encoding' Eq QEncoding S? '?>'
Within a PI is the Name "xml" reserved? If it is, should
there not be a [wfc] on PI stating so?
By the current definition, any XMLDecl or EncodingPI is also
a valid PI. In a prolog an XMLDecl is optional, and is followed
by Misc, which includes PI.
OK, so I can have an XML file with no XMLDecl
(it's optional) followed by "<?xml version="blah" encoding=5?>", which
matches PI, in my Misc*. And this is legal? My parser will
accept it as such, but I wonder about other parsers.
It makes detecting a bad XMLDecl impossible! My parser will just
say fine, that wasn't an XMLDecl, and feed it to Misc, which will
most likely match (or possibly spew) it as a PI.
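A minimal sketch of the kind of check a [wfc] on PI would imply, assuming the Name "xml" is treated as reserved (the draft as quoted above does not say so). The class and method names are hypothetical, and the exact-match comparison against "xml" is likewise an assumption:

```java
// Hypothetical helper, not taken from the draft or from any existing parser.
final class PiTargetCheck {

    // Extracts the target Name from a complete PI such as "<?name data?>".
    static String piTarget(String pi) {
        if (!pi.startsWith("<?") || !pi.endsWith("?>") || pi.length() < 4) {
            throw new IllegalArgumentException("not a PI: " + pi);
        }
        int end = pi.length() - 2; // index of the closing "?>"
        int i = 2;                 // first character after "<?"
        while (i < end && !Character.isWhitespace(pi.charAt(i))) {
            i++;
        }
        return pi.substring(2, i);
    }

    // Assumption: only the exact name "xml" is reserved.
    static boolean isReservedTarget(String name) {
        return name.equals("xml");
    }
}
```

With a check like this, the Misc branch could reject "<?xml version="blah" encoding=5?>" as a malformed XMLDecl instead of accepting it as an ordinary PI.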
Shouldn't [19] PI have an S? at the end before '?>' ?
Also, shouldn't PCData be:
[17] PCData ::= [^<&]+
rather than the current:
[17] PCData ::= [^<&]*
[44] content ::= (element | PCData | Reference | CDSect | PI | Comment)*
because:
<TEST>This is a test</TEST>
in my recursive descent parser parses to:
<Element>
<STag><TEST></STag>
<content>
<PCData>This is a test</PCData>
<PCData></PCData>
<PCData></PCData>
<PCData></PCData>
<PCData></PCData>
<PCData></PCData>
...
And we get infinite matches on a zero length PCData.
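The loop can be sketched in Java as follows. This is a simplified illustration that handles only the PCData alternative of content, with hypothetical class and method names. It treats a zero-length match as "no match", which is the [^<&]+ reading:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; only the PCData branch of [44] content is modelled.
final class PCDataDemo {

    // Returns the index just past the longest run of characters
    // that are neither '<' nor '&', starting at pos.
    static int matchPCData(String in, int pos) {
        int i = pos;
        while (i < in.length() && in.charAt(i) != '<' && in.charAt(i) != '&') {
            i++;
        }
        return i;
    }

    // Collects PCData runs the way a content ::= (... | PCData | ...)* loop would.
    static List<String> pcDataRuns(String in, int pos) {
        List<String> runs = new ArrayList<>();
        while (true) {
            int end = matchPCData(in, pos);
            if (end == pos) {
                // Zero-length match. Under [^<&]* this would count as a
                // successful PCData and the loop would never terminate.
                break;
            }
            runs.add(in.substring(pos, end));
            pos = end;
        }
        return runs;
    }
}
```

Called on "This is a test</TEST>" from position 0, pcDataRuns returns a single run, "This is a test", and then stops at the '<'.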