- From: Chris Hubick <firstname.lastname@example.org>
- To: email@example.com
- Date: Mon, 4 May 1998 13:05:05 +0000 (GMT)
I am writing an XML analysis tool upon which I am building a
parser. At the bottom level it reads in XML and generates
start/end/character events based on the productions in the XML spec. For
example, when the parser encounters something matching the Name
production, say the Name "foo", it would generate events:
When it is complete I hope to build an actual parser on top of it.
The analyzer can currently read almost any document. All it lacks, and
what I am working on now, is support for the markupdecl production and
all of its dependencies. This is where I hit a snag.
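As a rough illustration of the start/end/character events described above for the Name production, here is a minimal Python sketch. The event tuples, the function name `name_events`, and the ASCII-only approximation of Name are all assumptions made for the example; they are not taken from the analyzer itself.

```python
import re

# Simplified, ASCII-only approximation of the XML Name production.
# The real production also admits many non-ASCII letters and digits.
NAME_RE = re.compile(r"[A-Za-z_:][A-Za-z0-9._:\-]*")


def name_events(text, pos=0):
    """Return (events, new_pos) for a Name starting at pos, or (None, pos)."""
    match = NAME_RE.match(text, pos)
    if match is None:
        return None, pos
    # Hypothetical event shapes: (kind, payload).
    events = [
        ("start", "Name"),
        ("characters", match.group()),
        ("end", "Name"),
    ]
    return events, match.end()


events, end = name_events("foo bar")
for event in events:
    print(event)
```

A higher-level parser could consume these tuples in order and decide for itself what a Name means in context.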
In the XML spec at:
> The markup declarations may be made up in whole or
> in part of the replacement text of parameter entities.
> The productions later in this specification for
> individual nonterminals (elementdecl, AttlistDecl,
> and so on) describe the declarations after all the
> parameter entities have been included.
I want the productions for an XML document BEFORE the parameter entities
have been included. I really think the XML spec should have included
productions for before as well as after PEReference inclusion.
I want to do PEReference inclusion at the parser level, not at my lower
"analyzer" level, which should generate events that directly reflect
what is in the document (before inclusion).
So for my purposes, I need to figure out the grammar for an _unprocessed_
So my first step/idea was to just look at the current grammar, and start
adding PEReferences where I thought necessary:
 elementdecl ::= '<!ELEMENT' S (Name | PEReference) S contentspec S? '>'
 cp ::= (Name | PEReference | choice | seq) ('?' | '*' | '+')?
 Mixed ::= '(' S? '#PCDATA' (S? '|' S? (Name | PEReference))* S? ')*' | '(' S? '#PCDATA' S? ')'
 AttlistDecl ::= '<!ATTLIST' S (Name | PEReference) AttDef* S? '>'
 AttDef ::= S (Name | PEReference) S AttType S DefaultDecl
 AttType ::= StringType | TokenizedType | EnumeratedType | PEReference
 NotationType ::= 'NOTATION' S '(' S? (Name | PEReference) (S? '|' S? (Name | PEReference))* S? ')'
 Enumeration ::= '(' S? (Nmtoken | PEReference) (S? '|' S? (Nmtoken | PEReference))* S? ')'
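To make the relaxed productions concrete, here is a small Python sketch that recognizes the modified elementdecl and reports whether the name slot holds a Name or an unexpanded PEReference. It is only a sketch under stated assumptions: Name is approximated with ASCII characters, contentspec is captured as raw text rather than parsed, and the function name and event tuples are hypothetical.

```python
import re

NAME = r"[A-Za-z_:][A-Za-z0-9._:\-]*"   # simplified, ASCII-only Name
PEREF = r"%" + NAME + r";"              # PEReference ::= '%' Name ';'
S = r"[ \t\r\n]+"                       # S ::= (#x20 | #x9 | #xD | #xA)+

# elementdecl ::= '<!ELEMENT' S (Name | PEReference) S contentspec S? '>'
ELEMENTDECL_RE = re.compile(
    r"<!ELEMENT" + S
    + r"(?:(?P<name>" + NAME + r")|(?P<peref>" + PEREF + r"))"
    + S
    + r"(?P<contentspec>[^>]+?)"        # left unparsed in this sketch
    + r"(?:" + S + r")?>"
)


def element_decl_events(decl):
    """Return events for one declaration, or None if it does not match."""
    m = ELEMENTDECL_RE.fullmatch(decl)
    if m is None:
        return None
    if m.group("name") is not None:
        slot = ("Name", m.group("name"))
    else:
        # The reference is reported as written, not expanded.
        slot = ("PEReference", m.group("peref"))
    return [
        ("start", "elementdecl"),
        slot,
        ("contentspec", m.group("contentspec")),
        ("end", "elementdecl"),
    ]


print(element_decl_events("<!ELEMENT %name; (#PCDATA)>"))
print(element_decl_events("<!ELEMENT para EMPTY >"))
```

The same pattern of alternation (Name | PEReference) could be applied to the other productions listed above; this sketch covers only elementdecl.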
Where I get really confused is:
 EntityValue ::= '"' ([^%&"] | PEReference | Reference)* '"' | "'" ([^%&'] | PEReference | Reference)* "'"
If, as the spec states, these are the declarations AFTER PE inclusion, how
can there be PEReferences???
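Whatever the intended reading of that production turns out to be, an analyzer that reports the document as written could tokenize an EntityValue literal into character runs, PEReferences, and References without expanding any of them. The Python sketch below does only that. The token names mirror the spec's nonterminals, but the function name, the tuple shapes, and the ASCII-only Name approximation are assumptions for illustration.

```python
import re

NAME = r"[A-Za-z_:][A-Za-z0-9._:\-]*"   # simplified, ASCII-only Name

# Alternatives inside an EntityValue body: PEReference, Reference
# (CharRef or EntityRef), or a run of characters other than '%' and '&'.
TOKEN_RE = re.compile("|".join([
    r"(?P<PEReference>%" + NAME + r";)",
    r"(?P<CharRef>&#[0-9]+;|&#x[0-9a-fA-F]+;)",
    r"(?P<EntityRef>&" + NAME + r";)",
    r"(?P<chars>[^%&]+)",
]))


def entity_value_tokens(literal):
    """Tokenize a quoted EntityValue without expanding any reference."""
    if len(literal) < 2 or literal[0] not in "\"'" or literal[-1] != literal[0]:
        raise ValueError("not a quoted literal")
    quote, body = literal[0], literal[1:-1]
    if quote in body:
        raise ValueError("delimiting quote appears inside the literal")
    tokens, pos = [], 0
    while pos < len(body):
        m = TOKEN_RE.match(body, pos)
        if m is None:
            raise ValueError(f"bare '%' or '&' at offset {pos}")
        tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


for token in entity_value_tokens('"La Peste: &#xA9; 1947 %pub;. &rights;"'):
    print(token)
```

A parser layered on top could then decide which of these tokens to replace and when, which keeps inclusion out of the analyzer level.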
Part of the reason I am writing this is to get a better grip on (read:
learn) XML. Any guidance would be much appreciated, thanks!
xml-dev: A list for W3C XML Developers. To post, mailto:firstname.lastname@example.org
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
List coordinator, Henry Rzepa (mailto:email@example.com)