   Parser Question.

  • From: "Brown, Bryan" <bryanb@upshot.com>
  • To: "'xml-dev@xml.org'" <xml-dev@xml.org>
  • Date: Wed, 31 May 2000 15:14:22 -0700

I have written a parser and I have a couple of questions that someone might
be kind enough to answer for me.

Question 1.
The XML spec states that the internal DTD subset is considered to occur
before the external DTD subset if both are declared.

So if I have an external DTD, mydtd.dtd, like the following:
<!NOTATION gif PUBLIC "gifviewer.exe">

and a document like
<!DOCTYPE doc SYSTEM "mydtd.dtd" [
<!ENTITY picture SYSTEM "picture.gif" NDATA gif>
]>

then a parser should issue an error, because the entity declaration in the
internal subset references a notation "gif" which has not yet been declared.
But one of the Sun tests in the XML conformance suite contains a very
similar example, and the conformance test says that it should be a valid
document. Is this correct?
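
A minimal sketch of one way a parser could come to accept such a document,
assuming the "notation declared" check is deferred until both subsets have
been read rather than applied at the point of the entity declaration. The
function name and the tuple layout of the declarations are hypothetical,
chosen only for illustration.

```python
def undeclared_notations(internal_decls, external_decls):
    """Return (entity, notation) pairs whose notation is never declared.

    Hypothetical input format: each declaration is a tuple, either
    ("notation", name) or ("unparsed_entity", entity_name, notation_name).
    """
    notations = set()
    references = []
    # The internal subset is processed first, then the external subset.
    for decl in list(internal_decls) + list(external_decls):
        if decl[0] == "notation":
            notations.add(decl[1])
        elif decl[0] == "unparsed_entity":
            references.append((decl[1], decl[2]))
    # Only now, with the whole DTD read, compare references to declarations.
    return [(ent, nota) for ent, nota in references if nota not in notations]


# The example from this message: entity in the internal subset,
# notation in the external subset.
internal = [("unparsed_entity", "picture", "gif")]
external = [("notation", "gif")]
problems = undeclared_notations(internal, external)
```

Under that assumption, the order of the two declarations does not matter;
only a notation that is missing from both subsets is reported.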

Question 2.
There is a validity constraint on the standalone declaration that seems
complicated to implement, and the XML spec goes out of its way to note that
the standalone declaration only denotes the presence of external
declarations; it makes no statement as to parser behaviour (this is also
mentioned in Tim Bray's annotated spec).

So the question is: if you are validating, do you do anything with the
standalone declaration, and if so, when in the parsing process (every time
you parse an entity, an attribute, and element content)?

Question 3.
In the external subset, PE references can occur anywhere. This seems to me
to make parsing an external-subset production very complicated, because I
need to check at each step of the way whether the next token is a PE
reference. Does anyone have a better way? And why is this allowed in the
spec? Is there really that much value in being able to specify

<!ENTITY % e2 "(e3|e4)">
<!ELEMENT e2 %e2;>

instead of
<!ENTITY % e2 "<!ELEMENT e2 (e3|e4)>">

It seems to me that this has the same effect without making a parser
implementor's life a hell of a lot more difficult.
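
One common way to avoid checking for a PE reference in every production is
to expand references in a layer below the grammar code. The following is a
rough sketch of that idea, not a conforming implementation: it assumes all
parameter entities are already collected in a dictionary (here the
hypothetical `pe_table`), and it ignores the places where a reference must
not be expanded, such as comments. The padding with one space on each side
follows the spec's rule for PE replacement text included in the DTD.

```python
import re

# Matches %name; but not the "% " that follows <!ENTITY in a PE declaration.
PE_REF = re.compile(r"%([A-Za-z_:][\w.:-]*);")


def expand_pe_refs(text, pe_table, depth=0):
    """Replace each %name; with its replacement text, padded with spaces.

    pe_table is a hypothetical dict mapping PE names to replacement text.
    """
    if depth > 20:
        # Crude guard against a parameter entity that refers to itself.
        raise ValueError("parameter entities nested too deeply")

    def substitute(match):
        name = match.group(1)
        if name not in pe_table:
            raise KeyError("undeclared parameter entity: " + name)
        # Replacement text may itself contain PE references.
        return " " + expand_pe_refs(pe_table[name], pe_table, depth + 1) + " "

    return PE_REF.sub(substitute, text)


expanded = expand_pe_refs("<!ELEMENT e2 %e2;>", {"e2": "(e3|e4)"})
```

With this in place, the code that parses an element declaration sees only
the expanded text and does not need to know a PE reference was ever there.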

Question 4.
The spec states that if a SystemLiteral has a fragment identifier, the
parser may signal an error, yet the conformance tests include a document
which is supposed to be not-wf because it has a SystemLiteral with a
fragment identifier. So is it an error, or is it up to the parser what to do?

Question 5.
Is checking that element content models are deterministic optional or
required for a conforming parser?



