   Simple XML conformance

  • From: Peter Murray-Rust <peter@ursus.demon.co.uk>
  • To: "'XML Dev'" <xml-dev@ic.ac.uk>
  • Date: Mon, 17 Jan 2000 22:19:28 +0000

I have been preparing a set of XML documents and a collection of XML-aware
tools to introduce newcomers to XML (on our VirtualXML course). I have
encountered a surprising number of cases where an XML tool is unable to
read an XML document. [There is not meant to be anything tricky here since
Henry and I are actually trying to demonstrate how to learn XML by doing.
We are not looking to "torture" the tools - more the reverse.]

As a collection of XML documents I took:
	Jon Bosak's Shakespeare (elements and DTD)
	http://www.w3.org/TR/1998/REC-xml-19980210.xml (elements, attributes and
	DTD (with PEs))
	http://www.w3.org/TR/DOM-Level-2/ (elements, attributes, entities and
	DTD (with PEs and GEs)) [I point out that this is an excellent document
	for showing a wide range of XML constructs in a meaningful way.]
	(and a number of examples distributed with tools, including my own).

Here are some of the problems (I will not list the tools explicitly):
	- tool threw a fatal error because <?xml version="1.0"?> was absent
	- tool threw a fatal error because <!DOCTYPE was missing
	- REC-xml and DOM specify DTD but spec.dtd is not mounted
	- One content model in spec.dtd appeared to be inconsistent with the
REC-xml (I may have the wrong spec.dtd but it was downloaded from w3.org)
	- one tool "skipped" general entity references (i.e. did not expand them)
and threw a content model error
	- one tool regarded undeclared parameter entities in comments (in
spec.dtd) as errors
	- several tools regard the absence of a DTD as a fatal error (i.e. they
appear to be validating by default).
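
Several of the failures listed above concern documents that are well-formed but carry no XML declaration or no DOCTYPE. As a minimal sketch of the behaviour one would expect from a non-validating parser, the following uses Python's standard xml.etree.ElementTree (an expat-based, non-validating parser). The sample documents and the helper name read_text are illustrative assumptions, not files or tools from the list above.

```python
import xml.etree.ElementTree as ET

# Illustrative samples (assumed for this sketch). In XML 1.0 both the
# XML declaration and the DOCTYPE are optional for a well-formed document.
SAMPLES = {
    "bare": "<greeting>Hello World</greeting>",
    "declaration_only": '<?xml version="1.0"?><greeting>Hello World</greeting>',
    # General entity declared in the internal subset; a non-validating
    # parser is still expected to expand it rather than skip it.
    "internal_entity": (
        '<!DOCTYPE greeting [<!ENTITY who "World">]>'
        "<greeting>Hello &who;</greeting>"
    ),
}


def read_text(document: str) -> str:
    """Parse a document without validation and return the root element's text."""
    root = ET.fromstring(document)
    return root.text


for name, doc in SAMPLES.items():
    print(name, "->", read_text(doc))
```

All three samples should parse and yield the same text; a tool that rejects any of them is presumably validating by default or demanding more than well-formedness.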

As an example, I believe that it is likely that many tools when pointed at:
will fail. 

I expect that by tweaking some of the tools with commandline switches I
might be able to alter their behaviour, but I am slightly surprised that
some tools will only read validatable files. For example, the file

<greeting>Hello World</greeting>

is often not readable unless "edited" to:

<?xml version="1.0"?>
<!DOCTYPE greeting [
<!ELEMENT greeting ANY>
]>
<greeting>Hello World</greeting>

Is there a definitive resource anywhere which explicitly states what
behaviour can be expected from various types of parsers? I know it is
inferable from the spec, but I suspect that not all implementers have taken
identical interpretations. I would ideally like to have a matrix of parsers
against standard "correct" [not always "valid"] documents and see how many

Henry and I are obviously keen to show that XML is simple to use with the
correct tools and that interoperability is achievable. 
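
The matrix of parsers against "correct" documents described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the three parsers are the ones shipped in Python's standard library (all built on expat, so their results will agree far more than independent implementations would), and the document names and contents are hypothetical test cases, not the course material.

```python
import xml.dom.minidom
import xml.etree.ElementTree as ET
import xml.sax
from xml.parsers.expat import ExpatError

# Hypothetical test documents: three well-formed, one deliberately broken.
DOCUMENTS = {
    "bare": b"<greeting>Hello World</greeting>",
    "with_declaration": b'<?xml version="1.0"?><greeting>Hello World</greeting>',
    "with_internal_dtd": (
        b'<?xml version="1.0"?>'
        b"<!DOCTYPE greeting [<!ELEMENT greeting ANY>]>"
        b"<greeting>Hello World</greeting>"
    ),
    "not_well_formed": b"<greeting>Hello World",
}


def parse_with_sax(data: bytes) -> None:
    # A do-nothing handler is enough to check that the document can be read.
    xml.sax.parseString(data, xml.sax.ContentHandler())


PARSERS = {
    "ElementTree": ET.fromstring,
    "minidom": xml.dom.minidom.parseString,
    "sax": parse_with_sax,
}


def build_matrix() -> dict:
    """Return {(parser, document): "ok" | "error"} for every combination."""
    matrix = {}
    for parser_name, parse in PARSERS.items():
        for doc_name, data in DOCUMENTS.items():
            try:
                parse(data)
                matrix[(parser_name, doc_name)] = "ok"
            except (ET.ParseError, ExpatError, xml.sax.SAXException):
                matrix[(parser_name, doc_name)] = "error"
    return matrix


if __name__ == "__main__":
    results = build_matrix()
    for doc_name in DOCUMENTS:
        row = "  ".join(f"{p}={results[(p, doc_name)]}" for p in PARSERS)
        print(f"{doc_name:18} {row}")
```

A useful version of such a matrix would add independently written parsers and the larger documents mentioned earlier; the structure (one row per document, one column per parser) would stay the same.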



xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ or CD-ROM/ISBN 981-02-3594-1
Please note: New list subscriptions now closed in preparation for transfer to OASIS.

