   Parser Behaviour (serious)

  • From: Peter Murray-Rust <peter@ursus.demon.co.uk>
  • To: xml-dev@xml.org
  • Date: Sun, 02 Apr 2000 10:22:51 +0100

<Note>This message is serious<smiley/></Note>

I have been preparing a large amount of XHTML (for our VirtualXML activity)
and using Dave Raggett's excellent tidy program (with option -asxml) to
produce XHTML files of the sort:

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
<html xmlns="http://www.w3.org/1999/xhtml">
<title>Test page</title>
<p>A test</p>

These files work fine as HTML, and are conforming XML 1.0, but when I try
to parse them on my laptop using either AElfred or Xerces I get:

java.net.UnknownHostException: www.w3.org

What's wrong? Ah! The parser is trying to resolve the URL for the DTD and
since I'm offline (connections cost money over here) it can't. So the file
I have created can only be processed as XML if:
	(a) I am connected online
	(b) the W3C maintain *** for all time *** a means of dereferencing either
the FPI or the URL

I can't believe this is what the community wants. It fooled me, and I've
been working with XML for some time.

I still believe that undefined parser behaviour is going to be a major
deterrent to many people who want to take up XML. I have posted on this
before. I am going to keep on about it. The most common reaction I seem to
have so far is "Well that's how XML behaves - it's *your* problem to decide
how to process XML". This isn't good enough. In the current case I simply
want to switch off the parser's attempt to resolve the DTD. I would
appreciate something like:

	"Parser failed to resolve external SYSTEM identifier in DOCTYPE:
	  To disable DTD look-up use -nosysid option"
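
Until parsers offer such a switch, one workaround is to hand the parser an
empty stand-in for the DTD. What follows is only a minimal sketch, assuming a
SAX parser obtained through JAXP (my assumption; it is not one of the set-ups
described above). The class name OfflineParse, the method countElements and
the example.invalid system identifier are hypothetical, chosen purely for
illustration. The resolver answers every external-entity request with an
empty stream, so the parser never goes to the network:

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, for illustration only.
public class OfflineParse {

    /** Parses xml without fetching any external DTD; returns the number of elements seen. */
    public static int countElements(String xml)
            throws ParserConfigurationException, SAXException, IOException {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();

        // Answer every external-entity request (including the external
        // subset named in the DOCTYPE) with an empty stream, so that
        // nothing is dereferenced over the network.
        reader.setEntityResolver(
                (publicId, systemId) -> new InputSource(new StringReader("")));

        final int[] count = {0};
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                count[0]++;
            }
        });

        reader.parse(new InputSource(new StringReader(xml)));
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        // The system identifier below is a placeholder, not the real W3C URL.
        String xml = "<?xml version=\"1.0\"?>\n"
                + "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\"\n"
                + "  \"http://example.invalid/strict.dtd\">\n"
                + "<html xmlns=\"http://www.w3.org/1999/xhtml\">"
                + "<head><title>Test page</title></head>"
                + "<body><p>A test</p></body></html>";
        System.out.println(countElements(xml) + " elements parsed offline");
    }
}
```

The cost of this trick is that nothing declared in the DTD is available: a
document that uses an entity such as &nbsp; will no longer parse, and
attribute defaults from the DTD are not applied. It avoids the network
failure, but it is no substitute for specified parser behaviour.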

So, for about the third time (and it took 3 times to get SAX1.0 off the
ground), what are we going to do about specifying parser behaviour? I have
shown in public how the failure to process external entities breaks systems.

Until we resolve this question (and probably several others), XML 1.0 is
broken as an interoperable "standard".


[No criticism is aimed at Dave Raggett, who has written a splendid tool, or
at the W3C, who actually have a real DTD mounted at the URL mentioned. Nor is
any aimed at the authors of the parsers, who have done their best to provide
a default behaviour and, in the absence of any guidance, have (very
reasonably) required their parsers to access an external DTD.]


