
   Parser Behaviour (serious)

  • From: Peter Murray-Rust <peter@ursus.demon.co.uk>
  • To: xml-dev@xml.org
  • Date: Sun, 02 Apr 2000 10:22:51 +0100

<Note>This message is serious<smiley/></Note>

I have been preparing a large amount of XHTML (for our VirtualXML activity)
and using Dave Raggett's excellent tidy program (with option -asxml) to
produce XHTML files of the sort:

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Test page</title>
</head>
<body>
<p>A test</p>
</body>
</html>

These files work fine as HTML, and are conforming XML 1.0, but when I try
to parse them on my laptop using either AElfred or Xerces I get:

java.net.UnknownHostException: www.w3.org

What's wrong? Ah! The parser is trying to resolve the URL for the DTD and
since I'm offline (connections cost money over here) it can't. So the file
I have created can only be processed as XML if:
	(a) I am online, and
	(b) the W3C maintain *** for all time *** a means of dereferencing either
the FPI or the URL.

I can't believe this is what the community wants. It fooled me, and I've
been working with XML for some time.

I still believe that undefined parser behaviour is going to be a major
deterrent to many people who want to take up XML. I have posted on this
before. I am going to keep on about it. The most common reaction I seem to
have so far is "Well that's how XML behaves - it's *your* problem to decide
how to process XML". This isn't good enough. In the current case I simply
want to switch off the parser's attempt to resolve the DTD. I would
appreciate something like:

	"Parser failed to resolve external SYSTEM identifier in DOCTYPE:
http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd
	  To disable DTD look-up use -nosysid option"
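
A minimal sketch of one possible workaround, in Java, for readers who want
the offline behaviour described above. It assumes a SAX2 parser obtained
through JAXP, neither of which is named in this message, and the class and
method names (OfflineParse, parseOffline) are hypothetical. The idea is to
install an EntityResolver that answers every external lookup with an empty
stream, so the parser never tries to reach www.w3.org:

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: parse a document without fetching its external DTD.
final class OfflineParse {

    /** Parses xml and returns the local names of its elements, space-separated. */
    static String parseOffline(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setValidating(false); // well-formedness checking only

        XMLReader reader = factory.newSAXParser().getXMLReader();

        // Answer every external entity request (including the DOCTYPE's
        // SYSTEM identifier) with an empty document instead of a network fetch.
        reader.setEntityResolver(
            (publicId, systemId) -> new InputSource(new StringReader("")));

        // Record element names so the caller can see the parse succeeded.
        final StringBuilder seen = new StringBuilder();
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                seen.append(localName).append(' ');
            }
        });

        reader.parse(new InputSource(new StringReader(xml)));
        return seen.toString().trim();
    }
}
```

The cost of this approach is that nothing declared in the DTD is available:
defaulted attributes are not supplied, and a document that uses an entity
declared only in the DTD (for example &nbsp; in XHTML) may no longer parse.
A resolver that maps the FPI to a local copy of the DTD would avoid that.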

So, for about the third time (and it took 3 times to get SAX1.0 off the
ground), what are we going to do about specifying parser behaviour? I have
shown in public how the failure to process external entities breaks systems. 

Until we resolve this question (and probably several others), XML 1.0 is
broken as an interoperable "standard".

	P.

[No criticism is aimed at Dave Raggett, who has written a splendid tool, or
the W3C, who actually have a real DTD mounted at the URL mentioned. Nor at
the authors of the parsers, who have done their best to provide a default
behaviour, and in the absence of any guidance have required their parsers
to access an external DTD (very reasonably).]




***************************************************************************
This is xml-dev, the mailing list for XML developers.
To unsubscribe, mailto:majordomo@xml.org&BODY=unsubscribe%20xml-dev
List archives are available at http://xml.org/archives/xml-dev/
***************************************************************************




 
