- From: Rick JELLIFFE <ricko@geotempo.com>
- Date: Thu, 19 Oct 2000 03:51:11 +0800
Sean McGrath wrote:
> 2) HTML is not now, and never was, an SGML application. Ever try
> feeding a general entity reference to a web browser? Ever try getting
> a html editor to create a CDATA section? HTML browsers skip
> over tags they don't recognise. Where in the SGML standard does
> it say that you can do that?
SGML allows a default ENTITY value. This could reference a PI which
reproduces the text of the entity reference. Or document an
error-recovery strategy in an additional requirements document referenced
from the SEEALSO parameter.
If you want marked section delimiters to be ignored, map the <![
delimiter (MSS) to something else.
But why would anyone want to? One of the points of SGML is to allow
rigorous description of a language so that generic tools can be used on
archived data.
But after several years of Tidy, Dave Raggett has still not captured the
syntax of HTML as actually practised. The fact is that there are many
HTMLs, and some of these are more amenable to being treated as SGML than
others.
> 3) XML is an SGML application only after you change SGML a bit.
> Many - not all - but many SGML tools that predate this, um, adaption
> of the SGML standard will not process XML correctly. (Most software
> based on James Clark's awesome SP engine will. Most of the rest
> won't.)
Not every SGML tool has to be able to process every SGML document.
ISO 8879 does not mandate any error-recovery policy. Most parsers were
Draconian only because that was their default behaviour: you need to
tailor them to the particular DTD or syntax to get nice error-recovery.
I have always thought that HTML was as much an error-recovery strategy
as a DTD. (And if that is true, then XHTML really misses the point!)
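To make the distinction concrete, here is a minimal Python sketch of the
same input handled under two error-recovery policies. It is only an
illustration, not anything taken from ISO 8879 or from a real browser:
KNOWN_TAGS, UnknownTagError, Recoverer and process are names invented for
this example, and the three-element tag set stands in for a real DTD.

```python
from html.parser import HTMLParser

# Hypothetical, tiny stand-in for the element types a DTD would declare.
KNOWN_TAGS = {"p", "b", "i"}


class UnknownTagError(Exception):
    """Raised under the Draconian policy when an undeclared tag is seen."""


class Recoverer(HTMLParser):
    """Rebuilds the input, either rejecting or skipping unknown tags."""

    def __init__(self, draconian):
        super().__init__(convert_charrefs=True)
        self.draconian = draconian
        self.out = []

    def _known(self, tag):
        if tag in KNOWN_TAGS:
            return True
        if self.draconian:
            raise UnknownTagError(tag)
        return False  # lenient policy: drop the tag, keep its content

    def handle_starttag(self, tag, attrs):
        # Attributes are ignored to keep the sketch short.
        if self._known(tag):
            self.out.append("<%s>" % tag)

    def handle_endtag(self, tag):
        if self._known(tag):
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        self.out.append(data)


def process(text, draconian=False):
    """Return the recovered markup, or raise if the policy is Draconian."""
    parser = Recoverer(draconian)
    parser.feed(text)
    parser.close()
    return "".join(parser.out)
```

Calling process("<p>Hello <blink>world</blink></p>") keeps the text and
drops the unknown tags, as a browser would; the same call with
draconian=True raises UnknownTagError instead. The parser is identical in
both cases, only the recovery policy differs.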
> Are XML and HTML proper subsets of SGML
> in any meaningful sense?
>
> Not in this universe.
Not very convincing.
XML is designed to be a proper subset of WebSGML; that it defines things
in addition to SGML is irrelevant. One can make WebSGML documents that
are HTML, and one can describe the extra features of most HTMLs in terms
of error-recovery or additional requirements using the mechanisms
WebSGML provides. Sean is right that HTML is not SGML, but that is
because HTML as practised is a class of languages, not a single
language, rather than because no HTML is SGML.
Rick Jelliffe