   Re: HTML2_X.DTD

  • From: Joe English <jenglish@crl.com>
  • To: xml-dev@ic.ac.uk
  • Date: Tue, 17 Jun 1997 12:27:25 -0700

Richard Light <richard@light.demon.co.uk> wrote:

> I'm probably not the only person to have done this, but I had a go at
> XML-izing the HTML 2.0 DTD. [...] 
> However, two issues that remain are the use of '&' in the content model
> for <HEAD>, and the liberal use of inclusion and exclusion exceptions.
> Both are invalid in XML, and neither can be trivially re-mapped to an
> XML-compliant equivalent.  Is anyone else interested in this sort of
> issue?  Any thoughts on how these problems should be addressed?  

For the HEAD content model:

	(TITLE & ISINDEX? & BASE?) +(META|LINK)

you can get rid of the inclusion exceptions by changing this to:

	( (meta|link)*, 
		(   (TITLE, 	(meta|link)*)
		  & (ISINDEX,	(meta|link)*)?
		  & (BASE,	(meta|link)*)?  ) )

then use the standard transformation on AND groups to get:

    <!ENTITY % head.misc "(META|LINK)*" >
    <!ENTITY % title 	"(TITLE, %head.misc;)">
    <!ENTITY % isindex	"(ISINDEX, %head.misc;)">
    <!ENTITY % base 	"(BASE, %head.misc;)">

    <!ELEMENT HEAD
	( %head.misc;,
	  (   (%title;,  (  (%isindex; , (%base;)?)
			  | (%base;    , (%isindex;)?))?)
	    | (%isindex;,(  (%title;   , (%base;)?)
			  | (%base;    , %title;)))
	    | (%base;,   (  (%title;   , (%isindex;)?)
			  | (%isindex; , %title;))) ) )   >
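
As a sanity check on the expansion above, here is a minimal Python sketch
that compares the expanded model against the AND-group rule directly. The
one-letter encoding (T, I, B for TITLE, ISINDEX, BASE; M and L for META
and LINK) and all helper names are assumptions made for illustration; they
are not part of any DTD or parser.

```python
import re
from itertools import product

# One letter per element: T=TITLE, I=ISINDEX, B=BASE, M=META, L=LINK.
MISC = "[ML]*"            # %head.misc;  i.e. (META|LINK)*
TITLE = f"T{MISC}"        # %title;
ISINDEX = f"I{MISC}"      # %isindex;
BASE = f"B{MISC}"         # %base;


def opt(x):
    """Optional group, like (x)? in a content model."""
    return f"(?:{x})?"


def alt(*xs):
    """Alternation group, like (a | b | c) in a content model."""
    return "(?:" + "|".join(xs) + ")"


# The expanded (AND-free) content model, transcribed branch by branch.
EXPANDED = re.compile(
    MISC + alt(
        TITLE + opt(alt(ISINDEX + opt(BASE), BASE + opt(ISINDEX))),
        ISINDEX + alt(TITLE + opt(BASE), BASE + TITLE),
        BASE + alt(TITLE + opt(ISINDEX), ISINDEX + TITLE),
    )
)


def and_group_accepts(s):
    """The AND-group rule: exactly one TITLE, at most one ISINDEX and
    one BASE, in any order, with META/LINK allowed anywhere."""
    return s.count("T") == 1 and s.count("I") <= 1 and s.count("B") <= 1


def models_agree(max_len=6):
    """Compare both formulations on every string up to max_len letters."""
    for n in range(max_len + 1):
        for letters in product("TIBML", repeat=n):
            s = "".join(letters)
            if bool(EXPANDED.fullmatch(s)) != and_group_accepts(s):
                return False
    return True
```

models_agree() returns True when the two formulations accept the same
strings up to the given length. It is an exhaustive check over short
inputs, not a proof.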

(A question of my own: Why does SP complain about e.g., "%base;?"
but not "(%base;)?"  I can't find the reason for this in the Standard.)

Addition of NEXTID, SCRIPT, and STYLE is left as an exercise for
the reader (GAAAH!).
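
To give a rough idea of why that exercise is painful, the sketch below
counts the distinct element orderings an AND-free expansion has to cover.
It assumes, purely for illustration, that NEXTID, SCRIPT, and STYLE would
each be optional members of the AND group alongside ISINDEX and BASE; the
function name is hypothetical.

```python
from math import comb, factorial


def orderings(required, optional):
    """Number of distinct sequences an AND group permits, given how many
    members are required and how many are optional (META/LINK ignored).

    For each count j of optional members actually present, there are
    comb(optional, j) ways to choose them and (required + j)! ways to
    order everything that is present.
    """
    return sum(
        comb(optional, j) * factorial(required + j)
        for j in range(optional + 1)
    )
```

With one required member (TITLE) and two optional ones, orderings(1, 2)
gives 11, which matches the eleven sequences covered by the expanded
model. Under the assumption above, orderings(1, 5) gives 1631.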

Or, more sensibly, you can follow Naggum's First Law of AND groups:
If the order doesn't matter, you might as well pick one and stick
with it:


In this case the order does matter to some degree, since there 
are metadata schemes which require groups of METAs and LINKs
to appear in a certain order, so this is probably better:


This is stricter than HTML 2, but most HTML will need to be
modified anyway to be XMLized.

Inclusion and exclusion exceptions have to be treated on a
case-by-case basis.  The exclusion exceptions in HTML 2.0 are
used primarily to limit recursion (e.g., to make sure that an
"A" element can't appear inside another "A"), and in some cases
to undo the effects of inclusion exceptions (e.g., on TITLE and
SELECT to undo the inclusions on HEAD and FORM, respectively).

For the FORM elements you should do what HTML 3.2 does: Instead of
making (INPUT|SELECT|TEXTAREA) inclusions on the FORM element and
then excluding them from SELECT and TEXTAREA, just add them to the '%text;'
parameter entity so they can appear anywhere in content.  (That they 
must appear inside a FORM element is still enforced, but as an 
application convention rather than by the DTD).

Once the inclusions are taken care of, all the exclusions can be 
safely removed, since this yields a less restrictive DTD.

--Joe English


xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo@ic.ac.uk the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa@ic.ac.uk)

