XML.org
OASIS Mailing List Archives

Re: [xml-dev] SGML default attributes.

SGML requires the use of a DTD--there was no notion of a "default" DTD.
This requirement was, I'll argue, the result of a fundamental conceptual
mistake--understandable at the time but a mistake nevertheless.

The conceptual mistake that SGML made was conflating the notion of an
abstract "document type" with the grammar definition for (partially)
validating documents against that document type. That is, SGML saw the DTD
as being equal to the definition of the "document type" as an abstraction.
But of course that is nonsense. There was (and remains today) the misguided
notion that a reference to an external DTD subset somehow told you
something actionable about the document you had. But of course it tells
you nothing reliable because the document could define its "real" DTD in
the internal subset or the local environment could put whatever it wants
at the end of the public ID the document is referencing.

Consider this SGML document:

<!DOCTYPE notdocbook PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" [
  <!ELEMENT notdocbook ANY >
  <!ELEMENT bogus ANY >
]>
<notdocbook>
  <bogus><para>This is not a DocBook document</para></bogus>
</notdocbook>

This document will be taken as a DocBook document by any tool that thinks
the public ID means something. But obviously it is not a DocBook document.
It is, however, 100% DTD valid. QED: DTDs are useless as tools of document
type definition. The only reason the SGML (and now XML) world didn't
collapse under this fact is that the vast majority of SGML and XML
authoring and management tools simply refused to preserve internal subsets
(going back to the discussion about DynaBase's problems with entity
preservation).
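
A minimal sketch of the point above, assuming Python's standard-library
expat binding and an XML rendering of the example. The system identifier
"docbookx.dtd" is a placeholder added for illustration, because XML
(unlike SGML) requires one after a public ID; the helper name inspect() is
likewise hypothetical. The parser reports the DocBook public ID, yet the
elements it actually sees are whatever the internal subset and the
instance say they are:

```python
from xml.parsers import expat

# XML rendering of the example document. "docbookx.dtd" is a placeholder
# system identifier (an assumption); expat does not fetch it by default.
DOC = """<!DOCTYPE notdocbook PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "docbookx.dtd" [
  <!ELEMENT notdocbook ANY >
  <!ELEMENT bogus ANY >
]>
<notdocbook>
  <bogus><para>This is not a DocBook document</para></bogus>
</notdocbook>
"""


def inspect(doc):
    """Return (public_id, has_internal_subset, element_names) for doc."""
    seen = {"public_id": None, "internal": False, "elements": []}

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Called once for the DOCTYPE declaration.
        seen["public_id"] = public_id
        seen["internal"] = bool(has_internal_subset)

    def on_start(name, attrs):
        # Called for every start tag in the instance.
        seen["elements"].append(name)

    parser = expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.StartElementHandler = on_start
    parser.Parse(doc, True)
    return seen["public_id"], seen["internal"], seen["elements"]


public_id, internal, elements = inspect(DOC)
print(public_id)   # what the document claims to be
print(internal)    # whether an internal subset is present
print(elements)    # what the document actually contains
```

A tool that dispatches on public_id alone would treat this as DocBook even
though its root element is notdocbook and an internal subset is present.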

Standoff grammars like XSD and RELAX NG at least avoid the problem of
internal DTD subsets but they still fail to serve as reliable definitions
of document types in the abstract because they are still only defining the
grammar rules for a subset of all possible conforming documents in a
document type.

Because of features like tag omission, inclusion exceptions, and short
references, it was simply impossible to parse an SGML document without
having both its DTD and its SGML declaration (which defined the lexical
syntax details). There is a default SGML declaration, but not a default
DTD. 

A lot of what we did in XML was remove this dependency by having a fixed
syntax and removing all markup minimization except attribute defaults.
Fortunately, both XSD and RELAX NG provide alternatives to DTDs for
getting default attribute values.
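
To make the attribute-default case concrete, here is a small sketch
assuming Python's standard-library ElementTree (which uses expat
underneath). The memo element and its status attribute are invented for
illustration. The instance never writes status, but the ATTLIST
declaration in the internal subset supplies it, so what the parser
reports depends on the DTD:

```python
import xml.etree.ElementTree as ET

# Hypothetical document: "memo" and "status" are made-up names.
# The start tag carries no attributes; the default comes from the ATTLIST.
DOC = """<!DOCTYPE memo [
  <!ELEMENT memo (#PCDATA)>
  <!ATTLIST memo status CDATA "draft">
]>
<memo>Remember the DTD.</memo>
"""

root = ET.fromstring(DOC)
status = root.get("status")  # supplied by the internal subset, not the tag
print(status)
```

Remove the ATTLIST line and root.get("status") returns None, which is the
sense in which attribute defaults are still markup minimization.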

Cheers,

Eliot
----
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com




On 5/4/16, 6:16 AM, "Norman Gray" <norman@astro.gla.ac.uk> wrote:

>
>Greetings.
>
>(catching up ...)
>
>On 29 Apr 2016, at 17:58, John Cowan wrote:
>
>> On Fri, Apr 29, 2016 at 8:54 AM, Norman Gray <norman@astro.gla.ac.uk>
>> wrote:
>>
>>> In the XML world, the DTD is just for validation
>>
>> That turns out not to be the case.  There are a number of XML DTD
>> features which affect the infoset returned by a compliant parser.  If
>> they are in the internal subset, the parser MUST respect them;
>
>I stand corrected; I was sloppy.  I think this doesn't change my
>original point, however, which was that in SGML the DTD was integral to
>the document, and to the parse of the document, and that it's easy to
>forget this after one has got used to two decades of XML[1].  I can't
>remember if there was a trivial or default DTD which was assumed in the
>absence of a declared one, in the same way that there was a default SGML
>Declaration, but taking advantage of that would probably have been
>regarded as a curiosity, rather than normal practice.
>
>In XML, in contrast, the DTD has a more auxiliary role, and at a first
>conceptual look, that role is validation (even though -- footnote! -- it
>may change other things about the parse as well).  Thus _omitting_ an
>XML DTD (or XSchema) is neither perverse nor curious.
>
>Practical aspect: When I'm writing XML, I use a DTD (in whatever syntax)
>to help Emacs tell me if the document is valid, but I don't even know
>whether the XML parsers I use are capable of using a DTD external
>subset.  That careless ignorance would be impossible with SGML.
>
>The rational extension of that attitude, of course, is MicroXML, which
>(as you of course know) doesn't use any external resources at all, and
>doesn't care about validation.
>
>Best wishes,
>
>Norman
>
>
>[1] Hang on, _two_ decades?!  I've just checked and ... 1996 doesn't
>seem that long ago.
>
>
>-- 
>Norman Gray  :  https://nxg.me.uk
>SUPA School of Physics and Astronomy, University of Glasgow, UK
>





Copyright 1993-2007 XML.org. This site is hosted by OASIS