Re: [xml-dev] SGML default attributes.
SGML requires the use of a DTD--there was no notion of a "default" DTD.
This requirement was, I'll argue, the result of a fundamental conceptual
mistake--understandable at the time but a mistake nevertheless.
The conceptual mistake that SGML made was conflating the notion of an
abstract "document type" with the grammar definition for (partially)
validating documents against that document type. That is, SGML saw the DTD
as being equal to the definition of the "document type" as an abstraction.
But of course that is nonsense. There was (and remains today) the misguided
notion that a reference to an external DTD subset somehow told you
something actionable about the document you had. But of course it tells
you nothing reliable because the document could define its "real" DTD in
the internal subset or the local environment could put whatever it wants
at the end of the public ID the document is referencing.
Consider this SGML document:
<!DOCTYPE notdocbook PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" [
<!ELEMENT notdocbook ANY >
<!ELEMENT bogus ANY >
]>
<notdocbook>
<bogus><para>This is not a DocBook document</para></bogus>
</notdocbook>
This document will be taken as a DocBook document by any tool that thinks
the public ID means something. But obviously it is not a DocBook document.
It is, however, 100% DTD valid. QED: DTDs are useless as tools of document
type definition. The only reason the SGML (and now XML) world didn't
collapse under this fact is that the vast majority of SGML and XML
authoring and management tools simply refused to preserve internal subsets
(going back to the discussion about DynaBase's problems with entity
preservation).
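
To make the mismatch concrete, here is a minimal Python sketch using the
standard library's expat binding. It assumes an XML rendering of the
document above with a system identifier added (XML requires one after a
public identifier; the URL is the usual DocBook 4.5 location and is never
fetched here), and the inspect() helper is a name invented for this
illustration. It only reports what the prolog claims against what the
internal subset declares; it does not validate anything.

```python
import xml.parsers.expat

# Assumption: the system identifier below is added for XML well-formedness;
# it is not part of the original SGML example and is never retrieved.
DOC = """<!DOCTYPE notdocbook PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
  "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
<!ELEMENT notdocbook ANY >
<!ELEMENT bogus ANY >
]>
<notdocbook>
<bogus><para>This is not a DocBook document</para></bogus>
</notdocbook>
"""


def inspect(text):
    """Hypothetical helper: collect the public ID, the element types
    declared in the internal subset, and the actual root element."""
    info = {"public_id": None, "declared": [], "root": None}
    parser = xml.parsers.expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        info["public_id"] = public_id

    def on_element_decl(name, model):
        # Called once per <!ELEMENT ...> declaration the parser reads.
        info["declared"].append(name)

    def on_start(name, attrs):
        # Only the first start tag is the document element.
        if info["root"] is None:
            info["root"] = name

    parser.StartDoctypeDeclHandler = on_doctype
    parser.ElementDeclHandler = on_element_decl
    parser.StartElementHandler = on_start
    parser.Parse(text, True)
    return info


info = inspect(DOC)
print(info)
```

The public ID names DocBook, while the declarations that actually govern
the document element come from the internal subset, which is the point
being made above.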
Standoff grammars like XSD and RELAX NG at least avoid the problem of
internal DTD subsets but they still fail to serve as reliable definitions
of document types in abstract because they are still only defining the
grammar rules for a subset of all possible conforming documents in a
document type.
Because of features like tag omission, inclusion exceptions, and short
references, it was simply impossible to parse an SGML document without
having both its DTD and its SGML declaration (which defined the lexical
syntax details). There is a default SGML declaration, but not a default
DTD.
A lot of what we did in XML was remove this dependency by having a fixed
syntax and removing all markup minimization except one feature, attribute
defaults.
Fortunately, both XSD and RELAX NG provide alternatives to DTDs for
getting default attribute values.
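
As a small illustration of that retained feature, the following Python
sketch (standard-library ElementTree; the "doc" element and "status"
attribute are made-up names) parses the same empty element with and
without an ATTLIST default declared in the internal subset. It assumes the
usual behavior of expat-based parsers, which apply internal-subset
defaults without any validation step.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the internal subset supplies a default for
# an attribute that the instance never writes out.
WITH_DTD = """<!DOCTYPE doc [
<!ATTLIST doc status CDATA "draft">
]>
<doc/>"""

# The same instance with no DTD at all.
WITHOUT_DTD = "<doc/>"

with_default = ET.fromstring(WITH_DTD).attrib
without_default = ET.fromstring(WITHOUT_DTD).attrib

print(with_default)
print(without_default)
```

The two parses report different attribute sets for identical instance
markup, so the DTD here is doing more than validation.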
Cheers,
Eliot
----
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com
On 5/4/16, 6:16 AM, "Norman Gray" <norman@astro.gla.ac.uk> wrote:
Greetings.
(catching up ...)
On 29 Apr 2016, at 17:58, John Cowan wrote:
On Fri, Apr 29, 2016 at 8:54 AM, Norman Gray <norman@astro.gla.ac.uk>
wrote:
In the XML world, the DTD is just for validation
That turns out not to be the case. There are a number of XML DTD features
which affect the infoset returned by a compliant parser. If they are in
the internal subset, the parser MUST respect them;
I stand corrected; I was sloppy. I think this doesn't change my
original point, however, which was that in SGML the DTD was integral to
the document, and to the parse of the document, and that it's easy to
forget this after one has got used to two decades of XML[1]. I can't
remember if there was a trivial or default DTD which was assumed in the
absence of a declared one, in the same way that there was a default SGML
Declaration, but taking advantage of that would probably have been
regarded as a curiosity, rather than normal practice.
In XML, in contrast, the DTD has a more auxiliary role, and at a first
conceptual look, that role is validation (even though -- footnote! -- it
may change other things about the parse as well). Thus _omitting_ an
XML DTD (or XSchema) is neither perverse nor curious.
Practical aspect: When I'm writing XML, I use a DTD (in whatever syntax)
to help Emacs tell me if the document is valid, but I don't even know
whether the XML parsers I use are capable of using a DTD external
subset. That careless ignorance would be impossible with SGML.
The rational extension of that attitude, of course, is MicroXML, which
(as you of course know) doesn't use any external resources at all, and
doesn't care about validation.
Best wishes,
Norman
[1] Hang on, _two_ decades?! I've just checked and ... 1996 doesn't
seem that long ago.
--
Norman Gray : https://nxg.me.uk
SUPA School of Physics and Astronomy, University of Glasgow, UK
_______________________________________________________________________
XML-DEV is a publicly archived, unmoderated list hosted by OASIS
to support XML implementation and development. To minimize
spam in the archives, you must subscribe before posting.
[Un]Subscribe/change address: http://www.oasis-open.org/mlmanage/
Or unsubscribe: xml-dev-unsubscribe@lists.xml.org
subscribe: xml-dev-subscribe@lists.xml.org
List archive: http://lists.xml.org/archives/xml-dev/
List Guidelines: http://www.oasis-open.org/maillists/guidelines.php
_______________________________________________________________________