Re: [xml-dev] SGML default attributes.

Eliot,

In order to avoid potential misunderstandings, I think it might be worth clarifying your position on the following points:

(1) Resolved: the whole idea of entity identity was a mistake, is worthless, and is evil.

(2) Resolved: the whole idea of document type identity was a mistake, is worthless, and is evil.

I have deliberately made these statements extreme and obviously silly in order to dramatize the fact that, even though there are problems with SGML's and/or XML's operational approaches to them, we cannot discard these ideas altogether. The ideas themselves remain profound and necessary. They will always be needed. The usefulness of their various operational prostheses will always be limited to certain cultural contexts. Even within their specific contexts, those prostheses will always be imperfect. They will always require occasional repair and replacement, in order that they remain available for use even as that context's notions of "entity", "document", and "identity" continue to evolve and diversify.

The operational prostheses with which these ideas were fitted at SGML's birth are things of their time. That was then, this is now, and "time makes ancient good uncouth". Their goodness in their earlier context is a matter of record; they were used, a lot, for a lot of reasons and in a lot of ways. At the time, it was not stupid or evil to make the notion of document type identity depend on the notion of entity identity, nor was it stupid or evil to make the notion of entity identity dependent on PUBLIC identifiers. And in many ways, it still isn't. What is your proposed alternative, and why is it better?

Steve

On 05/04/2016 11:23 AM, Eliot Kimber wrote:
SGML requires the use of a DTD--there was no notion of a "default" DTD.
This requirement was, I'll argue, the result of a fundamental conceptual
mistake--understandable at the time but a mistake nevertheless.

The conceptual mistake that SGML made was conflating the notion of an
abstract "document type" with the grammar definition for (partially)
validating documents against that document type. That is, SGML saw the DTD
as being equal to the definition of the "document type" as an abstraction.
But of course that is nonsense. There was (remains today) the misguided
notion that a reference to an external DTD subset somehow told you
something actionable about the document you had. But of course it tells
you nothing reliable because the document could define its "real" DTD in
the internal subset or the local environment could put whatever it wants
at the end of the public ID the document is referencing.

Consider this SGML document:

<!DOCTYPE notdocbook PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" [
   <!ELEMENT notdocbook ANY >
   <!ELEMENT bogus ANY >
]>
<notdocbook>
   <bogus><para>This is not a DocBook document</para></bogus>
</notdocbook>

This document will be taken as a DocBook document by any tool that thinks
the public ID means something. But obviously it is not a DocBook document.
It is, however, 100% DTD valid. QED DTDs are useless as tools of document
type definition. The only reason the SGML (and now XML) world didn't
collapse under this fact is that the vast majority of SGML and XML
authoring and management tools simply refused to preserve internal subsets
(going back to the discussion about DynaBase's problems with entity
preservation).
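
As a rough illustration of the point above, here is a minimal Python sketch using the standard library's non-validating expat parser. It is an XML adaptation of the sample document, not the SGML original: the system identifier is an addition (XML requires one after a public ID), and the helper name `inspect` is hypothetical. The sketch records what the DOCTYPE claims and what the instance actually contains.

```python
import xml.parsers.expat

# XML adaptation of the sample above. The system identifier is an addition
# for illustration; expat does not fetch it unless an external entity
# handler is installed.
DOC = """<!DOCTYPE notdocbook PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
  "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
   <!ELEMENT notdocbook ANY >
   <!ELEMENT bogus ANY >
]>
<notdocbook>
   <bogus><para>This is not a DocBook document</para></bogus>
</notdocbook>
"""


def inspect(doc):
    """Collect the DOCTYPE's claims and the element names actually used."""
    info = {"declared": [], "elements": []}
    parser = xml.parsers.expat.ParserCreate()

    def start_doctype(name, system_id, public_id, has_internal_subset):
        info["doctype_name"] = name
        info["public_id"] = public_id
        info["has_internal_subset"] = bool(has_internal_subset)

    def element_decl(name, model):
        # Called for each <!ELEMENT ...> the parser reads (here, only the
        # internal subset, since the external subset is not loaded).
        info["declared"].append(name)

    def start_element(name, attrs):
        info["elements"].append(name)

    parser.StartDoctypeDeclHandler = start_doctype
    parser.ElementDeclHandler = element_decl
    parser.StartElementHandler = start_element
    parser.Parse(doc, True)
    return info


if __name__ == "__main__":
    for key, value in inspect(DOC).items():
        print(key, "=", value)
```

A tool that dispatches on the public ID alone would treat this as DocBook, while the root element name and the declarations in the internal subset say otherwise.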

Standoff grammars like XSD and RELAX NG at least avoid the problem of
internal DTD subsets but they still fail to serve as reliable definitions
of document types in abstract because they are still only defining the
grammar rules for a subset of all possible conforming documents in a
document type.

Because of features like tag omission, inclusion exceptions, and short
references, it was simply impossible to parse an SGML document without
having both its DTD and its SGML declaration (which defined the lexical
syntax details). There is a default SGML declaration, but not a default
DTD.

A lot of what we did in XML was remove this dependency by having a fixed
syntax and removing all markup minimization except attribute defaults.

XML does retain one markup minimization feature, attribute defaults.
Fortunately, both XSD and RELAX NG provide alternatives to DTDs for
getting default attribute values.
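
To make the attribute-default feature concrete, here is a small Python sketch using `xml.etree.ElementTree`, which is built on the non-validating expat parser. The element name `doc`, the attribute `status`, and its default value are made up for illustration. It only covers defaults declared in a DTD internal subset; it does not show the XSD or RELAX NG routes.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the internal subset declares a default for the
# "status" attribute, and the instance omits the attribute entirely.
WITH_DTD = """<!DOCTYPE doc [
  <!ATTLIST doc status CDATA "draft">
]>
<doc/>"""

# The same instance with no DOCTYPE at all.
WITHOUT_DTD = "<doc/>"

with_default = ET.fromstring(WITH_DTD).attrib
without_default = ET.fromstring(WITHOUT_DTD).attrib

if __name__ == "__main__":
    print("with internal subset:", with_default)
    print("without DTD:", without_default)
```

The two instances are character-for-character identical after the DOCTYPE, yet the parser reports different attribute sets for them.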

Cheers,

Eliot
----
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com




On 5/4/16, 6:16 AM, "Norman Gray" <norman@astro.gla.ac.uk> wrote:

Greetings.

(catching up ...)

On 29 Apr 2016, at 17:58, John Cowan wrote:

On Fri, Apr 29, 2016 at 8:54 AM, Norman Gray <norman@astro.gla.ac.uk>
wrote:

In the XML world, the DTD is just for validation


That turns out not to be the case.  There are a number of XML DTD features
which affect the infoset returned by a compliant parser.  If they are in
the internal subset, the parser MUST respect them;

I stand corrected; I was sloppy.  I think this doesn't change my
original point, however, which was that in SGML the DTD was integral to
the document, and to the parse of the document, and that it's easy to
forget this after one has got used to two decades of XML[1].  I can't
remember if there was a trivial or default DTD which was assumed in the
absence of a declared one, in the same way that there was a default SGML
Declaration, but taking advantage of that would probably have been
regarded as a curiosity, rather than normal practice.

In XML, in contrast, the DTD has a more auxiliary role, and at a first
conceptual look, that role is validation (even though -- footnote! -- it
may change other things about the parse as well).  Thus _omitting_ an
XML DTD (or XSchema) is neither perverse nor curious.
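
A quick way to see both halves of this (omitting the DTD is unremarkable, yet an internal subset can still change the parse) is the following Python sketch with `xml.etree.ElementTree`. The element name `d` and the entity `who` are invented for the example.

```python
import xml.etree.ElementTree as ET

# No DTD at all: the parse succeeds, since well-formedness is all it needs.
plain = ET.fromstring("<d>hello world</d>")

# An internal subset that declares an entity: even a non-validating parser
# must read it, and the reference is expanded in the result.
declared = ET.fromstring(
    '<!DOCTYPE d [<!ENTITY who "world">]><d>hello &who;</d>'
)

# The same reference with no declaration is a well-formedness error here.
try:
    ET.fromstring("<d>hello &who;</d>")
    undeclared_ok = True
except ET.ParseError:
    undeclared_ok = False

if __name__ == "__main__":
    print(plain.text)
    print(declared.text)
    print("undeclared entity accepted:", undeclared_ok)
```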

Practical aspect: When I'm writing XML, I use a DTD (in whatever syntax)
to help Emacs tell me if the document is valid, but I don't even know
whether the XML parsers I use are capable of using a DTD external
subset.  That careless ignorance would be impossible with SGML.

The rational extension of that attitude, of course, is MicroXML, which
(as you of course know) doesn't use any external resources at all, and
doesn't care about validation.

Best wishes,

Norman


[1] Hang on, _two_ decades?!  I've just checked and ... 1996 doesn't
seem that long ago.


--
Norman Gray  :  https://nxg.me.uk
SUPA School of Physics and Astronomy, University of Glasgow, UK

_______________________________________________________________________

XML-DEV is a publicly archived, unmoderated list hosted by OASIS
to support XML implementation and development. To minimize
spam in the archives, you must subscribe before posting.

[Un]Subscribe/change address: http://www.oasis-open.org/mlmanage/
Or unsubscribe: xml-dev-unsubscribe@lists.xml.org
subscribe: xml-dev-subscribe@lists.xml.org
List archive: http://lists.xml.org/archives/xml-dev/
List Guidelines: http://www.oasis-open.org/maillists/guidelines.php

