Re: [xml-dev] SGML default attributes.

If you look at an environment like PubMed, where you have many
loosely-governed and coordinated publishers contributing documents that
are all supposed to be in the same document type, you start to see the
problem. They are all (or mostly all) DTD valid. Many claim to be valid to
the JATS or NLM DTD, but they have modified these DTDs locally. Many have
modified the DTDs and given them new public IDs (as they should) but then
there's no way to know what they really are. It's a mess that the
engineering team at PubMed went to heroic efforts to try to manage through
transforms and so forth. It's a direct result of thinking that the DTD
means something in terms of conformance to some larger document type as
well as the lack of a controlled extension mechanism.

If you're just creating documents for yourself or only within the scope of
a small organization or enterprise then of course the problem doesn't arise.

But if you're building a content management system or a hypertext system
or a publishing system like PubMed or a general-purpose authoring
environment or you're a loose confederation of 100 enterprises like IBM
then it's a big problem.

Of course, I could also say that at the end of the day it all comes down
to trust and no amount of technology can completely eliminate the need for
trust and communication.

DITA's not adding indirection--there's no practical expectation that
processors will go from @domains values to grammars for validation.
There's still a practical expectation that documents will have associated
grammars correctly configured with the appropriate modules' contributions.

What DITA's doing is saying "the definition of the abstract document type
and the essential DITAness of the document" is completely divorced from
any grammars used for validation. And in particular, we don't care *at
all* what public ID or URI you happen to use for your external DTD subset or
XSD or RELAX NG grammar because it doesn't matter. As long as I can
resolve the reference I can validate and as long as I get @domains and
@class attributes I can understand and process the content.
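To make that concrete, here is a minimal sketch in Python of processing driven only by @class and @domains, with no DOCTYPE or grammar in sight. The sample document and the helper names (class_tokens, is_a) are invented for illustration; the attribute values follow the usual DITA 1.x pattern but are written out by hand here rather than supplied by a grammar.

```python
import xml.etree.ElementTree as ET

# Hypothetical DITA-like document with no DOCTYPE at all.
# The @class and @domains values are spelled out explicitly in the instance.
DOC = """<concept id="c1" domains="(topic hi-d) (topic ut-d)"
         class="- topic/topic concept/concept ">
  <title class="- topic/title ">Example</title>
  <conbody class="- topic/body concept/conbody ">
    <p class="- topic/p ">Hello <b class="+ topic/ph hi-d/b ">world</b></p>
  </conbody>
</concept>"""


def class_tokens(elem):
    """Return the module/type tokens of @class, dropping the leading '-' or '+'."""
    return elem.get("class", "").split()[1:]


def is_a(elem, module_type):
    """True if the element is, or is specialized from, the given module/type."""
    return module_type in class_tokens(elem)


root = ET.fromstring(DOC)

# The tag name and any public ID are irrelevant; only @class is consulted.
root_is_topic = is_a(root, "topic/topic")
root_is_concept = is_a(root, "concept/concept")

# Find every phrase-like element, whatever its tag name happens to be.
phrases = [e.tag for e in root.iter() if is_a(e, "topic/ph")]

# The vocabulary modules the document claims to use.
declared_domains = root.get("domains", "")
```

Nothing in this sketch looks at a public ID or system ID, which is the point: the claim about the vocabulary travels in the attributes.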

And just to be clear: because DITA depends on all these magic attributes
it's also a practical requirement to have grammars that provide attribute
defaults. So in practice DITA documents for authoring have grammars
because why would you not. My point is that they are not *required* and
even when they are used *we don't care what strings you use to address the
storage objects that contain them*.
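As a rough illustration of why attribute defaults matter in practice, the sketch below parses the same instance twice with Python's standard library: once with an internal DTD subset that declares @class defaults, once with no grammar at all. The two ATTLIST declarations are a hand-written stand-in, not the real DITA DTDs, and it assumes the underlying expat parser reports defaulted attributes from the internal subset, which is its default behaviour as far as I know.

```python
import xml.etree.ElementTree as ET

# Instance with an internal subset supplying @class defaults.
# These declarations are illustrative only, not the actual DITA grammar.
WITH_DEFAULTS = """<!DOCTYPE concept [
  <!ATTLIST concept class CDATA "- topic/topic concept/concept ">
  <!ATTLIST title   class CDATA "- topic/title ">
]>
<concept id="c1"><title>Example</title></concept>"""

# The same instance with no grammar associated at all.
WITHOUT_GRAMMAR = '<concept id="c1"><title>Example</title></concept>'

with_defaults = ET.fromstring(WITH_DEFAULTS)
bare = ET.fromstring(WITHOUT_GRAMMAR)

# With the declarations present the parser fills in @class for us;
# without them the attribute is simply absent and would have to be
# written explicitly in the instance.
defaulted_root_class = with_defaults.get("class")
defaulted_title_class = with_defaults.find("title").get("class")
bare_root_class = bare.get("class")
```

Where the declarations live, and what string was used to reach them, makes no difference to the result.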



Eliot Kimber, Owner
Contrext, LLC

On 5/4/16, 5:00 PM, "Peter Flynn" <peter@silmaril.ie> wrote:

>On 05/04/2016 10:41 PM, Eliot Kimber wrote:
>> I think you're missing my point: it's not that DTDs aren't useful for
>> authoring and other things, of course they are, and if you use a DTD to
>> guide your authoring then everything is good (as long as you're using the
>> correct set of declarations).
>> What I'm talking about is the processing context where you get documents
>> from somewhere and need to determine:
>> 1. What is the (abstract) document type this document claims to conform to?
>> 2. Is the document valid against that document type?
>I must confess to never having been worried by [1], only [2].
>> It is this use case that DTDs, by themselves, are not sufficient for, for
>> the simple reason that the reference in the DOCTYPE declaration to some
>> external DTD subset is not a reliable indicator, for all the reasons
>> given.
>I take your point, but in the last 30 years I have been fortunate not to
>have been given any documents that were that perversely incorrect, as
>you exampled. Lots that were broken, some fatally, but that was from
>non-conformance with the DTD, not because they were pretending to be
>something they were not.
>> It is absolutely a fact today that many groups will use a public ID for
>> some standard DTD, e.g., DocBook, and then map that public ID to some set
>> of declarations that they have modified in all kinds of ways. This happens
>> *all the time* and it is *wrong*. It is a lie.
>True. I stopped pointing this out to people a long time ago. Probably
>when the GCA dropped the ball on ISO 9070 shortly after I registered
>Silmaril as a FPO :-)
>> My point is that the reference to the external DTD subset is not itself
>> sufficiently reliable, in the general case, to tell you what the document
>> is.
>I must be lucky, then.
>> DITA addresses the breakage by saying "I don't care what actual DTD file
>> you use (or don't use), I care about what you claim about the vocabulary
>> used by the document."
>It might be more accurate to say that DITA breaks the conventional XML
>processing model by using additional layers of indirection in order to
>overcome the inexactitude of vocabulary-definition references.
>> As we've all said--at the time (1986) it was the best we had and it's
>> understandable that the conceptual mistake was made. But the problem was
>> (or should have been) glaringly obvious by the time we did XML--after
>> we recognized in XML that documents can be perfectly useful with no
>> explicit grammar association whatsoever.
>Well-formed documents are a convenience for interoperation on the basis
>of trust, no more.


Copyright 1993-2007 XML.org. This site is hosted by OASIS