   Re: [xml-dev] Come On, DTD, Come On! Thoughts on DSDL Part 9

John Cowan <jcowan@reutershealth.com> wrote:

| 1) The NS declaration. 

It's hard to close a can of worms after it has been opened.  

| 2) Attribute data types.  The names that can appear in an ATTLIST
| declaration directly after an attribute name are extended to include
| the datatype names of part 5 (i.e. XSD simple types).

As stated, emphatically no.

The WebSGML TC provides for DATA declared values.

 <!NOTATION integer PUBLIC "whatever" >

 <!ATTLIST foo
           bar  DATA integer  #IMPLIED >

The full syntax allows the data content notation ('DATA integer') to be
qualified with data attributes (an attribute specification list within a
pair of '[' and ']' delimiters).  Note that the DATA keyword automatically
provides for extensibility in that the notation name ('integer') is "user
defined" in a separate declaration.  Your proposed syntax amounts to
dropping the DATA keyword, which works only if the existing reserved
words ('NMTOKEN', 'CDATA', 'ENTITY', etc.) are then required to carry the
Reserved Name Indicator ('#') as a prefix, e.g.

  <!ATTLIST foo
            quux  #NMTOKEN  #IMPLIED
            bar   integer   #IMPLIED >
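
To make the disambiguation concrete, here is a rough sketch in Python of
how a parser might classify a declared value under that rule.  The
function name and the keyword set are assumptions for illustration only;
nothing here comes from the WebSGML TC or from the proposal.

```python
# Keywords from the examples above plus the usual XML attribute types.
# This set is an assumption made for the sketch.
RESERVED = {"CDATA", "NMTOKEN", "NMTOKENS", "ENTITY", "ENTITIES",
            "ID", "IDREF", "IDREFS", "NOTATION"}

def classify_declared_value(token):
    """Classify a declared value under the hypothetical '#'-prefix rule."""
    if token.startswith("#"):
        name = token[1:]
        if name in RESERVED:
            return ("reserved", name)
        raise ValueError(f"unknown reserved name: {token}")
    # Anything without the prefix is taken as a user-defined notation name.
    return ("notation", token)
```

Note the consequence: a bare 'NMTOKEN' would now be read as a user-defined
notation name, which is why every existing DTD would have to change.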

| 3) Element simple datatypes.  Likewise, unparenthesized content models
| in ELEMENT declarations are extended from just ANY and EMPTY to include
| these same datatypes.
| Example: <!ELEMENT foo nonNegativeInteger>

I actually put up a straw proposal to this effect on comp.text.sgml a
while back.


As long as the syntax is the same as the extension for ATTLISTs, I think
it can work.  (Note again that your syntax would require reserved words
like ANY and EMPTY to now be #-prefixed.)

| 4) Datatype lists.  In either #2 or #3 context, a simple datatype name
| can be replaced by "LIST(name)" to indicate a whitespace-separated
| list of strings matching the datatype.  IDREFS is equal to LIST(IDREF),
| and ENTITIES is equal to LIST(ENTITY).

Is this definitional, or a means to specify a list of user defined names?
I'm not seeing the greater utility of a literal 'LIST(IDREF)' over a plain
'IDREFS'.  In the other case, why do we need the 'LIST' prefix when the
parens provide enough syntactic marking?  
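
For what it's worth, the definitional reading is easy to state.  A minimal
sketch in Python, assuming a simplified ASCII-only name pattern and
assuming that a list needs at least one token (as IDREFS does); the
helper names are made up for illustration:

```python
import re

# Simplified ASCII-only name pattern; real XML names allow more characters.
NAME = re.compile(r"[A-Za-z_:][A-Za-z0-9._:-]*\Z")

def is_name(token):
    return NAME.match(token) is not None

def is_integer(token):
    return re.match(r"[+-]?[0-9]+\Z", token) is not None

def matches_list(value, item_check):
    """True if every whitespace-separated token passes item_check."""
    tokens = value.split()
    # Assumption: an empty list is rejected, mirroring IDREFS.
    return len(tokens) > 0 and all(item_check(t) for t in tokens)
```

Under this reading IDREFS is just matches_list(value, is_name) plus the
usual ID lookup, so the literal 'LIST(IDREF)' adds nothing new.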

| 5) Datatype choice.  In either #2 or #3 context, a simple or LIST-wrapped
| datatype name can be replaced by |-separated names, to indicate a choice
| (derivation by union in WXS terms).
| Example: <!ELEMENT bar integer|name>
| Issue: what do we do about XSD facets?  They are important but don't
| easily fit into the rigid DTD syntax.

Extend notation declarations and allow data attributes, as per my straw
proposal, to have derivation hierarchies.

| 6) Restore & connector. [...] Issue: SGML or interleave?  My answer: interleave

| 7) Abandon SGML 1-ambiguity rules. 

Agreed.  The SGML 1-ambiguity rules are only the result of an ad hoc
approach to OMITTAG inference problems.  When an instance is at least
amply tagged, validation becomes a matching problem only, not a predictive
parsing problem any more.
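
A rough sketch of the point in Python.  The content model is translated
by hand into a regular expression in which each element name is followed
by one space; that encoding and the function name are assumptions for
illustration, and Python's backtracking matcher is just one way to do the
matching.

```python
import re

def matches_model(model_regex, child_names):
    """Match a sequence of child element names against a content model
    hand-translated into a regex (each name followed by one space)."""
    text = "".join(name + " " for name in child_names)
    return re.fullmatch(model_regex, text) is not None

# (a, b) | (a, c): not 1-unambiguous, since seeing 'a' alone does not
# tell which branch applies.
AMBIGUOUS = r"(?:a b |a c )"
```

With every tag present, matches_model(AMBIGUOUS, ["a", "c"]) succeeds
without any lookahead restriction on the model, because nothing has to be
predicted while the children are being read.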

| 8) Restore multiple element and attribute names separated by |s.

I'd prefer a whitespace-separated list of tokens within parens.  In fact
I'd like this for all name group and nametoken group usages, instead of an
infix separator.

| 9) Fixed element content.  Allow ELEMENT declarations to specify "#FIXED
| 'value'" after a datatype.

| General issue: Should there be some way to indicate candidate roots?
| In existing DTDs, any element can be a root.

Why is this a problem?  I admit I've never understood the issue: is this
deference to the common fallacy of viewing the FPI of an external subset
as "declaring a doctype"?  

| General issue: We need to figure out what to do if the instance contains
| an internal DTD (by which I mean an internal subset, a reference to an
| external subset, or both).  Should internal validation be required,
| permitted, or forbidden when doing external validation?

Probably irrelevant.  The contents of a document type declaration are
specific to an instance.  Validation with respect to a fixed set of
declarations is a separate exercise (as in ArchForms).  The issue would be
how to declare that fixed set.


Don't make it more complicated than necessary. 
 - Steve Slatcher



Copyright 2001 XML.org. This site is hosted by OASIS