   "Datatypes" for DTDs

"Wayne Steele" <xmlmaster@hotmail.com> wrote:
|From: Sam Hunting <shunting@etopicality.com>
|> [Wayne Steele <xmlmaster@hotmail.com>]

|>> One trick that people have used for ages to indicate data types in  
|>> DTDs is through parameter entities. [...] This works great for human 
|>> documentation, but is not especially machine processable.

Right.  Mnemonic PE names have always been a form of handwaving. :-)

|>> I wonder if it's worth more thought to try and bless a way of doing 
|>> this, to provide more meaning to attributes?

Depends on how.  One method is already standardized: the DATA declared
value.  See K.4.4.2 and K.4.4.3 in


(which as it happens, can work with DTDs already using the PE placeholder
kludge by simply redefining the replacement text of the PEs!)
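
A minimal sketch of that redefinition, assuming a DTD that already routes its attribute types through a PE.  The element "event", the attribute "when", the PE "date", the notation "isodate" and its system identifier are all hypothetical names chosen for illustration.  The DATA declared value is Web SGML syntax, so the second variant is not a valid XML 1.0 DTD and needs a parser that supports that declared value.

```dtd
<!-- Variant 1: the PE is only a mnemonic; the processor sees plain CDATA. -->
<!ENTITY % date "CDATA">
<!ATTLIST event
          when  %date;  #IMPLIED>

<!-- Variant 2 (use instead of variant 1, not alongside it):
     the ATTLIST is untouched; only the PE replacement text changes,
     so the processor now sees a DATA declared value naming a notation. -->
<!NOTATION isodate SYSTEM "http://example.org/datatypes/isodate">
<!ENTITY % date "DATA isodate">
<!ATTLIST event
          when  %date;  #IMPLIED>
```

The point of the sketch is that every ATTLIST written against the placeholder PE picks up the machine-readable type without being edited.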

|> One way is at http://www.w3.org/TR/dt4dtd ("Datatypes for DTDs (DT4DTD) 1.0").
| I'm familiar with DT4DTD, and I think it's a great piece of work.

It's nice as a workaround for deficient syntax, but there are problems
with it (not the least of which is that a workaround is dubious when
a standardized syntax already exists).

One is hardwiring names like "e-dtype" and "a-dtype" into what is supposed
to be a generic mechanism.  Also needed is a mechanism to *declare* the
names with such associative semantics - IOW, a generic processor shouldn't
have to care about the particular names as long as it can find out what to
look for, without any possibility of clashing with the application's names
(or to put it another way, why should an application be prevented from
using names like e-dtype and a-dtype for its own purposes when it *also*
wants to use this facility?  It's a first principle of the formalism that
applications be free to choose their own names.) 
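
For reference, a sketch of the DT4DTD convention being criticized.  The element "birthdate", its "precision" attribute and the datatype names are hypothetical, and the declared values shown for e-dtype and a-dtype are assumptions that should be checked against the Note itself.

```dtd
<!-- Hypothetical element with one ordinary attribute. -->
<!ELEMENT birthdate (#PCDATA)>
<!ATTLIST birthdate
          precision  CDATA    #IMPLIED
          e-dtype    NMTOKEN  #FIXED  "date"
          a-dtype    CDATA    #FIXED  "precision nonNegativeInteger">
<!-- e-dtype types the element content; a-dtype holds
     (attribute name, datatype) pairs.  Both attribute names are
     fixed by the convention, so an application that wants its own
     attribute called e-dtype or a-dtype has nowhere to put it. -->
```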

Another is that two associative attributes are unnecessary: one suffices
if its declared value is going to be CDATA.  Use a reserved word like
#CONTENT (or even #PCDATA!) as the proxy name for the unnamed attribute,
i.e. the text content, in the list of pairs in the a-dtype attribute.
This is loosely analogous to an extension of the DATA declared value syntax:

  <!ATTLIST  foo
             #PCDATA  DATA  bar  #IMPLIED>
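
The single-attribute variant could be sketched as follows.  #CONTENT here is the reserved proxy name suggested above, not something DT4DTD defines, and the element, attribute and datatype names are again hypothetical.

```dtd
<!ELEMENT birthdate (#PCDATA)>
<!ATTLIST birthdate
          precision  CDATA  #IMPLIED
          a-dtype    CDATA  #FIXED
                     "#CONTENT date precision nonNegativeInteger">
<!-- The first pair types the text content via the proxy name;
     the second pair types the "precision" attribute.
     No separate e-dtype attribute is needed. -->
```

Because the value is CDATA, the "#" in the proxy name is legal in the attribute value and cannot collide with any real attribute name.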

A third is environmental - there's an assumption that an application has
access to declarative information "after the fact".  Interfaces like ESIS,
for example, will not report unreferenced notations, and are silent on
whether an application can query for them.  There is a hack available to
"pull in" notation declaration information as part of the parsing process,
using placeholder external entity declarations, but this increases the
overhead and really ruins the simplicity of the original idea, IMHO!

Lobby for the DATA declared value.  It solves the problem directly.



Copyright 2001 XML.org. This site is hosted by OASIS