The First 2 Layers Problem and a tentative solution

"Layers" refers to recent discussions of layered architectures for XML
processing.  I quote from Eric van der Vlist [1]; see also Michael
Champion [2]:

...we have already at least 5 (optional) layers (more if you add XLink and 
XInclude): well formed XML before entity substitution, well formed XML 
after entity substitution, validated XML, namespaced XML and typed XML

There is a persistent awkwardness about the first two layers, resulting from
DTDs doing double duty: they both validate structure and (speaking
anachronistically) fill in the Infoset, primarily by resolving general
parsed entities and providing default attribute values.  While attribute
defaults can be avoided, general parsed entities provide genuinely useful
functionality, in my opinion.  The problem is that, in order to make use of
them, one needs a DTD and either a non-validating parser that resolves
entities (which such parsers are not required to do) or a validating parser
with DTD validation turned on.  The latter option messes up the layers,
brings in all the problems with Namespaces, and results in duplication of
effort if we then proceed to post-validate with, e.g., TREX.  I wonder if
the following small steps can help alleviate the problem:

1.  Explicitly deprecate attribute defaults.

2.  Make a small change in the Conformance section of XML 1.0 requiring
non-validating processors to expose entity resolution as a (settable)
feature (in API terms: in addition to isValidating() and setValidating(),
provide isEntityResolving() and setEntityResolving()).

3.  Recommend the following usage of DTDs:

<!DOCTYPE docType [
<!ELEMENT docType ANY>
<!-- declare general parsed entities -->
]>

4.  Use the DTD to resolve entities, using either a non-validating,
entity-resolving parser or a validating parser with validation on.

5.  Do structural validation on the Namespaced XML using, e.g., TREX.  In
the same pass, if needed, proceed to typed XML (by which I assume we mean
simple types only).
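
Steps 3 to 5 can be sketched roughly as follows, in Python rather than any
particular Java API. The sketch assumes that the standard library's
ElementTree parser (built on expat, a non-validating parser) expands general
entities declared in the internal subset. The document content, the "org"
entity, the namespace URI, and the check_structure() function are all
hypothetical; the hand-written check is only a stand-in for a real TREX
pass, which the standard library does not provide.

```python
import xml.etree.ElementTree as ET

# Hypothetical document following the recommended DTD usage: the internal
# subset declares only the root element (as ANY) and general parsed entities.
DOC = """<!DOCTYPE docType [
<!ELEMENT docType ANY>
<!ENTITY org "Example Org">
]>
<docType xmlns="urn:example:doc">
  <title>Report from &org;</title>
</docType>"""

NS = "urn:example:doc"  # assumed namespace, for illustration only


def resolve_entities(text):
    """Step 4: parse without DTD validation; internal entities get expanded."""
    return ET.fromstring(text)


def check_structure(root):
    """Step 5 stand-in: require a namespaced root with exactly one title child.

    A real pipeline would run a schema validator (e.g. TREX) here instead.
    """
    if root.tag != f"{{{NS}}}docType":
        return False
    children = list(root)
    return len(children) == 1 and children[0].tag == f"{{{NS}}}title"


root = resolve_entities(DOC)
title_text = root[0].text          # entity reference already replaced
structure_ok = check_structure(root)
```

The point of the split is that the DTD is used only to fill in entity
values, while all structural checking happens afterwards on the namespaced
tree.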

This will clean up the layers a little bit, and all of it can be done today.
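
As a rough illustration of what the settable feature in step 2 might look
like, here is how Python's xml.sax module exposes related switches today.
This is an analogy, not the proposed isEntityResolving()/setEntityResolving()
API: feature_external_ges governs external general entities only, and the
sketch assumes the expat-based reader, which expands internal entities
regardless of that setting. The document and the TextCollector class are
hypothetical.

```python
import io
import xml.sax
from xml.sax.handler import (
    ContentHandler,
    feature_external_ges,
    feature_validation,
)

# Hypothetical document with one internal general parsed entity.
DOC = b"""<!DOCTYPE docType [
<!ELEMENT docType ANY>
<!ENTITY org "Example Org">
]>
<docType>Report from &org;</docType>"""


class TextCollector(ContentHandler):
    """Collects character data; SAX may deliver it in several chunks."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


parser = xml.sax.make_parser()
# Settable feature: do not fetch external general entities.
parser.setFeature(feature_external_ges, False)

handler = TextCollector()
parser.setContentHandler(handler)
parser.parse(io.BytesIO(DOC))

text = "".join(handler.chunks)
validating = parser.getFeature(feature_validation)      # reader does not validate
external_ges = parser.getFeature(feature_external_ges)  # value set above
```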

[1] http://lists.xml.org/archives/xml-dev/200103/msg00566.html
[2] http://lists.xml.org/archives/xml-dev/200103/msg00625.html

Alexander Nakhimovsky   tel 315-228-7586
Computer Science Dpt    fax 315-228-7004
Colgate University      sasha@cs.colgate.edu
Hamilton NY 13346       sasha@mail.colgate.edu