Re: Namespaces, W3C XML Schema (was Re: ANN: SAX Filters for Namespace Processing)

You touch on some key points in a large debate about how to process XML.
I'll try to put across my point of view.

Firstly, object-oriented languages such as Java and C++ are in widespread
use. When converting from a markup language to an object-oriented language,
it is natural to group together the functionality (and data) coming from the
use of a tag and put it in its own class. In Java you have to, since all
code lives in some class.

For example, look at the API for Microsoft's implementation of HTML:
(this URL may split over several lines).

You will see that for every HTML element, there is a class (or more
precisely an interface to a class). Most tools that implement specifications
(such as SOAP, WSDL and XML Signature) have low-level APIs containing
classes related to each of the elements. Also it seems natural for a class
representing an element to have connections with classes representing child
and parent nodes. This generally leads to a tree-like object model.
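A minimal sketch of that "one class per element" arrangement, in Python for brevity. The vocabulary (Table, Row, Cell) and the method names are made up for illustration and are not taken from any real API:

```python
class Node:
    """Base class for one element in a tree-like object model."""
    tag = None  # element name handled by the subclass

    def __init__(self):
        self.parent = None
        self.children = []

    def append(self, child):
        """Attach a child node and record its parent."""
        child.parent = self
        self.children.append(child)
        return child


# Hypothetical vocabulary: one class per element.
class Table(Node):
    tag = "table"

    def row_count(self):
        # Element-specific functionality lives with the element's class.
        return sum(1 for c in self.children if isinstance(c, Row))


class Row(Node):
    tag = "tr"


class Cell(Node):
    tag = "td"

    def __init__(self, text=""):
        super().__init__()
        self.text = text


table = Table()
row = table.append(Row())
row.append(Cell("first"))
row.append(Cell("second"))
```

The parent and child links are what turn a set of per-element classes into a tree.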

As far as I can see, code from an object-oriented language that holds
complete information from an XML document and can perform actions will use
some system of objects. You can either build up from SAX or "cherry-pick"
from a DOM, but the more complete the information you require, the more of
an object model you create. If there is an alternative way, I am keen to
have it explained to me, as I've never grasped it.
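To illustrate the "build up from SAX" route, here is a rough sketch using Python's standard xml.sax module. The order/item vocabulary, the Order and Item classes, and the CLASSES table are all assumptions invented for the example:

```python
import xml.sax


class Order:
    """Hypothetical class for an <order> element."""
    def __init__(self, attrs):
        self.id = attrs.get("id")
        self.items = []

    def add(self, child):
        if isinstance(child, Item):
            self.items.append(child)


class Item:
    """Hypothetical class for an <item> element."""
    def __init__(self, attrs):
        self.sku = attrs.get("sku")
        self.text = ""

    def add(self, child):
        pass  # items have no element children in this sketch


# Element name -> class; elements not listed here are ignored.
CLASSES = {"order": Order, "item": Item}


class ModelBuilder(xml.sax.ContentHandler):
    """Builds objects from SAX events, keeping a stack of open elements."""
    def __init__(self):
        super().__init__()
        self.root = None
        self.stack = []

    def startElement(self, name, attrs):
        cls = CLASSES.get(name)
        if cls is None:
            return
        obj = cls(attrs)
        if self.stack:
            self.stack[-1].add(obj)   # attach to the enclosing object
        else:
            self.root = obj
        self.stack.append(obj)

    def endElement(self, name):
        if name in CLASSES:
            self.stack.pop()

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self.stack and hasattr(self.stack[-1], "text"):
            self.stack[-1].text += content


doc = b'<order id="42"><item sku="A1">pen</item><item sku="B2">ink</item></order>'
builder = ModelBuilder()
xml.sax.parseString(doc, builder)
order = builder.root
```

Even this small handler ends up maintaining a stack and parent/child links, which is the point: the more of the document you keep, the more of an object model you have built.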

When XML comes with a schema, I would suggest that there is a link between
the schema and the model of the objects to create. A schema states
connections between parent and child nodes. It states which elements can be
top-level, and so on. So an application that inspects a schema and spits out
the beginnings of an appropriate object model should be possible.
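As a very rough sketch of such a tool, the following reads a small W3C XML Schema with Python's xml.etree.ElementTree and prints class stubs. The sample schema and the class_stubs function are hypothetical, and the sketch only understands top-level elements with an inline complexType containing a sequence. It also assumes element names happen to be valid Python identifiers:

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema used only for this example.
SCHEMA = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="item" type="xs:string" maxOccurs="unbounded"/>
        <xs:element name="note" type="xs:string" minOccurs="0"/>
      </xs:sequence>
      <xs:attribute name="id" type="xs:int"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def class_stubs(schema_text):
    """Return Python source with one stub class per top-level element."""
    root = ET.fromstring(schema_text)
    lines = []
    for decl in root.findall(XS + "element"):  # top-level declarations only
        lines.append(f"class {decl.get('name').capitalize()}:")
        lines.append("    def __init__(self):")
        body = []
        for attr in decl.iter(XS + "attribute"):
            body.append(f"        self.{attr.get('name')} = None")
        path = f"{XS}complexType/{XS}sequence/{XS}element"
        for child in decl.findall(path):
            # Repeating children become lists, single ones start as None.
            many = child.get("maxOccurs", "1") != "1"
            body.append(f"        self.{child.get('name')} = {'[]' if many else 'None'}")
        lines.extend(body or ["        pass"])
    return "\n".join(lines)


source = class_stubs(SCHEMA)
print(source)
```

The output is only "the beginnings" of a model: named types, references, groups and choices would all need decisions that the schema alone does not make for you.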

XML Schema has 4 primary components (out of 13 overall): simple and complex
type definitions, and element and attribute declarations. Looking at how
each component is used, the element and attribute declarations become the
elements and attributes of the actual XML document, so mapping each to a
class (as I explained before) is natural in many situations. On the other
hand, mapping to a string or an integer may be valid in other situations. Or
several elements could be mapped to one class.

I agree that simple types have one function: that of constraint checking.
Complex types have an additional function: that of allowing or disallowing
child nodes. In both cases this functionality can easily be grouped together
into a class for each different simple or complex type. So the idea of types
represented as classes is also not that radical. However, the idea of
subclassing for extensions or restrictions is only good if it works and
saves effort, and often it does not. Similarly, inheriting when the complex
type contains a group is an awkward and possibly futile exercise. (This
whole paragraph is IMHO.)
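A small sketch of types as classes, borrowing the "year since 1969" example from the quoted message below. All class names are hypothetical, "since 1969" is assumed to include 1969 itself, and Python's int() is looser than the lexical space of xs:int, so this is an approximation rather than a faithful validator:

```python
class IntType:
    """Simple type: its only job is constraint checking."""
    def validate(self, text):
        return int(text)  # raises ValueError if text is not an integer


class YearSince1969(IntType):
    """A restriction of IntType, modelled here by subclassing."""
    def validate(self, text):
        value = super().validate(text)
        if value < 1969:  # assumption: 1969 itself is allowed
            raise ValueError(f"{value} is before 1969")
        return value


class OrderType:
    """Complex type: additionally allows or disallows child elements."""
    children = {"year": YearSince1969(), "quantity": IntType()}

    def allows(self, child_name):
        return child_name in self.children

    def validate_child(self, child_name, text):
        if not self.allows(child_name):
            raise ValueError(f"<{child_name}> is not allowed here")
        return self.children[child_name].validate(text)


order_type = OrderType()
year = order_type.validate_child("year", "2001")
```

Subclassing works for this restriction because the subclass only adds a check. It is much less obvious what to do for an extension or a group, which is where the approach may stop saving effort.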

To sum up, I think that generating classes containing functionality from
schemas could be a useful first step in creating implementations of XML
vocabularies in object-oriented languages. However, the generation has to
be well thought out, and it has to be flexible.


----- Original Message -----
From: Elliotte Rusty Harold <elharo@metalab.unc.edu>
To: Xml-Dev <xml-dev@lists.xml.org>
Sent: Wednesday, August 22, 2001 4:28 PM
Subject: Re: Namespaces, W3C XML Schema (was Re: ANN: SAX

> At 10:25 PM -0700 8/21/01, Ronald Bourret wrote:
> >Elliotte Rusty Harold wrote:
> >
> >> I am concerned that the theoretical use of schemas for typing is
> >> overriding their practical use for constraint checking.
> >
> >The use isn't theoretical. Witness all the products that generate
> >classes from XML Schemas.
> >
> And witness all the people using these products NOT. I classify this stuff
> along with tree-based XML editors and binary variants of XML as something
> that gets reinvented several times a month without any actual market demand.
> On the other hand, over the last three years as I've taught developers
> about DTDs, almost invariably the first question is "How do I say that an
> element contains an int?" and the second question is usually "How do I say
> that an element contains a year since 1969?" or some variant thereof.
> >> Very few people
> >> are actually using schemas for typing. Instead they're being used for
> >> validation.
> >
> >I think it depends on how you do the counting. Clearly, the number of
> >people validating schemas outnumbers the number of people writing code
> >that explores them. This is a restatement of the fact that the number of
> >document authors is greater than the number of programmers.
> >
> >If you count applications, validators are a minority.
> I count people as worth more than programs.
> >
> >> In validation, we need local types (if not necessarily unqualified
> >> local types) because the W3C XML Schema Language confuses the two
> >> separate issues of typing and constraints checking, especially when
> >> it comes to complex types. I don't want to see any prohibition on
> >> local types enshrined as a best practice, or otherwise deprecated.
> >
> >Could you explain this further? Isn't constraints checking either (a)
> >the checking of data against types, or (b) the definition of domains for
> >a given type? (I suppose this also depends on what you mean by "type".)
> >
> If you view restrictions as subclasses, then you can do it all as typing.
> However, this is problematic in a lot of the naive approaches people are
> taking. For instance, what's an appropriate subclass of int in Java or C++?
> For just one recent example, consider this statement from this thread:
> "Local types may be used to specify elements with the same name but
> different types in different content models. Use of this feature makes
> it simpler to write complex schemas which will be processed by
> schema-specific processors. However it may also make it harder to
> process the data with general purpose processors such as presentation or
> editing tools."
> Why would local types make life harder for general purpose processors? If
> a presentation or editing tool is presented with a local restriction of a
> type, why can't it work with that? If it can't work with that, why can't it
> use an editor for the base type?
> --
> +-----------------------+------------------------+-------------------+
> | Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer |
> +-----------------------+------------------------+-------------------+
> |          The XML Bible, 2nd Edition (Hungry Minds, 2001)           |
> |              http://www.ibiblio.org/xml/books/bible2/              |
> |   http://www.amazon.com/exec/obidos/ISBN=0764547607/cafeaulaitA/   |
> +----------------------------------+---------------------------------+
> |  Read Cafe au Lait for Java News:  http://www.cafeaulait.org/      |
> |  Read Cafe con Leche for XML News: http://www.ibiblio.org/xml/     |
> +----------------------------------+---------------------------------+
> -----------------------------------------------------------------
> The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
> initiative of OASIS <http://www.oasis-open.org>
> The list archives are at http://lists.xml.org/archives/xml-dev/
> To subscribe or unsubscribe from this elist use the subscription
> manager: <http://lists.xml.org/ob/adm.pl>