
   RE: [xml-dev] The privilege of XML parsing - Data types,binary XML


Very interesting discussions - thanks!

I would like to summarize the discussion (as I understand it) and pose a question.

First, here are the lessons that I have learned:

[Note: below I elaborate upon how I came to these lessons.]

1. The XML string represents the "data".  Data endures over time.

2. Data models provide metadata about the XML string.  Data models (metadata)
change form and content over time.  Data models should be considered "disposable".

3. Applications process the data.  Applications may or may not use the data
models.  Applications come and go.  Applications should be considered "disposable".

4. Don't hardcode a binding between the XML string and a data model.  The binding
should be done dynamically, and as desired.

Question: should these lessons be considered as Best Practice, i.e., that data
models and applications should be treated as "disposable" and that instance
documents should be free from references to a data model?

Now for the summary.  From the discussions I have distilled three parts that are
involved with applications processing XML:

1. The raw XML string.
2. Data models of the raw XML string.
3. Applications that process the XML.

Let's consider each of these in turn:

(1)  The raw XML string.  For example:

<?xml version="1.0" encoding="UTF-8"?>
<aircraft>
    <altitude>12000</altitude>
</aircraft>

In this form all we have is string data - 12000 is just a string.  When parsed, an
undecorated parse tree is created (or, equivalently, a SAX event stream produces a
sequence of string tokens).
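
To make the "12000 is just a string" point concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree module (my choice for illustration, not something from the thread).  It parses the document above with no data model involved and looks at what the parser hands back:

```python
import xml.etree.ElementTree as ET

# The raw XML string from the example above, as bytes.
RAW_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<aircraft>
    <altitude>12000</altitude>
</aircraft>"""

# Parsing without any schema yields an undecorated tree.
root = ET.fromstring(RAW_XML)
value = root.find("altitude").text

# The parser makes no claim that this is a number; it is character data.
print(type(value).__name__, repr(value))
```

Any interpretation of the value as an integer has to come from somewhere else, i.e., from a data model or from the application.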

(2)  Data models of the raw XML string.  There are many ways to model the XML.
Some sample data models are:

An XML Schema Data Model of the aircraft:

- the XML Schema data model
      . declares that the aircraft element is comprised of an altitude element.
      . The altitude element is comprised of an integer that is restricted to the
        range 0-20000.

An RDF Schema Data Model of the aircraft:

- the RDF Schema data model
      . aliases aircraft and plane,
      . states that aircraft is a subclass of "FlyingMachine", and
      . constrains altitude to an integer that is restricted to the range 0-20000

Some things to note about data models:

a. A data model provides data about the data in the XML string, i.e., metadata.

Example: The above two data models provide this metadata:
      - the value of the altitude element represents an integer, whose value
        should be in the range 0-20000,
      - the terms "aircraft" and "plane" are synonymous, and
      - the aircraft is an instance of a "FlyingMachine".

Another Example: Consider a library.  Each book in the library represents "data".
The card catalogue "models" all the data (books) in the library.  The card
catalogue provides data about each book in the library.  Thus, the card catalogue
"data model" provides metadata.

b. Data models may be useful to applications.

Example.  A Purchase Order schema (data model) specifies the valid format for a
PO.  An application may use the Purchase Order schema to validate a PO XML string.

c. An XML string may be explicitly bound to a data model or not.

Example of an XML string that is explicitly bound to a data model:

<?xml version="1.0" encoding="UTF-8"?>
<aircraft xmlns="http://www.FAA.org"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xsi:schemaLocation=
                          "http://www.FAA.org
                           aircraft.xsd">
    <altitude>12000</altitude>
</aircraft>

Here the XML string has been explicitly bound to an XML Schema data model -
aircraft.xsd.

Example of an XML string that is not bound to a data model:

<?xml version="1.0" encoding="UTF-8"?>
<aircraft>
    <altitude>12000</altitude>
</aircraft>

Here the XML string makes no association to a data model.
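
An application can tell the two cases apart by looking for the xsi:schemaLocation attribute on the root element.  The following is a rough sketch in Python (standard library only); the function name schema_hints is my own invention, and the sketch only reads the hint, it does not fetch or apply the schema:

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

BOUND = b"""<aircraft xmlns="http://www.FAA.org"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.FAA.org aircraft.xsd">
    <altitude>12000</altitude>
</aircraft>"""

UNBOUND = b"<aircraft><altitude>12000</altitude></aircraft>"

def schema_hints(xml_bytes):
    """Return the (namespace, location) pairs named in xsi:schemaLocation."""
    root = ET.fromstring(xml_bytes)
    # ElementTree spells a namespaced attribute as "{namespace-uri}localname".
    hint = root.get(f"{{{XSI}}}schemaLocation", "")
    tokens = hint.split()
    return list(zip(tokens[0::2], tokens[1::2]))

print(schema_hints(BOUND))    # one pair: the FAA namespace and aircraft.xsd
print(schema_hints(UNBOUND))  # empty: the document names no data model
```

Note that the attribute is only a hint carried by the instance document; an application remains free to ignore it.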

d. An application may dynamically bind an XML string to a data model.

For example, with Xerces you can dynamically associate an XML Schema (data model)
with an XML string by specifying a pair of values: { targetNamespace, URL-to-schema }.
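
The same idea can be sketched without a schema processor.  The Python below is not Xerces and not XML Schema: the dictionaries MODEL_V1 and MODEL_V2 are hypothetical stand-ins for a data model (MODEL_V2 and its 60000 limit are invented purely for illustration), and violations is a made-up helper.  The point is only that the application picks the model at processing time, while the XML string itself names none:

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-ins for schemas: element name -> (type, minimum, maximum).
MODEL_V1 = {"altitude": (int, 0, 20000)}
MODEL_V2 = {"altitude": (int, 0, 60000)}  # an invented later revision

def violations(xml_bytes, model):
    """Parse xml_bytes and list every way it breaks the chosen model."""
    found = []
    for elem in ET.fromstring(xml_bytes).iter():
        if elem.tag not in model:
            continue  # the model says nothing about this element
        cast, low, high = model[elem.tag]
        try:
            value = cast(elem.text)
        except (TypeError, ValueError):
            found.append(f"{elem.tag}: {elem.text!r} is not a valid {cast.__name__}")
            continue
        if not low <= value <= high:
            found.append(f"{elem.tag}: {value} is outside {low}-{high}")
    return found

DOC = b"<aircraft><altitude>12000</altitude></aircraft>"
HIGH = b"<aircraft><altitude>45000</altitude></aircraft>"

print(violations(DOC, MODEL_V1))   # the model is supplied by the caller
print(violations(HIGH, MODEL_V1))  # same data, judged against the old model
print(violations(HIGH, MODEL_V2))  # same data, judged against the new model
```

Swapping MODEL_V1 for MODEL_V2 requires no change to the documents, which is the sense in which the binding is dynamic.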

e. Over time the data model may change.

Example.  Consider the library example above.  Over time the card catalogue (i.e.,
the data model) may change:
                     - it may be changed from paper form to electronic form, or
                     - the data provided for each book may change.  For example, we
                       may change the filing system from Dewey Decimal to Library of
                       Congress.

LESSON LEARNED: data models are "disposable".

Notice that the books in the library (i.e., the data) endure over time, while the
card catalogue (i.e., the data model) changes.

LESSON LEARNED: Data (i.e., the XML string) endures over time.

f. Note that when a data model is "applied" to the parse tree from (1) then the
parse tree is "decorated" with things such as:

- the aircraft node is decorated with an alias "plane"

- the aircraft node is decorated with a "subclass of FlyingMachine"

- the altitude node is decorated with datatype="integer" restriction (0, 20000)
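
One way to picture "decoration" is as a side table that maps each node of the parse tree to the metadata a model supplies for it.  This is a sketch under my own assumptions, in Python: the MODEL dictionary merges the XML Schema and RDF Schema facts listed above into a hypothetical lookup table, and decorate is an invented helper, not the API of any schema or RDF processor:

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(b"<aircraft><altitude>12000</altitude></aircraft>")

# Hypothetical metadata taken from the two data models described above.
MODEL = {
    "aircraft": {"alias": "plane", "subclass_of": "FlyingMachine"},
    "altitude": {"datatype": "integer", "range": (0, 20000)},
}

def decorate(tree_root, model):
    """Map each element the model knows about to its metadata.

    The tree is left untouched; the decorations live beside it.
    """
    return {elem: model[elem.tag] for elem in tree_root.iter() if elem.tag in model}

decorations = decorate(root, MODEL)
altitude = root.find("altitude")

print(decorations[root])
print(decorations[altitude])
print(repr(altitude.text))  # the underlying data is still the same string
```

Keeping the decorations outside the tree means they can be thrown away, or rebuilt from a different model, without touching the data.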

(3) Applications process the XML.  Applications should be free to use just the
undecorated parse tree (that is, the parse tree resulting from parsing the XML
string that has no association with a data model). Several examples were presented
to demonstrate the value of an application utilizing the undecorated parse tree.
Alternatively, applications may choose to use the decorated parse tree (that is,
the parse tree that results from "applying" a data model).

LESSON LEARNED: Applications process the data.  Applications may or may not use the
data models.  Applications come and go.  Applications should be considered
"disposable".

LESSON LEARNED: Don't hardcode a binding between the XML string and a data model.
The binding should be done dynamically, and as desired.

This discussion all started with Sean's message about keeping the XML string free
from ties to a data model, and being able to do processing in a pipeline fashion.
Note that the above supports Sean's pipelining methodology.

Comments?  /Roger






 


Copyright 2001 XML.org. This site is hosted by OASIS