xml-dev - Re: XML Schemas: Best Practices

From: "Roger L. Costello" <costello@mitre.org>
To: "John F. Schlesinger" <johns@syscore.com>
Date: Sat, 23 Sep 2000 14:31:36 -0400
I don't believe that this is correct, John.  Let's consider my new
(bug-free) version of Camera.xml:
(see http://www.xfront.com/BestPractices.html)

     <?xml version="1.0"?>
     <my:camera xmlns:my="http://www.camera.org"
              xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance"
              xsi:schemaLocation=
                        "http://www.camera.org
                         Camera.xsd">
         <body>
             <description>Ergonomically designed casing for 
                          easy handling</description>
         </body>
         <lens>
             <zoom>300mm</zoom>
             <f-stop>1.2</f-stop>
         </lens>
         <manual_adapter>
             <speed>1/10,000 sec to 100 sec</speed>
         </manual_adapter>
     </my:camera>

If I were to use a default namespace declaration, as you suggest:

     <?xml version="1.0"?>
     <camera xmlns="http://www.camera.org"
              xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance"
              xsi:schemaLocation=
                        "http://www.camera.org
                         Camera.xsd">
         <body>
             <description>Ergonomically designed casing for 
                          easy handling</description>
         </body>
         <lens>
             <zoom>300mm</zoom>
             <f-stop>1.2</f-stop>
         </lens>
         <manual_adapter>
             <speed>1/10,000 sec to 100 sec</speed>
         </manual_adapter>
     </camera>

then my instance document is asserting that all the elements belong to
the namespace: http://www.camera.org.  If you look at the schema you
will see that description, zoom, f-stop, and speed all belong to
different namespaces (the camera schema imports elements from three
different namespaces).  Thus, a default namespace would not work in this
case.  Agree?  /Roger
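
One way to check this is to ask a namespace-aware parser which expanded name each element ends up with. Below is a minimal sketch using Python's standard xml.etree.ElementTree module. The two documents are trimmed versions of the instances above, and the helper name expanded_names is only illustrative. Since the camera schema sets elementFormDefault="unqualified", a validator would presumably expect the local children in no namespace, which is what the prefixed version yields and the default-namespace version does not.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the prefixed instance: only the root is qualified.
PREFIXED = """<my:camera xmlns:my="http://www.camera.org">
    <body><description>Ergonomically designed casing</description></body>
    <lens><zoom>300mm</zoom><f-stop>1.2</f-stop></lens>
</my:camera>"""

# Same content, but with a default namespace declaration on the root.
DEFAULTED = """<camera xmlns="http://www.camera.org">
    <body><description>Ergonomically designed casing</description></body>
    <lens><zoom>300mm</zoom><f-stop>1.2</f-stop></lens>
</camera>"""


def expanded_names(xml_text):
    """Return the tag of every element in document order.

    ElementTree writes a namespaced tag as '{uri}local' and a tag in
    no namespace as plain 'local'.
    """
    return [el.tag for el in ET.fromstring(xml_text).iter()]


prefixed_names = expanded_names(PREFIXED)
defaulted_names = expanded_names(DEFAULTED)

print(prefixed_names)
print(defaulted_names)
```

In the prefixed document only camera carries the http://www.camera.org namespace. In the default-namespace document every descendant inherits it.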


"John F. Schlesinger" wrote:
> 
> If you want to hide namespaces in the instance, you can do a little better
> than
> the example shown by using the targetNamespace in the <schema> as the
> default namespace
> in the instance.
> 
> Roger's example becomes:
> <?xml version="1.0"?>
> <camera xmlns="http://www.camera.org" … >
>         <body>Ergonomically designed casing for easy handling</body>
>         <lens>300mm zoom, 1.2 f-stop</lens>
>         <manual_adaptor>1/10,000 sec to 100 sec</manual_adaptor>
> </camera>
> 
> Not a big change, but it eliminates the need even to prefix the document root
> element.
> 
> I found this useful for development teams that had understood XML but not
> namespaces and schemas.
> The idea was for the XML engineer to do the schema and for the development
> teams to produce valid
> instances or to validate other teams' instances using a schema validator.
> 
> Note that there is a difference between prefixing elements and qualifying
> them (which is why this
> works) which is one of the reasons for constructing schemas so that the
> developers don't have to
> wrestle with the difference.
> 
> If there are name collisions, namespaces have to be used.
> 
> Yours,
> John
> 212 319 5327 Home
> 917 886 5895 Mobile
> 
> -----Original Message-----
> From: Roger L. Costello [mailto:costello@mitre.org]
> Sent: Friday, September 22, 2000 10:45 AM
> To: xml-dev@lists.xml.org
> Cc: costello@mitre.org; Cokus,Michael S.; Pulvermacher,Mary K.;
> Heller,Mark J.; JohnSc@crossgain.com; Ripley,Michael W.
> Subject: Re: XML Schemas: Best Practices
> 
> Hi Folks,
> 
> I am delighted to see the responses to my last message.  Clearly people
> are thinking about this issue and have strong feelings about hiding
> namespace complexities in the schema versus making namespaces explicit
> in instance documents.  This is good!  Now let's see if we can distill
> out some general guidelines on when to hide and when to make explicit.
> 
> Based upon some of the responses I can see that I did not do a very
> satisfactory job in motivating when you would want to hide the namespace
> complexities.  So let's quickly address that again, and then move on to
> guidelines for when it is desirable to make namespaces explicit in
> instance documents.
> 
> Recall the camera example that was presented.  By designing the schema
> so that body, lens, and manual_adaptor are children of camera (i.e.,
> local elements), and by setting elementFormDefault="unqualified" we
> enable the creation of a class of instance documents that are pretty
> straightforward to read and write.  An example of one instance document
> was presented:
> 
> <?xml version="1.0"?>
> <my:camera xmlns:my="http://www.camera.org" … >
>         <body>Ergonomically designed casing for easy handling</body>
>         <lens>300mm zoom, 1.2 f-stop</lens>
>         <manual_adaptor>1/10,000 sec to 100 sec</manual_adaptor>
> </my:camera>
> 
> Recall that the schema imported the declaration of the body element from
> the nikon schema, the lens element from the olympia schema, and the
> manual_adaptor element from the pentex schema.  Looking at the instance
> document above one would never realize this.  Such complexities are
> localized to the schema. Thus, we say that the schema has been designed
> in such a fashion that its complexities are "hidden" from the instance
> document.
> 
> Several people responded to this design approach arguing that they
> believe that it is good and perhaps necessary to qualify body, lens, and
> manual_adaptor.  Below I show the instance document with all elements
> qualified with a namespace:
> 
> <?xml version="1.0"?>
> <my:camera xmlns:my="http://www.camera.org"
>               xmlns:nikon="http://www.nokia.com"
>               xmlns:olympia="http://www.olympia.com"
>               xmlns:pentex="http://www.pentex.com" …>
>         <nikon:body>Ergonomically designed casing for easy
>                     handling</nikon:body>
>         <olympia:lens>300mm zoom, 1.2 f-stop</olympia:lens>
>         <pentex:manual_adaptor>1/10,000 sec to
>                      100 sec</pentex:manual_adaptor>
> </my:camera>
> 
> This instance document makes explicit that the body element comes from
> the nikon namespace, the lens element comes from the olympia namespace,
> and the manual_adaptor element comes from the pentex namespace.
> 
> Thus, we come to two fundamental questions:
> 
> [1] When does it make sense to design a schema to hide the namespace
> complexities from instance documents?
> 
> [2] When does it make sense to design a schema to force instance
> documents to make explicit the namespaces of their elements?
> 
> The latter question will be answered in the next section.  For now, let's
> try to characterize the systems for which it makes sense to hide the
> namespace complexities in the schema.
> 
> As I compare the two versions of the instance documents above the first
> thing that strikes me is the difference in readability.  The first
> version is much easier to read.  The namespaces in the second version -
> both the namespace declarations and the qualifiers on each element - are
> very confusing to an average fellow like myself.
> 
> So, I come to the first characteristic:
> 
> "For systems where readability is of utmost importance, design the schema
> to hide the namespace complexities."
> 
> I can well imagine writing an application to process the camera instance
> document such that it (the application) does not care what namespace the
> body element comes from, what namespace the lens element comes from, or
> what namespace the manual_adaptor element comes from.  Such complexities
> are irrelevant to the application.  The application just cares that the
> camera element contains a body element with the proper type of data, a
> lens element with the proper type of data, and a manual_adaptor element
> with the proper type of data.  Knowledge of the namespaces that the body,
> lens, manual_adaptor elements belong to provides no additional
> information to the application. At the very best, the namespaces are a
> distraction to the application. If at some point the application does
> find it necessary to know what namespace an element is associated with
> then it will simply look it up in the schema.
> 
> This brings me to the second characteristic:
> 
> "For systems where knowledge of the namespaces of the elements provides
> no additional information, design the schema to hide the namespace
> complexities."
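
To make that second characteristic concrete, here is a small sketch of an application that reads the camera instance by local name only and never looks at namespaces. It uses Python's standard xml.etree.ElementTree module. The helper names local_name and camera_parts are assumptions made up for illustration and are not part of any schema tooling.

```python
import xml.etree.ElementTree as ET

# The "hidden namespaces" instance from above, without the elided attributes.
CAMERA = """<my:camera xmlns:my="http://www.camera.org">
    <body>Ergonomically designed casing for easy handling</body>
    <lens>300mm zoom, 1.2 f-stop</lens>
    <manual_adaptor>1/10,000 sec to 100 sec</manual_adaptor>
</my:camera>"""


def local_name(tag):
    """Strip a leading '{namespace}' from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def camera_parts(xml_text):
    """Map each child's local name to its text, ignoring namespaces."""
    root = ET.fromstring(xml_text)
    return {local_name(child.tag): child.text.strip() for child in root}


parts = camera_parts(CAMERA)
print(parts["lens"])
```

Because the lookup goes through local_name, the same function would also accept the fully qualified version of the instance. That is the sense in which the namespaces add nothing for this kind of application.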
> 
> Those are the two characteristics that I see.  Do you see any further
> characterizing features?
> 
> Before moving on to when it makes sense to make the namespaces explicit
> in instance documents, I would like to pause and address Richard
> Lanyon's concern.  Richard's concern is (paraphrasing):
> 
> "Okay Roger, let's suppose that it makes sense to localize the
> complexities to the schema.  An author of an instance document will
> still have to read the schema, and understand it, to write the instance
> document.  Correct?  How have we hidden the complexities of the
> schema?"
> 
> Let me see if I can address this concern satisfactorily:
> 
> [1] An instance document is written once but processed by many systems
> (write once, read many).  All those systems which process the document
> are shielded from the complexities of the schema.
> 
> [2] In the not-too-distant future there will be tools that read schema
> and provide a template for the instance document author to fill in.  The
> tool will understand the schema and shield the author from needing to
> understand the schema.
> 
> I hope that answers your concern satisfactorily Richard.  If anyone else
> has anything to add to this please join in.
> 
> Now let's move on to characterizing those systems for which it makes
> sense to design a schema to force instance documents to make explicit the
> namespaces of their elements.
> 
> First recall the techniques a schema uses to force instance documents to
> expose the namespaces of their elements.
> 
> [1] Use elementFormDefault="qualified" to Force the Use of Namespace
> Qualifiers
> 
> Len Bullard sketched out a schema for a 3D rendering system.  Let me
> refer to that as the "video-game" schema.  Let's see how to design that
> schema so that it forces instance documents to use namespace qualifiers
> on its elements:
> 
> <?xml version="1.0"?>
> <schema xmlns="http://www.w3.org/1999/XMLSchema"
>         targetNamespace="http://www.video-game.org"
>         elementFormDefault="qualified"
>         xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance"
>         xsi:schemaLocation=
>                         "http://www.w3.org/1999/XMLSchema
>                          http://www.w3.org/1999/XMLSchema.xsd"
>         xmlns:design-works="http://www.design-works.com"
>         xmlns:disney="http://www.disney.com"
>         xmlns:mci="http://www.mci.com">
>     <import namespace="http://www.design-works.com"
>                   schemaLocation="DesignWorks.xsd"/>
>     <import namespace="http://www.disney.com"
>                   schemaLocation="Disney.xsd"/>
>     <import namespace="http://www.mci.com"
>                   schemaLocation="MCI.xsd"/>
>     <element name="video-game">
>         <complexType>
>             <sequence>
>                 <element ref="design-works:geometry" minOccurs="1"
>                                 maxOccurs="1"/>
>                 <element ref="design-works:lighting" minOccurs="1"
>                                 maxOccurs="1"/>
>                 <element ref="disney:character" minOccurs="1"
>                                 maxOccurs="1"/>
>                 <element ref="mci:voice" minOccurs="1"
>                                 maxOccurs="1"/>
>             </sequence>
>         </complexType>
>     </element>
> </schema>
> 
> The most important part of this schema is that elementFormDefault=
> "qualified".  That attribute forces instance documents to qualify all
> elements:
> 
> <?xml version="1.0"?>
> <video-game xmlns="http://www.video-game.org"
>           xmlns:design-works="http://www.design-works.com"
>           xmlns:disney="http://www.disney.com"
>           xmlns:mci="http://www.mci.com"
>           xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance"
>           xsi:schemaLocation="http://www.video-game.org VideoGame.xsd">
>   <design-works:geometry> 24m x 71m</design-works:geometry>
>   <design-works:lighting>Shadow in foreground, light in back</design-works:lighting>
>   <mci:voice>Digitized voice</mci:voice>
> </video-game>
> 
> [2] Declare Elements Globally to Force the Use of Namespace Qualifiers
> 
> Global elements must be qualified in instance documents regardless of
> whether elementFormDefault has the value of "qualified" or
> "unqualified".  Thus, we could reorganize the above schema to make all
> the elements global.  [Interestingly, for the video-game schema I don't
> see how to make geometry, lighting, and voice global.  Any thoughts?]
> 
> Now it is time to answer the question: what characterizes systems for
> which it makes sense to design the schema so that instance documents are
> forced to display the namespaces for each element?
> 
> One quick answer is:
> 
> "For systems where knowledge of the namespaces DOES provide additional
> information, design the schema to force exposure of namespaces in
> instance documents."
> 
> However, this leaves me a bit empty.  When does "knowledge of the
> namespaces provide additional information"?  That is the question which
> must be answered.
> 
> Suppose that an application will process the geometry element
> differently if it's associated with design-works versus some other
> namespace.  I could imagine for marketing purposes such preferential
> treatment may occur.  When else?  What are your thoughts on this?
> 
> Clearly namespaces are great for dealing with name collisions.  In the
> video-game example I don't have multiple elements with the same name.
> If I did, however, and they came from different namespaces then it is
> easy to imagine that we would want to design the schema to force
> instance documents to expose the namespaces so that applications could
> easily distinguish the elements.
> 
> Let's try rephrasing the above characterization given this new
> information:
> 
> "For systems where knowledge of the namespaces does provide additional
> information, design the schema to force exposure of namespaces in
> instance documents.  Knowledge of namespaces may enable applications
> to:
> - perform namespace-dependent processing, and
> - distinguish between elements with the same name."
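
Both points can be sketched with a short program that keys its processing on the namespace URI rather than on the element name alone. This uses Python's standard xml.etree.ElementTree module. The scene document, with two lighting elements from different vendors, is a hypothetical instance invented for this illustration and is not taken from the video-game schema. The helper names are likewise assumptions.

```python
import xml.etree.ElementTree as ET

DESIGN_WORKS = "http://www.design-works.com"
DISNEY = "http://www.disney.com"

# Hypothetical instance: two 'lighting' elements that collide by name
# but come from different namespaces.
SCENE = f"""<scene xmlns:dw="{DESIGN_WORKS}" xmlns:disney="{DISNEY}">
    <dw:lighting>Shadow in foreground</dw:lighting>
    <disney:lighting>Light in back</disney:lighting>
</scene>"""


def split_tag(tag):
    """Split an ElementTree tag into (namespace URI or None, local name)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return None, tag


def lighting_by_vendor(xml_text):
    """Group the text of child 'lighting' elements by namespace URI."""
    result = {}
    for el in ET.fromstring(xml_text):
        uri, local = split_tag(el.tag)
        if local == "lighting":
            result.setdefault(uri, []).append(el.text)
    return result


by_vendor = lighting_by_vendor(SCENE)
print(by_vendor[DESIGN_WORKS])
print(by_vendor[DISNEY])
```

If the instance hid the namespaces, both elements would arrive as plain lighting and the application would have to consult the schema to tell them apart.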
> 
> Okay, that's enough for now.  Your turn.  What are your thoughts on any
> of this?  What guidelines would you provide someone who asks you:
> "Should I design my schema to hide the namespace complexities, or should
> I design it to force instance documents to expose the namespaces of its
> elements?"
> 
> /Roger