xml-dev - Re: [xml-dev] Schema Extensibility

To: gkholman@CraneSoftwrights.com, xml-dev@lists.xml.org
Subject: Re: [xml-dev] Schema Extensibility
From: "Fraser Goffin" <goffinf@hotmail.com>
Date: Wed, 01 Mar 2006 14:49:02 +0000
In-reply-to: <7.0.1.0.2.20060301071528.01361ec0@CraneSoftwrights.com>
Wow, thanks Ken, I hadn't come across NVDL before.

I would like to give the XSD example a whirl on our schema too, so if you 
get it going do please let me know.

As always there is the concern about availability of mainstream 
implementations on Windows and non-Windows platforms (we mostly use these in 
production). You appear optimistic about this?

Fraser.

>From: "G. Ken Holman" <gkholman@CraneSoftwrights.com>
>To: xml-dev@lists.xml.org
>Subject: Re: [xml-dev] Schema Extensibility
>Date: Wed, 01 Mar 2006 08:07:13 -0500
>
>At 2006-03-01 11:47 +0000, Fraser Goffin wrote:
>>Personally I have been relating these comments to XML schema structures 
>>but I could have easily been talking about the service interface supported 
>>by those schema. This has highlighted some different opinions about the 
>>value of various approaches to this problem which I hope have resonated 
>>with those following the thread.
>>...
>>I guess the thing that I am still mostly undecided about is to do with 
>>whether to allow for schema extensibility (using xs:any together with the 
>>'sentry' approach proposed by David Orchard and others) or whether this 
>>is a recipe for an uncontrollable vocabulary.
>
>I think the latter.
>
>>I think the battle-ground is in part characterised by a schema (or 
>>service) that, once published is considered as immutable, hence any 
>>changes REQUIRE a NEW VERSION with a NEW NAMESPACE, versus a schema which 
>>allows non breaking changes to be introduced by both the schema owner and 
>>non schema authors and supports both forward and backwards compatibility.
>
>I feel there are manageable ways to accommodate changing namespaces in 
>stylesheet libraries and other downstream processes.  Namespace URI strings 
>are, after all, just strings, and both XML (with entities) and programming 
>languages have imaginative ways to work with strings.
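
To make the "just strings" point concrete, here is a minimal Python sketch of downstream code that takes the namespace URI as a parameter, so that a new version with a new namespace needs only one more entry in a list. The two URIs and the Order/ID element names are hypothetical, chosen for illustration; they are not from any published schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs for two versions of one vocabulary.
ORDER_NS_V1 = "urn:example:order:1"
ORDER_NS_V2 = "urn:example:order:2"

def order_ids(xml_text, ns_uri):
    """Return the text of every ID child of an Order root in namespace ns_uri."""
    root = ET.fromstring(xml_text)
    if root.tag != f"{{{ns_uri}}}Order":
        return []
    return [e.text for e in root.findall(f"{{{ns_uri}}}ID")]

def order_ids_any_version(xml_text, known=(ORDER_NS_V1, ORDER_NS_V2)):
    """Try each known namespace in turn; the extraction logic is written once."""
    for ns_uri in known:
        ids = order_ids(xml_text, ns_uri)
        if ids:
            return ns_uri, ids
    return None, []

doc_v2 = f'<o:Order xmlns:o="{ORDER_NS_V2}"><o:ID>A-17</o:ID></o:Order>'
print(order_ids_any_version(doc_v2))
```

This only works while the structures in both versions are close enough to share the logic; where processing really does differ, the namespace tells the code which branch to take.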
>
>Namespaces provide disambiguation and global type identity (labels) ... if 
>the processing of an information item changes, or the processing of a 
>collection of information items in a vocabulary changes, then using a new 
>namespace unambiguously indicates that there is something different than 
>before.
>
>>The first situation is a 'clean' and explicit model where the semantics 
>>are guaranteed not to be usurped by a non schema owner but where even 
>>relatively minor change requirements can have a large impact to 
>>implementations (especially when there are a large number of external 
>>users of this vocabulary).
>
>Indeed.
>
>>The schema extensibility approach promotes the idea that organisations may 
>>want to represent private relationships using data carried at specified 
>>points within the standard schema in such a way that that data is only 
>>relevant between those parties (using a foreign namespace) and all others 
>>can safely ignore it (and that the schema author should not necessarily 
>>attempt to constrain this type of usage).
>
>A very important issue, and one that needs to be addressed in UBL.
>
>>Some TP extensions may be incorporated back into the main body of the 
>>standard at a later point in which case any pair of parties using that 
>>extension can agree a move back to the standard definition, at a time of 
>>their choosing.
>
>But in the meantime trading partners can continue to use the sacrosanct 
>structures and just embed in them richness that is important to them, 
>provided that the standardized structures (possibly redundantly) carry the 
>aggregate information.  This allows compliant applications unaware of the 
>embedded richness to still "do their thing" with the recognized 
>standardized constructs.
>
>>It also allows the schema owner to add non-breaking 'compatible' changes to 
>>a schema. The downsides seem to be that a TP could introduce changes 
>>which subvert the intended semantics, and that, over time, what might have 
>>started out as a temporary expedient, turns into an entrenched working 
>>implementation that is unlikely to be allocated budget to be 
>>re-synchronised with the standard.
>
>I'm not so sure that embedding foreign information items into a given 
>structure would necessarily change the semantics of the information in that 
>structure.
>
>I've long held that semantics are in the eye of the consuming process and 
>that information *means* only what the recipient wants it to mean.  Of 
>course reliable discourse happens when the recipient interprets it to mean 
>what the sender intended, but the recipient can also choose to interpret it 
>any way they want for their own purpose.  Therefore, say with UBL, if I 
>have a UBL processing application that understands the meaning of the 
>information labelled according to the labels published by the UBL TC, no 
>amount of embedded foreign information is going to impact my semantic 
>interpretation of what the committee intended.
>
>It does put the burden on the sender, though, not to ignore the semantics 
>represented by the labels chosen by the TC, so it would benefit the sender 
>to respect the vocabulary labels and populate the structures with 
>meaningful information to a downstream conformant UBL processor.  But a 
>trading partner who understands the foreign information will suddenly have 
>the additional information available to them because they will have a 
>semantic understanding of the information found with the foreign labels.
>
>XML doesn't "do" semantics; I believe it just labels the information in the 
>structures with rich, globally-unique, namespace-based labels to effect 
>interchange without ambiguously losing the labeling of the information.  
>How the sender and receiver trading partners interpret the semantics of the 
>information at those labels is their business, and their business will 
>flourish if they have the same understanding.  XML won't give them that 
>magic understanding.
>
>>So, in part the question is, should a schema allow for unknown extensions 
>>for unknown purposes (but in specified locations) and still be considered 
>>as 'compliant', or should schema authors attempt to constrain (eliminate) 
>>that behaviour.
>
>Neither, I believe.
>
>>I can't help feeling the attraction of the second model, but my 'gut' 
>>tells me that something as inflexible will soon become a business 
>>constraint and that will signal its demise.
>
>Extensions are, I believe, out of scope of the original vocabulary, and 
>therefore, "none of the business" of the original vocabulary and "not even 
>a worry" to the original vocabulary (or its creators!).
>
>It happens that last night I expounded on this very point to the UBL 
>committee in order to present how I believe trading partner extensions to 
>UBL can be easily accommodated *by doing nothing* within the UBL 
>structures:
>
>   http://lists.oasis-open.org/archives/ubl/200602/msg00117.html
>
>In that posting I present the scenario that the UBL TC has standardized 
>what an Order is, but that two trading partners in the aerospace industry 
>need to augment the Order with richness important to them, yet they don't 
>want to violate UBL or be considered non-compliant.  I posit that the 
>aerospace industry *can do anything they want* to augment a UBL Order and 
>they will *still be UBL compliant* if they use the UBL Order information 
>compliant with the semantics attached to those labels by the UBL Technical 
>Committee (and published in five languages so far) in the instance they 
>exchange.
>
>I demonstrate how NVDL can be used for just this purpose, *without making a 
>single change to the read-only UBL document models as published* and I come 
>to the conclusion:
>
>At 2006-02-28 21:57 -0500, G. Ken Holman wrote:
>>I see the basic premise as:
>>
>>  - the UBL information in an order instance has to conform to the 
>>sacrosanct, read-only document models created by the UBL Technical 
>>Committee;
>>  - at the least, a user of augmented orders must fill in the UBL fields 
>>so that recipients who do not recognize the augmentations can ignore them 
>>because the fields they do recognize they know what to do with;
>>  - users who choose to recognize the embedded augmentations can do what 
>>they wish with them; just the act of having them doesn't "disturb" the UBL 
>>information in the instance.
>>
>>This is a different way than the traditional way of looking at document 
>>validation where you have to have the one model of everything in the 
>>instance, but it really isn't foreign.  Consider that you have an XHTML 
>>document ... if you choose to embed an SVG image in the middle of the 
>>document, you still really do have an XHTML document just with something 
>>inside.  Why burden the XHTML document model with knowledge of SVG details 
>>in order to do validation?  With namespace-based validation dispatching, 
>>the detection of SVG in an XHTML document can trigger the validation of 
>>the SVG component with the SVG model, while making the SVG invisible to 
>>the validation of the wrapping XHTML from the XHTML model.
>>
>>So as trading partners we don't "validate an instance as valid UBL"; 
>>instead we "validate the UBL information in an instance as valid UBL", as 
>>well as checking whatever other information we might also have in our 
>>instance that is important to our exchange.
>
>That last paragraph is the important change in perspective that I'm trying 
>to bring to light to the committee.  There are some who still hold with a 
>traditional view that the entire instance *has a model*, rather than the 
>different view that sets of labeled information found in an instance *each 
>have their own model* (and when there is only one model then the model is 
>for the entire instance, but that is just an edge case; granted one that 
>we've been using all along for markup).  And those sets are identified 
>unambiguously through the use of namespace-rich labels.
>
>Accommodating "the entire XML instance has a model" is, I believe, more 
>difficult, time-consuming and frustrating than accommodating "each set of 
>information found in an XML instance has its own model".
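
As I understand the idea, "each set of information has its own model" amounts to partitioning the instance by namespace before validating. The Python sketch below is not NVDL and does no validation; it only illustrates the partitioning step, under the assumption that everything outside one "home" namespace counts as foreign. The namespace URIs and element names are hypothetical.

```python
import copy
import xml.etree.ElementTree as ET

def namespace_of(elem):
    """Return the namespace URI of an element's tag ('' if unqualified)."""
    tag = elem.tag
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""

def split_by_namespace(root, home_ns):
    """Return (home_view, foreign): a copy of the tree with foreign-namespace
    subtrees pruned, plus the pruned subtrees grouped by namespace URI."""
    home_view = copy.deepcopy(root)
    foreign = {}

    def prune(parent):
        for child in list(parent):
            ns = namespace_of(child)
            if ns != home_ns:
                # Detach the whole foreign subtree and remember it.
                parent.remove(child)
                foreign.setdefault(ns, []).append(child)
            else:
                prune(child)

    prune(home_view)
    return home_view, foreign

UBL_NS = "urn:example:ubl-order"   # hypothetical, not the real UBL namespace
AERO_NS = "urn:example:aerospace"  # hypothetical trading-partner namespace

doc = f"""<Order xmlns="{UBL_NS}" xmlns:a="{AERO_NS}">
  <ID>42</ID>
  <a:FlightCert><a:Level>3</a:Level></a:FlightCert>
</Order>"""

home, foreign = split_by_namespace(ET.fromstring(doc), UBL_NS)
print([child.tag for child in home])   # only home-namespace children remain
print(list(foreign))                   # namespaces of the detached subtrees
```

The pruned `home` tree could then be handed to a validator for the standard model, and each group in `foreign` to whatever model the trading partners have agreed for that namespace.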
>
>At 2006-03-01 11:47 +0000, Fraser Goffin wrote:
>>With my SOA hat on I would recognise the importance of interoperability 
>>and the significant role that standardised vocabularies have to play.
>
>Great!  Standardized vocabularies give us the labels with which we can 
>identify the information unambiguously so that we hopefully apply the 
>understood published semantics against the so-labeled information.
>
>>I also don't especially want to promote the myriad of point-to-point 
>>relationships that 'going private' implies and instead want to leverage 
>>the 'reach' of a market standard.
>
>All power to you!  And I believe it can be done *arbitrarily* between 
>trading partners without impacting the integrity of the standardized 
>vocabulary.
>
>>Personally I still have no definitive conclusion that I feel comfortable 
>>in turning into a recommended approach within my own organisation and 
>>within the industry standards body that I work with from time to time, so 
>>I thought I'd give it one more go.
>
>I've come to the conclusion that the technology standards are in place and 
>the tools are coming with which these problems are addressed. Committees 
>like the UBL TC can go on their way doing their own thing, standardizing a 
>set of labels and understood semantics as a platform. Anyone wishing to 
>augment those labels with their own, representing their own semantic 
>concepts, can then do so without worry and without upsetting the 
>standards.
>
>Note that this is not an official position held by the UBL TC, as it was 
>only yesterday that I expounded on my ideas to the committee.  I cannot 
>represent the above as an official UBL description of its extensibility, 
>only as my input to the UBL discussion of extensibility.  When the TC comes 
>to a decision of how to accommodate extensibility in UBL structures, this 
>will be documented in detail to help UBL users.  I do gather there is some 
>resonance in the TC regarding my input, but I have also heard some 
>reluctance of "but that isn't how we've been thinking of doing it and we 
>need to do it this other way (e.g. W3C Schema ANY) like other projects".
>
>>Some of the issues and comments highlighted by the earlier thread are 
>>provided below. Some are direct quotes from contributors, others are 
>>excerpts from various ramblings :-)
>
>I feel that it is out of scope to have to think of how to accommodate the 
>arbitrary (and imaginative!) ways people might want to augment my 
>structures.  If it were left to me to make my structures extensible, 
>whatever way I chose would make someone unhappy because they wouldn't be 
>able to extend it the way they want.  By punting on the whole issue, I can 
>focus on my structures, I can prepare my processing applications to 
>accommodate (and ignore) the presence of foreign content, and go about my 
>business while others can augment what I do to meet their purposes provided 
>they don't violate my processing systems by ambiguously using the labels 
>I've published as my vocabulary.
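
On the consuming side, "accommodate (and ignore) the presence of foreign content" could look something like the Python sketch below: the application picks out only the labels it knows, matched on namespace plus local name, and walks past everything else. The vocabulary URI and the field names are hypothetical.

```python
import xml.etree.ElementTree as ET

MY_NS = "urn:example:my-vocabulary"  # hypothetical vocabulary namespace

def read_known_fields(xml_text, fields=("ID", "IssueDate")):
    """Collect the known top-level fields; anything else is silently skipped."""
    wanted = {f"{{{MY_NS}}}{name}": name for name in fields}
    result = {}
    for child in ET.fromstring(xml_text):
        name = wanted.get(child.tag)
        if name is not None:
            result[name] = (child.text or "").strip()
    return result

doc = (
    '<Order xmlns="urn:example:my-vocabulary" xmlns:x="urn:example:partner">'
    "<ID>7</ID><x:ID>partner-only</x:ID><x:Extra/>"
    "<IssueDate>2006-03-01</IssueDate></Order>"
)
print(read_known_fields(doc))
```

Because the match is on the full namespace-qualified name, the partner's `x:ID` cannot be mistaken for the vocabulary's own `ID`, which is the ambiguity the paragraph above warns against.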
>
>BTW, XSL-FO has done this since 2001 when it was published ... XSL-FO 
>processors accommodate the presence of foreign namespace labels in the 
>information structures and quietly ignore their content ... and because of 
>this I have done some very imaginative and fun things in UBL annotating 
>XSL-FO instances to synthesize XSLT stylesheets that then get processed for 
>production use (freely available from our web site for those interested).  
>The XSL-FO designers had no idea what I wanted to do to augment their work, 
>but I was able to do it without needing any extensibility structures built 
>into their vocabulary.  I demonstrated this use of namespaces years ago and 
>I have long thought that this benign accommodation of foreign content using 
>namespaces is the ultimate in extensibility.
>
>I hope this helps.
>
>. . . . . . . . . . . . . . . Ken
>
>--
>Upcoming XSLT/XSL-FO hands-on courses: Washington,DC 2006-03-13/17
>World-wide on-site corporate, govt. & user group XML/XSL training.
>G. Ken Holman                 mailto:gkholman@CraneSoftwrights.com
>Crane Softwrights Ltd.          http://www.CraneSoftwrights.com/x/
>Box 266, Kars, Ontario CANADA K0A-2E0    +1(613)489-0999 (F:-0995)
>Male Cancer Awareness Aug'05  http://www.CraneSoftwrights.com/x/bc
>Legal business disclaimers:  http://www.CraneSoftwrights.com/legal