Re: Single, Simple, Powerful Mechanism for Expressing XML Relationships
- From: "Andrew S. Townley" <ast@atownley.org>
- To: "Costello, Roger L." <costello@mitre.org>
- Date: Fri, 10 Dec 2010 03:44:49 +0000
Hi Roger,
On 9 Dec 2010, at 10:32 AM, Costello, Roger L. wrote:
> Andrew Townley wrote:
>
>> Here are some of the initial thoughts I had on how to apply
>> TMRM fundamentals to the problems being discussed on the
>> list about where to go with XML. ...
>
> Hi Andrew,
>
> This is excellent information. I am still working to fully grasp it all. Perhaps some concrete examples would be instructive.
Before I try and walk through specific examples, it occurred to me that what I'm really talking about is doing something similar to what Jef Poskanzer did with pbmplus to solve the Tower of Babel problem with respect to graphics formats in the '80s. The approach I'm suggesting is really targeted specifically at the data representation domain and not trying to tackle the whole information semantics issue. Let's leave that for another day... ;)
Reframing the discussion along these lines, what I'm talking about is expressing the structure of any data format in a common model, then allowing specific, concrete mappings (expressed in terms of legends) between various physical formats (XML, JSON, XSD, CSV, RDBMS, etc.) and this common model (TMRM). As such, any particular mapping could be created and overlaid on any other useful mapping, so that you can easily solve whatever data access issue stands between you and a higher-level problem. The plus side of this approach is that you can layer higher-level domain information in exactly the same manner on the same underlying model.
As I said before, I think it's totally possible for this approach to meet the various past, present and future needs being discussed as part of this thread.
What follows is some totally off-the-cuff thinking about how you could describe these mappings to address the situations you mention. These aren't particularly in my sweet spot of knowledge, because I generally avoid XSD whenever I can (preferring RNG when required), and I'm by no means an XSLT expert! These days I meet the requirements of those technologies using topic maps, and topic maps applications, so I'm a bit rusty... ;)
I'm hoping that people with better XSD/XSLT chops than me can understand what I'm talking about enough to provide more correct and meaningful examples. I'm going to use Ruby Hashes to represent the proxies mostly because Ruby has built-in support for symbols vs. strings. This is a distinction that I've found quite useful in my own work. In other languages, you'd have to invent a type or a string regex match.
The commentary below will also make more sense if you read from the bottom upwards, because the last example you give is pretty fundamental to the others, I think.
> In an earlier message I gave examples of the multiple ways that relationships are expressed in XML, XSLT, and XML Schema. Can we please take some of those examples and recast them using the concepts you described?
>
> First, let's start with an XML Schema example. XML Schemas makes frequent use of QNames to express relationships--one element is connected (related) to another element through a shared QName. In the below XML Schema snippet this Book element declaration:
>
> <element name="Book">
>
> is referenced from within BookStore:
>
> <?xml version="1.0"?>
> <schema xmlns="http://www.w3.org/2001/XMLSchema"
> targetNamespace="http://www.books.org"
> xmlns:bk="http://www.books.org"
> elementFormDefault="qualified">
>
> <element name="BookStore">
> <complexType>
> <sequence>
> <element ref="bk:Book" maxOccurs="unbounded"/>
> </sequence>
> </complexType>
> </element>
>
> <element name="Book">
> ...
>
> </schema>
>
> These two expressions:
>
> ref="bk:Book"
> name="Book"
>
> are connected by the shared QName:
>
> {http://www.books.org}Book
>
>
> How would this XML Schema example be recast to use the concepts you describe?
I would suggest that you're overloading what's really going on here and simplifying it to the common QName. XSD is a constraint language effectively, so what you're doing here is defining constraints on the relationship between two elements.
As constraints, the XSD above is applied to the subject map defining several proxies. This is what I meant by having your constraints be defined in terms of proxies. If you first convert the XSD into core element proxies, then you can apply fixed semantics (defined by your documentation and/or legend) and interpret the relationships between those proxies in the same way.
As I said before, it's just like pbmplus. The intermediate format (PxM or TMRM) is just a unified way to represent the information that already exists in a different representation. The same way you have the "same" image data if you convert a JPEG to PPM, you'll have the same set of relationships described by the XSD constraints, it's just described in a more universal manner.
Each of the XSD elements "mean" something specific. That meaning wouldn't be lost expressing them as TMRM proxies, but the way you express those relationships might be done using some standard proxy types that were more generic. Interpretation of the semantics is what's on the other end.
All you're really saying above is that if you want to create a parent-child relationship for proxies of type :BookStore, the type constraint defined would be applied so that you could ensure this was the case.
Normally, thanks to the path language, you really don't need to do this sort of thing in practice. Since you can apply the path expressions to proxies that return values or other proxies, if you encountered a proxy of type BookStore, then you could just ask for all of the players of a parent-child relationship where the specific :BookStore proxy instance played the role of :parent (or was the value of the key) and the type of the players of the :child role was :Book.
While you *could* constrain the relationships, the nature of what you're trying to do with subject maps doesn't require you to, because you can always navigate through the graph to select only those nodes you're interested in. For XML, it's like having all the advantages of RNG in terms of specifying variable element ordering, etc. If you care about books, you ask for them. Your legend says that there ought to be a parent-child relationship between :BookStore proxies and :Book proxies, so you ask for those--either in terms of all the :BookStore instances or in terms of only a specific one.
If it doesn't exist, then fine. If :BookStore proxies suddenly need to participate in other relationships, then that's just dandy too--your existing application won't even know they exist. Forwards and backwards compatibility without all the XSD pain and extension point gyrations. Whoohoo!! :)
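To make the "just ask for it" idea a bit more concrete, here is a rough Ruby sketch. Everything in it is an assumption for illustration only: the proxies are plain Hashes, and the :parent_child and :owned_by relationship types, the :id keys, and the players method are hypothetical names that come neither from the TMRM nor from the XSD above.

```ruby
# Hypothetical proxies: plain Hashes, in the style used elsewhere in this mail.
STORE    = { :isa => :BookStore, :id => :store1 }
BOOK_A   = { :isa => :Book,      :id => :book_a }
BOOK_B   = { :isa => :Book,      :id => :book_b }
MAGAZINE = { :isa => :Magazine,  :id => :mag1 }

# Relationships are proxies too; :parent and :child are the roles.
# The :owned_by entry stands in for a relationship added later on.
SUBJECT_MAP = [
  { :isa => :parent_child, :parent => STORE, :child => BOOK_A },
  { :isa => :parent_child, :parent => STORE, :child => MAGAZINE },
  { :isa => :parent_child, :parent => STORE, :child => BOOK_B },
  { :isa => :owned_by,     :owner  => :someone, :owned => STORE },
]

# Return the players of the :child role, restricted to one type, for every
# parent-child relationship in which the given proxy plays :parent.
def players(subject_map, parent, child_type)
  subject_map
    .select { |rel| rel[:isa] == :parent_child && rel[:parent].equal?(parent) }
    .map    { |rel| rel[:child] }
    .select { |child| child[:isa] == child_type }
end

books = players(SUBJECT_MAP, STORE, :Book)
```

Asking for a type that has no players simply returns an empty list, and the :owned_by relationship is never seen by this query, which is the compatibility point made above.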
> Second, let's consider an XSLT example. The XSLT document() function is a mechanism for expressing cross-document relationships. Here's an example:
>
> <xsl:for-each select="document('bookstore2.xml')//book">
> ...
> </xsl:for-each>
>
> How would this XSLT example be recast to use the concepts you describe?
This is just path navigation, which becomes possible through the standard navigation operations once you have a legend for representing XML documents as TMRM subject maps (see below for one approach). I'm using the notation at the end of the TMRM v7, but as I don't use it regularly I'm explaining it in text too--just in case.
Using the proxies navigation operation (v <-- k) which returns all proxies which have label k equal to value v, you would simply use:
M1 |= ( :book <- :isa )
M1 is the resulting proxy map of all the :book elements in the subject map representing the bookstore2.xml document. The select scopes the operation to the specific subject map, so no other constraints in that regard would be necessary.
Next we need to ensure we maintain the node order, but that can be expressed in terms of the sort operator that I didn't mention (equation #30 in the TMRM v7 spec). "(x)" is the constraint operator, so below we're using the sort constraint.
M2 |= M1 (x) sort([ ORDERING_TUPLE_SEQUENCE])
The ordering tuple would determine how the resulting proxies were arranged, and could be defined in terms of how you maintained the order (see below). Map M2 would contain every proxy of type :book in the order you specified.
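As a loose analogue only, the two steps might look like this in Ruby. The assumptions here are the sample proxies, the :position key used to record document order, and the method names proxies_with and sort_by_tuple; none of them come from the TMRM spec, and this is not an implementation of its operators.

```ruby
# Hypothetical subject map for bookstore2.xml; :position is an assumed key
# that records where each element appeared in the document.
BOOKSTORE2 = [
  { :isa => :book,      :position => 2 },
  { :isa => :bookstore, :position => 0 },
  { :isa => :book,      :position => 1 },
]

# Rough analogue of the proxies navigation: every proxy whose key has the
# given value.
def proxies_with(subject_map, key, value)
  subject_map.select { |proxy| proxy[key] == value }
end

# Rough analogue of the sort constraint: order the proxies by the values
# of a sequence of keys, compared as a tuple.
def sort_by_tuple(proxies, ordering_keys)
  proxies.sort_by { |proxy| ordering_keys.map { |key| proxy[key] } }
end

m1 = proxies_with(BOOKSTORE2, :isa, :book)
m2 = sort_by_tuple(m1, [:position])
```

Here m2 holds only the :book proxies, arranged by :position.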
> Third, let's consider an ID/IDREF example. The below snippet shows a relationship between Picker John and Lot 1; namely, Picker John is located on Lot 1. This is employing the ID/IDREF relationship mechanism provided by the XML specification.
>
> <Lot id="1">
> ...
> </Lot>
> <Picker id="John" locatedOn="1">
> ...
> </Picker>
>
> How would this ID/IDREF example be recast to use the concepts you describe?
This one's actually the easiest, I think. However, you're not going to get utility without defining a few things as part of the legend (which, handily enough, you can also define in terms of TMRM proxies if you wanted, or RDF, or an XML vocabulary, or whatever you had to hand that the tools could understand).
At a very basic level using the "element as proxy" approach, you'd have two proxies representing the elements themselves. To keep things simple, I'm not going to change the semantics of what they are, but I'm going to base things off the DOM. You don't have to do this, but it might certainly be useful.
Step #1: Convert the nodes to proxies
:"1" = { :isa => :Lot, :text => "..." }
:John = { :isa => :Picker, :locatedOn => :"1", :text => "..." }
You can represent the child nodes in several different ways, depending on what makes the most sense. However, I would use some kind of ordered proxy reference between the node and its immediate children so that you would maintain the natural ordering of XML elements.
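One possible shape for such an ordered reference, sketched in Ruby. The :children key, the :child_a and :child_b labels, and the :Note and :Text types are all made up for illustration, since the children of Lot are elided in the original snippet.

```ruby
# Hypothetical: the parent proxy carries an Array of child labels, so the
# Array order preserves the XML document order.
ORDERED_PROXIES = {
  :"1"     => { :isa => :Lot,  :children => [:child_a, :child_b] },
  :child_a => { :isa => :Note, :text => "..." },
  :child_b => { :isa => :Text, :text => "..." },
}

# Resolve a proxy's immediate children, in their recorded order.
# A proxy with no :children key simply has no children.
def children(proxies, label)
  proxies.fetch(label).fetch(:children, []).map { |child| proxies.fetch(child) }
end

kids = children(ORDERED_PROXIES, :"1")
```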
Step #2: Define your Legend(s)
As part of your legend, you would have a constraint that defined the :locatedOn role for proxies of type :Picker to be a proxy reference. Once that was done, you'd be able to run your XML through some kind of tool to produce the graph relationship described above.
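A minimal sketch of what such a legend constraint could do, under the assumption that a legend is just a table of which keys hold proxy references for which proxy type. The LEGEND layout and the resolve method are hypothetical; only the two proxies come from Step #1.

```ruby
# The two proxies from Step #1, collected into one Hash keyed by label.
LOT_PROXIES = {
  :"1"  => { :isa => :Lot,    :text => "..." },
  :John => { :isa => :Picker, :locatedOn => :"1", :text => "..." },
}

# Hypothetical legend: per proxy type, the keys whose values are proxy
# references rather than plain values.
LEGEND = { :Picker => [:locatedOn] }

# Look up a key on a proxy. If the legend marks that key as a reference for
# the proxy's type, follow it to the target proxy; otherwise return the raw
# value unchanged.
def resolve(proxies, legend, label, key)
  proxy = proxies.fetch(label)
  value = proxy[key]
  is_reference = legend.fetch(proxy[:isa], []).include?(key)
  is_reference ? proxies.fetch(value) : value
end

lot = resolve(LOT_PROXIES, LEGEND, :John, :locatedOn)
```

With this legend applied, lot is the :Lot proxy itself; with an empty legend the same call would hand back the bare :"1" value.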
Step #3: Traverse your Subject Map using the path language.
The beauty with defining the interpretation of the value of the :locatedOn property role in terms of constraints is that if tomorrow you need to interpret it differently, you just modify your constraints and/or apply a different legend. Same set of proxies, potentially different interpretations--all nicely traceable because your system ought to be able to tell you what legends are currently being applied. You can even get really crazy and apply different legends to different contexts for the same underlying subject map (think security profiles) if your implementation supports it. Very cool stuff indeed, I think! :)
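Purely as an illustration of the "same proxies, different legends" point, here is a small Ruby sketch. The legend names, the :staff and :public contexts (standing in for security profiles), and the interpret method are invented for the example; a real implementation would decide for itself how legends are expressed and selected.

```ruby
# One underlying proxy, never modified.
PICKER_PROXY = { :isa => :Picker, :locatedOn => :"1" }

# Two hypothetical legends that interpret the same value differently.
LEGENDS = {
  :as_reference => lambda { |value| [:proxy_ref, value] },
  :as_text      => lambda { |value| value.to_s },
}

# Hypothetical mapping from context to the legend applied in it.
CONTEXTS = { :staff => :as_reference, :public => :as_text }

# Interpret a key under the legend for the given context, and report which
# legend was used so that the interpretation stays traceable.
def interpret(proxy, key, context)
  legend_name = CONTEXTS.fetch(context)
  { :legend => legend_name, :value => LEGENDS.fetch(legend_name).call(proxy[key]) }
end

staff_view  = interpret(PICKER_PROXY, :locatedOn, :staff)
public_view = interpret(PICKER_PROXY, :locatedOn, :public)
```

Changing the interpretation tomorrow would mean editing LEGENDS or CONTEXTS, not the proxy.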
Hopefully this makes more sense to you now and will help other people see why I think this is a pretty important evolutionary idea.
Cheers,
ast
--
Andrew S. Townley <ast@atownley.org>
http://atownley.org