   RE: [xml-dev] Cross document validation with Schematron - XML syntax for

  • To: <xml-dev@lists.xml.org>
  • Subject: RE: [xml-dev] Cross document validation with Schematron - XML syntax for Xpath?
  • From: "Hunsberger, Peter" <Peter.Hunsberger@stjude.org>
  • Date: Wed, 17 Mar 2004 15:05:41 -0600
  • Thread-topic: [xml-dev] Cross document validation with Schematron - XML syntax for Xpath?

Given the verbosity of this post I'm not surprised that it didn't garner
much response, but I am a tad surprised that no one had any comments.

Is there somewhere else to go for Schematron advice?

No comments about the heresy of questioning the need for an XML syntax
for Xpath?

> First some background: we have a large complex Web app that 
> builds 1000's of different input forms from metadata 
> descriptions in the form of XML.  This XML comes from many 
> different spots and describes global metadata, user view 
> specific metadata, authorizations and the current data for a 
> given screen.  XSLT transforms take this input XML, smash it 
> together into an abstract object model, and this is in turn 
> forwarded on to another XSLT that does presentation specific 
> transformations (for the Web app that means turning it into XHTML).  
> The user does a standard HTTP POST back to us, and we get the 
> request parameters back as XML and run another XSLT transform 
> and then a Schematron transform to validate the input from 
> the user.  If Schematron throws an assert we detect that and 
> recycle the input back through the original loop with the 
> appropriate error message; otherwise we continue on to the next screen.
> The original metadata and the instance specific screens that 
> are built around them are built by business analysts using 
> other screens (that are in turn built by the same system).  
> In particular, we have a validation editor where they 
> describe the validation rules for the input to any given 
> screen.  These rules are one step removed from Schematron 
> statements; a simple transform turns them into Schematron. 
> The main reason for not specifying Schematron directly is so 
> that the "validation editor" can pick the rules apart into 
> component pieces when a business analyst wants to go back 
> and edit an existing validation rule; we use XML elements and 
> attributes to build the Xpath, that way we don't have to 
> parse the Xpath (though we're probably going to go to XSLT 
> for regex support so I suppose parsing the Xpath with regex 
> would be just about the same work in the long term).
> All this works pretty well, but for one issue which I will 
> describe shortly.  However, we now have a new requirement 
> which is to be able to validate across multiple documents.  
> We manage clinical research data, so an example would be for 
> someone to be able to specify that a surgery date was after 
> any protocol on study date, or that a surgery date is after a 
> particular instance of a protocol on study date.  In this 
> case, the data being validated is in the surgery document and 
> the data it is being validated against is in the protocol 
> document.  (In reality all this is pulled out of a database 
> on the fly, but the mechanics of how these documents are 
> actually created should be more or less irrelevant to the 
> problem at hand?)
> First issue:
> Writing Schematron asserts can be non-intuitive for a 
> business analyst. Consider, for example, a document that 
> reports many lab results.  We may want to say that the ANC 
> value is between 1000 and 10000. As a Schematron assert it is 
> essentially:
> 	not(*[local-name() = 'ANC']) or (result_val > 1000 and result_val < 10000)
> I.e., for things that aren't ANCs we are OK; otherwise check 
> the result value.  The problem is that a business analyst 
> just doesn't get the "not(x) or" pattern.  It might make sense 
> to someone well versed in Boolean logic and xpath, but even some 
> of our more 
> experienced developers get confused on these rules.
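To make the "not(x) or" pattern concrete: it is just the Boolean encoding of "if x applies, then the checks must hold". A minimal sketch in Python, assuming the analyst-facing rule is stored as an "applies when" condition plus a list of checks (the dictionary layout and the function name `implication_test` are hypothetical, not part of Schematron or of the system described above):

```python
def implication_test(applies_when, must_hold):
    """Build an XPath test meaning 'if applies_when, then every check in must_hold'.

    'if A then B' is logically the same as 'not(A) or B', which is the
    form a Schematron assert test needs.
    """
    return "not(%s) or (%s)" % (applies_when, " and ".join(must_hold))


# Hypothetical analyst-facing form of the ANC rule quoted above.
analyst_rule = {
    "applies_when": "*[local-name() = 'ANC']",
    "must_hold": ["result_val > 1000", "result_val < 10000"],
}

assert_test = implication_test(**analyst_rule)
print(assert_test)
```

The analyst only ever sees the two parts; the "not(x) or" wrapping happens in the generation step.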
> Given this, and the requirement for cross document validation 
> we'd like to move the input to our validation process one 
> more step away from Schematron and find or create a language 
> that can be used by the business analysts to specify the 
> validation rules in a manner that is a little more natural to 
> them.  For example:
> 	element = 'ANC' and result_val > 1000 and result_val < 10000
> For Schematron generation that's pretty straightforward. 
> More importantly, however, we also need to be able to use 
> this rule specification to tell us how to generate the other 
> document.  Considering my other example, we want something like:
> 	*[local-name() = 'surgery.date'] > *[local-name() = 'protocol.on_study_date']
> Or 
> 	*[local-name() = 'surgery.date'] > *[local-name() = 'protocol.on_study_date' and protocol.mnemonic = 'TOTXV']
> We want to be able to parse this rule specification to find 
> the fact that we have to do a retrieval of all the protocol 
> data that is in context for this particular patient (or the 
> protocol data in context that has a mnemonic='TOTXV').  
> Essentially, I think what we need is an XML syntax for xpath 
> that we can turn back into real xpath, or that can be easily parsed, so 
> things other than xpath-savvy processors can generate data 
> sets that match the xpath.  
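As a rough illustration of what such an XML syntax might look like, here is a sketch in Python using only the standard library. The element and attribute names (`compare`, `field`, `where`, `op`, `equals`) are invented for this example and do not come from any existing spec. It also assumes that the part of a field name before the first dot identifies the document to retrieve. The same tree is walked twice: once to regenerate the xpath, once to list the documents the rule touches.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule vocabulary (not from any spec): <compare>, <field>, <where>.
RULE_XML = """
<compare op="gt">
  <field name="surgery.date"/>
  <field name="protocol.on_study_date">
    <where field="protocol.mnemonic" equals="TOTXV"/>
  </field>
</compare>
"""

OPERATORS = {"gt": ">", "lt": "<", "eq": "="}


def field_to_xpath(field):
    """Render one <field> as a local-name() test plus any <where> filters."""
    tests = ["local-name() = '%s'" % field.get("name")]
    for where in field.findall("where"):
        tests.append("%s = '%s'" % (where.get("field"), where.get("equals")))
    return "*[%s]" % " and ".join(tests)


def rule_to_xpath(rule):
    """Turn a <compare> with exactly two <field> children back into xpath."""
    left, right = rule.findall("field")
    return "%s %s %s" % (
        field_to_xpath(left), OPERATORS[rule.get("op")], field_to_xpath(right))


def documents_needed(rule):
    """List the documents a rule refers to, assuming 'document.field' names."""
    names = [f.get("name") for f in rule.iter("field")]
    names += [w.get("field") for w in rule.iter("where")]
    return sorted({name.split(".")[0] for name in names})


rule = ET.fromstring(RULE_XML)
xpath = rule_to_xpath(rule)
docs = documents_needed(rule)
print(xpath)
print(docs)
```

The point of the second function is that the retrieval side never has to understand xpath at all; it only reads the element structure.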
> We are running this all on top of Apache Cocoon with Saxon so 
> we more or less have any piece of XML or XSLT handling 
> machinery we might need available to us: protocol resolvers, 
> any and all manner of schema, XSLT in any version, Java 
> classes, and even Java extensions for XSLT if needed, though 
> I'd rather stay away from those.
> Sorry for the windy post, but finally the real questions: 
> anyone know of any "obvious" way to do this?  By obvious I 
> mean some existing spec, or best practice?  If not, any 
> thoughts on what would be a good structure for our artificial language 
> that is going to be fed into Schematron and our
> document retrieval process?   Am I missing something with respect to
> Schematron?  Could we hook into some underlying part of an 
> xpath parser and gain our understanding of the xpath there 
> instead of at the higher level (and thus not need the XML 
> syntax for xpath)? Other thoughts or comments?
> Peter Hunsberger

