   RE: [xml-dev] Cross document validation with Schematron - XML syntax for

  • To: <xml-dev@lists.xml.org>
  • Subject: RE: [xml-dev] Cross document validation with Schematron - XML syntax for Xpath?
  • From: "Hunsberger, Peter" <Peter.Hunsberger@stjude.org>
  • Date: Wed, 17 Mar 2004 15:05:41 -0600
  • Thread-index: AcQIVl/Vt1yM3YONTlGOjvwGFie1LwEDNarg
  • Thread-topic: [xml-dev] Cross document validation with Schematron - XML syntax for Xpath?

Given the verbosity of this post I'm not surprised that it didn't garner
much response, but I am a tad surprised that no one had any comments
whatsoever.

Is there somewhere else to go for Schematron advice?

No comments about the heresy of questioning the need for an XML syntax
for Xpath?

> 
> First some background: we have a large complex Web app that 
> builds 1000's of different input forms from metadata 
> descriptions in the form of XML.  This XML comes from many 
> different spots and describes global metadata, user view 
> specific metadata, authorizations and the current data for a 
> given screen.  XSLT transforms take this input XML, smash it 
> together into an abstract object model, and this is in turn 
> forwarded on to another XSLT that does presentation specific 
> transformations (for the Web app that means turning it into XHTML).  
> 
> The user does a standard HTTP POST back to us, and we get the 
> request parameters back as XML and run another XSLT transform 
> and then a Schematron transform to validate the input from 
> the user.  If Schematron throws an assert we detect that and 
> recycle the input back through the original loop with the 
> appropriate error message; otherwise we continue on to the next screen.
> 
> The original metadata and the instance specific screens that 
> are built around them are built by business analysts using 
> other screens (that are in turn built by the same system).  
> In particular, we have a validation editor where they 
> describe the validation rules for the input to any given 
> screen.  These rules are one step removed from Schematron 
> statements; a simple transform turns them into Schematron. 
> The main reason for not specifying Schematron directly is so 
> that the "validation editor" can pick the rules apart into 
> component pieces when a business analyst wants to go back 
> and edit an existing validation rule; we use XML elements and 
> attributes to build the Xpath, that way we don't have to 
> parse the Xpath (though we're probably going to go to XSLT 
> for regex support so I suppose parsing the Xpath with regex 
> would be just about the same work in the long term).
> 
> All this works pretty well, but for one issue which I will 
> describe shortly.  However, we now have a new requirement 
> which is to be able to validate across multiple documents.  
> We manage clinical research data, so an example would be for 
> someone to be able to specify that a surgery date was after 
> any protocol on study date, or that a surgery date is after a 
> particular instance of a protocol on study date.  In this 
> case, the data being validated is in the surgery document and 
> the data it is being validated against is in the protocol 
> document.  (In reality all this is pulled out of a database 
> on the fly, but the mechanics of how these documents are 
> actually created should be more or less irrelevant to the 
> problem at hand?)
> 
> First issue:
> 
> Writing Schematron asserts can be non-intuitive for a 
> business analyst. Consider, for example, a document that 
> reports many lab results.  We may want to say that the ANC 
> value is between 1000 and 10000. As a Schematron assert it is 
> essentially:
> 
> 	not(*[local-name() ='ANC']) or ( result_val > 1000 and result_val < 10000)
> 
> I.e., for things that aren't ANCs we are OK; otherwise check 
> the result value.  The problem is that a business analyst 
> just doesn't get the "not(x) or" pattern; it might make sense 
> to someone well versed in Boolean logic and xpath, but even 
> some of our more experienced developers get confused by these 
> rules.
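
The "not(x) or y" shape is just logical implication ("if x then y") spelled with the operators xpath provides. A minimal sketch in Python of that equivalence follows; the helper names implies and build_assert are made up for illustration and are not part of Schematron or of the system described in this post.

```python
def implies(guard, check):
    """Material implication: 'if guard then check'."""
    return (not guard) or check


def build_assert(guard_xpath, check_xpath):
    """Wrap a guard and a check into the 'not(x) or (y)' xpath shape.

    Hypothetical helper: it only concatenates strings and does no
    validation or escaping of the xpath fragments it is given.
    """
    return "not(%s) or (%s)" % (guard_xpath, check_xpath)


if __name__ == "__main__":
    # Truth table: the rule only fails when the guard holds and the check does not.
    for guard in (False, True):
        for check in (False, True):
            print(guard, check, implies(guard, check))

    print(build_assert("*[local-name() = 'ANC']",
                       "result_val > 1000 and result_val < 10000"))
```

Read this way, an analyst can supply the guard and the check as two separate pieces and never see the "not(x) or" wrapper at all.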
> 
> Given this, and the requirement for cross document validation 
> we'd like to move the input to our validation process one 
> more step away from Schematron and find or create a language 
> that can be used by the business analysts to specify the 
> validation rules in a manner that is a little more natural to 
> them.  For example:
> 
> 	element = 'ANC' and result_val > 1000 and result_val < 10000
> 
> For Schematron generation that's pretty straightforward; 
> however, more importantly, we also need to be able to use 
> this rule specification to tell us how to generate the other 
> document.  Considering my other example, we want something like:
> 
> 	*[local-name() = 'surgery.date'] > *[local-name() = 'protocol.on_study_date']
> 
> Or 
> 
> 	*[local-name() = 'surgery.date']  > *[local-name() = 'protocol.on_study_date' and protocol.mnemonic = 'TOTXV']
> 
> We want to be able to parse this rule specification to find 
> the fact that we have to do a retrieval of all the protocol 
> data that is in context for this particular patient (or the 
> protocol data in context that has a mnemonic='TOTXV').  
> Essentially, I think what we need is an XML syntax for xpath 
> that we can turn back into real xpath, or that can be easily 
> parsed so that things other than xpath savvy processors can 
> generate data sets that match the xpath.  
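
To make the idea concrete, here is one possible shape for such an XML syntax, sketched in Python with the standard xml.etree.ElementTree module. Everything about the vocabulary is an assumption made up for illustration (the compare, field and where elements, the op codes, and the convention that the text before the first dot in a field name identifies the document to retrieve); none of it comes from an existing spec. The same small tree is walked twice: once to emit real xpath, and once to list what data has to be fetched.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule vocabulary, invented for this sketch.
RULE_XML = """
<compare op="gt">
  <field name="surgery.date"/>
  <field name="protocol.on_study_date">
    <where name="protocol.mnemonic" value="TOTXV"/>
  </field>
</compare>
"""

OPS = {"gt": ">", "lt": "<", "eq": "="}


def field_to_xpath(field):
    """Turn one <field> (plus any <where> children) into an xpath step."""
    tests = ["local-name() = '%s'" % field.get("name")]
    for where in field.findall("where"):
        # No quote escaping here; values containing ' would need handling.
        tests.append("%s = '%s'" % (where.get("name"), where.get("value")))
    return "*[%s]" % " and ".join(tests)


def rule_to_xpath(rule):
    """Serialise a <compare> with exactly two <field> children to xpath."""
    left, right = rule.findall("field")
    return "%s %s %s" % (field_to_xpath(left), OPS[rule.get("op")],
                         field_to_xpath(right))


def retrieval_plan(rule):
    """List (document, filters) pairs without parsing any xpath.

    Assumes the part of the field name before the first dot names the
    document; that convention is a guess, not something the post states.
    """
    plan = []
    for field in rule.iter("field"):
        document = field.get("name").split(".")[0]
        filters = {w.get("name"): w.get("value") for w in field.findall("where")}
        plan.append((document, filters))
    return plan


if __name__ == "__main__":
    rule = ET.fromstring(RULE_XML)
    print(rule_to_xpath(rule))
    print(retrieval_plan(rule))
```

The point of the sketch is only that the retrieval side reads element and attribute names directly, so nothing other than the Schematron generator ever needs to understand xpath.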
> 
> We are running this all on top of Apache Cocoon with Saxon so 
> we more or less have any piece of XML or XSLT handling 
> machinery we might need available to us: protocol resolvers, 
> any and all manner of schema, XSLT in any version, Java 
> classes, and even Java extensions for XSLT if needed, though 
> I'd rather stay away from those.
> 
> Sorry for the windy post, but finally the real questions: 
> anyone know of any "obvious" way to do this?  By obvious I 
> mean some existing spec, or best practice?  If not, any 
> thoughts on a good structure for the artificial language 
> that is going to be fed into Schematron and our
> document retrieval process?   Am I missing something with respect to
> Schematron?  Could we hook into some underlying part of an 
> xpath parser and gain our understanding of the xpath there 
> instead of at the higher level (and thus not need the XML 
> syntax for xpath)? Other thoughts or comments?
> 
> Peter Hunsberger
> 





 
