
   RE: [xml-dev] Cross document validation with Schematron - XML syntax for


There is a proposed fine-grained XML syntax for XQuery called XQueryX at

http://www.w3.org/TR/2003/WD-xqueryx-20031219/

Since XPath is a subset of XQuery, this would appear to meet the need.

Michael Kay 

# -----Original Message-----
# From: Hunsberger, Peter [mailto:Peter.Hunsberger@stjude.org] 
# Sent: 17 March 2004 21:06
# To: xml-dev@lists.xml.org
# Subject: RE: [xml-dev] Cross document validation with 
# Schematron - XML syntax for Xpath?
# 
# Given the verbosity of this post I'm not surprised that it 
# didn't garner much response, but I am a tad surprised that no 
# one had any comments whatsoever.
# 
# Is there somewhere else to go for Schematron advice?
# 
# No comments about the heresy of questioning the need for an 
# XML syntax for Xpath?
# 
# > 
# > First some background: we have a large, complex Web app that builds
# > thousands of different input forms from metadata descriptions in the
# > form of XML.  This XML comes from many different places and describes
# > global metadata, user-view-specific metadata, authorizations, and the
# > current data for a given screen.  XSLT transforms take this input XML
# > and smash it together into an abstract object model, and this is in
# > turn forwarded on to another XSLT that does presentation-specific
# > transformations (for the Web app that means turning it into XHTML).
# > 
# > The user does a standard HTTP POST back to us; we get the request
# > parameters back as XML and run another XSLT transform and then a
# > Schematron transform to validate the input from the user.  If
# > Schematron throws an assert, we detect that and recycle the input
# > back through the original loop with the appropriate error message;
# > otherwise we continue on to the next screen.
# > 
# > The original metadata and the instance-specific screens that are
# > built around them are built by business analysts using other screens
# > (that are in turn built by the same system).
# > In particular, we have a validation editor where they describe the
# > validation rules for the input to any given screen.  These rules are
# > one step removed from Schematron statements; a simple transform turns
# > them into Schematron.
# > The main reason for not specifying Schematron directly is so that the
# > "validation editor" can pick the rules apart into component pieces
# > when a business analyst wants to go back and edit an existing
# > validation rule; we use XML elements and attributes to build the
# > XPath, so that we don't have to parse the XPath (though we're
# > probably going to go to XSLT for regex support, so I suppose parsing
# > the XPath with regex would be just about the same work in the long
# > term).
# > 
# > All this works pretty well, but for one issue which I will describe
# > shortly.  However, we now have a new requirement, which is to be able
# > to validate across multiple documents.
# > We manage clinical research data, so an example would be for someone
# > to be able to specify that a surgery date was after any protocol
# > on-study date, or that a surgery date is after a particular instance
# > of a protocol on-study date.  In this case, the data being validated
# > is in the surgery document and the data it is being validated against
# > is in the protocol document.  (In reality all this is pulled out of a
# > database on the fly, but the mechanics of how these documents are
# > actually created should be more or less irrelevant to the problem at
# > hand?)
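
As an illustration of the cross-document check described above, here is a minimal Python sketch. The element names come from the post; the document structure, the sample dates, the ISO date format, and the function names are assumptions made for the example and are not the poster's actual data or code. The check is existential, matching the "after any on-study date" reading: it passes if the surgery date is later than at least one on-study date.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Hypothetical sample documents; only the element names are taken from the post.
SURGERY_XML = "<surgery><surgery.date>2004-03-10</surgery.date></surgery>"
PROTOCOL_XML = """
<protocols>
  <protocol>
    <protocol.mnemonic>TOTXV</protocol.mnemonic>
    <protocol.on_study_date>2004-01-15</protocol.on_study_date>
  </protocol>
  <protocol>
    <protocol.mnemonic>OTHER</protocol.mnemonic>
    <protocol.on_study_date>2004-06-01</protocol.on_study_date>
  </protocol>
</protocols>
"""


def text_of(element, tag):
    """Return the text of the first descendant with the given tag name."""
    return next(element.iter(tag)).text


def surgery_after_on_study(surgery_xml, protocol_xml, mnemonic=None):
    """True if the surgery date is later than at least one on-study date.

    If mnemonic is given, only protocols with that mnemonic are considered.
    Dates are assumed to be ISO formatted (YYYY-MM-DD).
    """
    surgery = date.fromisoformat(text_of(ET.fromstring(surgery_xml), "surgery.date"))
    for protocol in ET.fromstring(protocol_xml).iter("protocol"):
        if mnemonic is not None and text_of(protocol, "protocol.mnemonic") != mnemonic:
            continue
        on_study = date.fromisoformat(text_of(protocol, "protocol.on_study_date"))
        if surgery > on_study:
            return True
    return False


if __name__ == "__main__":
    print(surgery_after_on_study(SURGERY_XML, PROTOCOL_XML))
    print(surgery_after_on_study(SURGERY_XML, PROTOCOL_XML, mnemonic="OTHER"))
```

The point of the sketch is only that the rule needs two inputs: the document being validated and a second document that has to be retrieved before the rule can run.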
# > 
# > First issue:
# > 
# > Writing Schematron asserts can be non-intuitive for a business
# > analyst. Consider, for example, a document that reports many lab
# > results.  We may want to say that the ANC value is between 1000 and
# > 10000. As a Schematron assert it is essentially:
# > 
# > 	not(*[local-name() ='ANC']) or ( result_val > 1000 and result_val < 10000)
# > 
# > I.e., for things that aren't ANCs we are OK; otherwise check the
# > result value.  The problem is that a business analyst just doesn't
# > get the "not(x) or" pattern. It might make sense to someone well
# > versed in Boolean logic and XPath, but even some of our more
# > experienced developers get confused by these rules.
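
The "not(x) or" pattern is the usual encoding of "if x then y", so one way to hide it from rule authors is to let them supply the guard and the check separately and generate the assert. The Python sketch below shows that idea under stated assumptions: the function names and the message text are hypothetical, and the output is a bare assert element rather than a complete Schematron schema.

```python
from xml.sax.saxutils import escape, quoteattr


def guarded_test(guard, check):
    """Encode 'if guard then check' as the XPath test 'not(guard) or (check)'."""
    return f"not({guard}) or ({check})"


def schematron_assert(guard, check, message):
    """Build an <assert> element string; quoteattr escapes '<' and '>' in the test."""
    return f"<assert test={quoteattr(guarded_test(guard, check))}>{escape(message)}</assert>"


if __name__ == "__main__":
    print(
        schematron_assert(
            guard="*[local-name() = 'ANC']",
            check="result_val > 1000 and result_val < 10000",
            message="ANC must be between 1000 and 10000",  # hypothetical message text
        )
    )
```

The analyst only ever sees the guard ("this is an ANC result") and the check (the range); the implication is introduced mechanically.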
# > 
# > Given this, and the requirement for cross-document validation, we'd
# > like to move the input to our validation process one more step away
# > from Schematron and find or create a language that can be used by the
# > business analysts to specify the validation rules in a manner that is
# > a little more natural to them.  For example:
# > 
# > 	element = 'ANC' and result_val > 1000 and result_val < 10000
# > 
# > For Schematron generation that's pretty straightforward; however,
# > more importantly, we also need to be able to use this rule
# > specification to tell us how to generate the other document.
# > Considering my other example, we want something like:
# > 
# > 	*[local-name() = 'surgery.date'] > *[local-name() = 'protocol.on_study_date']
# > 
# > Or
# > 
# > 	*[local-name() = 'surgery.date']  > *[local-name() = 'protocol.on_study_date' and protocol.mnemonic = 'TOTXV']
# > 
# > We want to be able to parse this rule specification to discover that
# > we have to do a retrieval of all the protocol data that is in context
# > for this particular patient (or the protocol data in context that has
# > a mnemonic='TOTXV').
# > Essentially, I think what we need is an XML syntax for XPath that we
# > can turn back into real XPath, or that can be easily parsed, so that
# > things other than XPath-savvy processors can generate data sets that
# > match the XPath.
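
To make the "XML syntax for XPath" idea concrete, the Python sketch below uses a small invented vocabulary (the compare, field, and where elements and their attributes are hypothetical, not XQueryX and not any existing spec). It shows the two uses described above: serializing the rule back to an XPath string, and reading off which fields and filters a retrieval step would need, without parsing any XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule vocabulary, invented for this sketch.
RULE_XML = """
<compare op="gt">
  <field name="surgery.date"/>
  <field name="protocol.on_study_date">
    <where name="protocol.mnemonic" equals="TOTXV"/>
  </field>
</compare>
"""

OPERATORS = {"gt": ">", "lt": "<", "eq": "="}


def field_to_xpath(field):
    """Serialize one <field> as *[local-name() = '...' and ...]."""
    conditions = [f"local-name() = '{field.get('name')}'"]
    for where in field.findall("where"):
        conditions.append(f"{where.get('name')} = '{where.get('equals')}'")
    return "*[" + " and ".join(conditions) + "]"


def rule_to_xpath(rule):
    """Serialize a two-operand <compare> rule as an XPath comparison."""
    left, right = rule.findall("field")
    return f"{field_to_xpath(left)} {OPERATORS[rule.get('op')]} {field_to_xpath(right)}"


def fields_to_retrieve(rule):
    """List (field name, filter dict) pairs for a non-XPath retrieval layer."""
    return [
        (f.get("name"), {w.get("name"): w.get("equals") for w in f.findall("where")})
        for f in rule.findall("field")
    ]


if __name__ == "__main__":
    rule = ET.fromstring(RULE_XML)
    print(rule_to_xpath(rule))
    print(fields_to_retrieve(rule))
```

Because the rule is stored as elements and attributes, the retrieval code and the Schematron generator can both walk the same tree, and an editor can reopen the rule piece by piece.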
# > 
# > We are running this all on top of Apache Cocoon with Saxon, so we
# > more or less have any piece of XML or XSLT handling machinery we
# > might need available to us: protocol resolvers, any and all manner of
# > schema, XSLT in any version, Java classes, and even Java extensions
# > for XSLT if needed, though I'd rather stay away from those.
# > 
# > Sorry for the windy post, but finally the real questions: does
# > anyone know of any "obvious" way to do this?  By obvious I mean some
# > existing spec or best practice.  If not, any thoughts on what a good
# > structure would be for our artificial language that is going to be
# > fed into Schematron and our document retrieval process?  Am I missing
# > something with respect to Schematron?  Could we hook into some
# > underlying part of an XPath parser and gain our understanding of the
# > XPath there instead of at the higher level (and thus not need the XML
# > syntax for XPath)?  Other thoughts or comments?
# > 
# > Peter Hunsberger
# > 
# 
# 
# -----------------------------------------------------------------
# The xml-dev list is sponsored by XML.org 
# <http://www.xml.org>, an initiative of OASIS 
# <http://www.oasis-open.org>
# 
# The list archives are at http://lists.xml.org/archives/xml-dev/
# 
# To subscribe or unsubscribe from this list use the subscription
# manager: <http://www.oasis-open.org/mlmanage/index.php>
# 





 
