- To: <email@example.com>
- Subject: RE: [xml-dev] Cross document validation with Schematron - XML syntax for Xpath?
- From: "Hunsberger, Peter" <Peter.Hunsberger@stjude.org>
- Date: Wed, 17 Mar 2004 15:05:41 -0600
- Thread-index: AcQIVl/Vt1yM3YONTlGOjvwGFie1LwEDNarg
- Thread-topic: [xml-dev] Cross document validation with Schematron - XML syntax for Xpath?
Given the verbosity of this post I'm not surprised that it didn't garner
much response, but I am a tad surprised that no one had any comments.
Is there somewhere else to go for Schematron advice?
No comments about the heresy of questioning the need for an XML syntax
> First some background: we have a large complex Web app that
> builds thousands of different input forms from metadata
> descriptions in the form of XML. This XML comes from many
> different spots and describes global metadata, user view
> specific metadata, authorizations and the current data for a
> given screen. XSLT transforms take this input XML, smash it
> together into an abstract object model, and this is in turn
> forwarded on to another XSLT that does presentation specific
> transformations (for the Web app that means turning it into XHTML).
> The user does a standard HTTP POST back to us, and we get the
> request parameters back as XML and run another XSLT transform
> and then a Schematron transform to validate the input from
> the user. If Schematron throws an assert we detect that and
> recycle the input back through the original loop with the
> appropriate error message; otherwise we continue on to the next screen.
> The original metadata and the instance specific screens that
> are built around them are built by business analysts using
> other screens (that are in turn built by the same system).
> In particular, we have a validation editor where they
> describe the validation rules for the input to any given
> screen. These rules are one step removed from Schematron
> statements; a simple transform turns them into Schematron.
> The main reason for not specifying Schematron directly is so
> that the "validation editor" can pick the rules apart into
> component pieces when a business analyst wants to go back
> and edit an existing validation rule; we use XML elements and
> attributes to build the Xpath, that way we don't have to
> parse the Xpath (though we're probably going to go to XSLT
> for regex support so I suppose parsing the Xpath with regex
> would be just about the same work in the long term).
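To make the "XML elements and attributes to build the Xpath" idea concrete, here is a minimal sketch in Python. The rule vocabulary (and, or, compare, field, value) and the function name are invented for illustration; they are not the vocabulary our editor actually uses.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule vocabulary (an assumption, not from the original post):
# <and>/<or> group conditions, <compare op="..."> holds exactly two operands,
# <field> names a child element and <value> carries a numeric literal.
RULE_XML = """
<and>
  <compare op="gt"><field name="result_val"/><value>1000</value></compare>
  <compare op="lt"><field name="result_val"/><value>10000</value></compare>
</and>
"""

OPERATORS = {"gt": ">", "lt": "<", "eq": "=", "ne": "!="}

def to_xpath(node):
    """Recursively serialise one rule element as an XPath 1.0 expression."""
    if node.tag in ("and", "or"):
        parts = [to_xpath(child) for child in node]
        return "(" + f" {node.tag} ".join(parts) + ")"
    if node.tag == "compare":
        left, right = list(node)
        return f"{to_xpath(left)} {OPERATORS[node.get('op')]} {to_xpath(right)}"
    if node.tag == "field":
        return node.get("name")
    if node.tag == "value":
        return node.text.strip()
    raise ValueError(f"unknown rule element: {node.tag}")

xpath = to_xpath(ET.fromstring(RULE_XML))
print(xpath)
```

Because the rule stays a tree, an editor can reopen it piece by piece without ever parsing the generated expression.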
> All this works pretty well, but for one issue which I will
> describe shortly. However, we now have a new requirement
> which is to be able to validate across multiple documents.
> We manage clinical research data, so an example would be for
> someone to be able to specify that a surgery date was after
> any protocol on study date, or that a surgery date is after a
> particular instance of a protocol on study date. In this
> case, the data being validated is in the surgery document and
> the data it is being validated against is in the protocol
> document. (In reality all this is pulled out of a database
> on the fly, but the mechanics of how these documents are
> actually created should be more or less irrelevant to the
> problem at hand?)
> First issue:
> Writing Schematron asserts can be non-intuitive for a
> business analyst. Consider, for example, a document that
> reports many lab results. We may want to say that the ANC
> value is between 1000 and 10000. As a Schematron assert it is
> not(*[local-name() ='ANC']) or ( result_val > 1000 and
> result_val < 10000)
> I.e., for things that aren't ANCs we are ok, otherwise check
> the result value. The problem is that a business analyst
> just doesn't get the "not(x) or" pattern. It might make sense to someone well
> versed in Boolean logic and xpath, but even some of our more
> experienced developers get confused on these rules.
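For anyone puzzled by the pattern: "not(x) or y" is the usual encoding of "if x then y". A small Python truth table, with hypothetical function names, shows that the two readings agree in every case.

```python
from itertools import product

def analyst_reading(is_anc, in_range):
    """'If this result is an ANC, its value must be in range.'"""
    if is_anc:
        return in_range
    return True  # the rule does not apply to other results

def schematron_reading(is_anc, in_range):
    """Shape of the assert in the post: not(x) or y."""
    return (not is_anc) or in_range

# One row per combination: (is_anc, in_range, analyst result, assert result)
rows = [(a, b, analyst_reading(a, b), schematron_reading(a, b))
        for a, b in product([False, True], repeat=2)]
for row in rows:
    print(row)
```

The only failing row is the one where the result is an ANC and the value is out of range, which is the case the assert is meant to catch.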
> Given this, and the requirement for cross document validation
> we'd like to move the input to our validation process one
> more step away from Schematron and find or create a language
> that can be used by the business analysts to specify the
> validation rules in a manner that is a little more natural to
> them. For example:
> element = 'ANC' and result_val > 1000 and result_val < 10000
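A rough sketch of how such a rule could be rewritten into the assert form, assuming the friendly rule always begins with a single element = '...' guard joined by "and". The regex, the function name and the assert message are hypothetical, and a real implementation would want a proper tokenizer rather than a regex.

```python
import re
from xml.sax.saxutils import quoteattr

# Assumption: the rule starts with exactly one "element = 'NAME' and" guard.
GUARD = re.compile(r"^\s*element\s*=\s*'([^']+)'\s+and\s+(.+)$", re.DOTALL)

def to_assert_test(rule):
    """Rewrite "element = 'X' and COND" as "not(*[local-name() = 'X']) or (COND)"."""
    match = GUARD.match(rule)
    if match is None:
        return rule.strip()  # no guard: use the rule as written
    name, condition = match.group(1), match.group(2).strip()
    return f"not(*[local-name() = '{name}']) or ({condition})"

friendly = "element = 'ANC' and result_val > 1000 and result_val < 10000"
test_expr = to_assert_test(friendly)

# quoteattr escapes < and > so the expression is safe inside an attribute.
assert_xml = f"<assert test={quoteattr(test_expr)}>ANC result out of range</assert>"
print(test_expr)
print(assert_xml)
```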
> For Schematron generation that's pretty straightforward.
> More importantly, however, we also need to be able to use
> this rule specification to tell us how to generate the other
> document. Considering my other example, we want something like:
> *[local-name() = 'surgery.date'] > *[local-name() =
> 'protocol.on_study_date' and protocol.mnemonic = 'TOTXV']
> We want to be able to parse this rule specification to find
> the fact that we have to do a retrieval of all the protocol
> data that is in context for this particular patient (or the
> protocol data in context that has a mnemonic='TOTXV').
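As a rough illustration of what parsing the rule text directly would involve, here is a regex-based Python sketch. It assumes that the part of each name before the first dot identifies the document to retrieve; that convention, the function name and the shape of the result are all assumptions, and the regex would not cope with nested predicates.

```python
import re

# Matches one *[local-name() = 'doc.field' ...] step and captures the
# qualified name plus whatever else sits inside the predicate.
STEP = re.compile(r"\*\[\s*local-name\(\)\s*=\s*'([^']+)'([^\]]*)\]")
# Matches an extra "and name = 'value'" condition inside a predicate.
FILTER = re.compile(r"and\s+([\w.]+)\s*=\s*'([^']*)'")

def retrieval_plan(rule):
    """Map each document named in the rule to the fields and filters it needs."""
    plan = {}
    for qualified_name, rest in STEP.findall(rule):
        document, _, field = qualified_name.partition(".")
        entry = plan.setdefault(document, {"fields": [], "filters": {}})
        entry["fields"].append(field)
        for filter_name, value in FILTER.findall(rest):
            key = filter_name.partition(".")[2] or filter_name
            entry["filters"][key] = value
    return plan

rule = ("*[local-name() = 'surgery.date'] > *[local-name() = "
        "'protocol.on_study_date' and protocol.mnemonic = 'TOTXV']")
plan = retrieval_plan(rule)
print(plan)
```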
> Essentially, I think what we need is an XML syntax for xpath
> that we can turn back into real xpath or be easily parsed so
> things other than xpath savvy processors can generate data
> sets that match the xpath.
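One possible shape for such an XML syntax, sketched in Python: a single rule tree is walked twice, once to emit the xpath and once to list the documents (and filters) a non-xpath processor would need to fetch. The element and attribute names (compare, field, where, doc, equals) are invented for this sketch and do not come from any existing spec.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML form of the surgery/protocol rule.
RULE = ET.fromstring("""
<compare op="gt">
  <field doc="surgery" name="date"/>
  <field doc="protocol" name="on_study_date">
    <where name="mnemonic" equals="TOTXV"/>
  </field>
</compare>
""")

OPERATORS = {"gt": ">", "lt": "<", "eq": "="}

def field_xpath(field):
    """Serialise one <field> as a *[local-name() = '...'] step."""
    doc, name = field.get("doc"), field.get("name")
    tests = [f"local-name() = '{doc}.{name}'"]
    tests += [f"{doc}.{w.get('name')} = '{w.get('equals')}'"
              for w in field.findall("where")]
    return "*[" + " and ".join(tests) + "]"

def rule_xpath(compare):
    """Turn the rule tree back into real xpath."""
    left, right = compare.findall("field")
    return f"{field_xpath(left)} {OPERATORS[compare.get('op')]} {field_xpath(right)}"

def documents_needed(compare):
    """List each document the rule touches, with any filters on it."""
    needed = {}
    for field in compare.iter("field"):
        filters = {w.get("name"): w.get("equals") for w in field.findall("where")}
        needed.setdefault(field.get("doc"), {}).update(filters)
    return needed

xpath = rule_xpath(RULE)
needed = documents_needed(RULE)
print(xpath)
print(needed)
```

The retrieval side never has to look at xpath at all; it only reads the doc attributes and where children.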
> We are running this all on top of Apache Cocoon with Saxon so
> we more or less have any piece of XML or XSLT handling
> machinery we might need available to us: protocol resolvers,
> any and all manner of schema, XSLT in any version, Java
> classes, and even Java extensions for XSLT if needed, though
> I'd rather stay away from those.
> Sorry for the windy post, but finally the real questions:
> anyone know of any "obvious" way to do this? By obvious I
> mean some existing spec, or best practice? If not, any
> thoughts on a good structure for the artificial language
> that is going to be fed into Schematron and our
> document retrieval process? Am I missing something with respect to
> Schematron? Could we hook into some underlying part of an
> xpath parser and gain our understanding of the xpath there
> instead of at the higher level (and thus not need the XML
> syntax for xpath)? Other thoughts or comments?
> Peter Hunsberger