Re: [xml-dev] Enhanced Control using Fine-Grain Validation
- From: Michael Kay <mike@saxonica.com>
- To: "Costello, Roger L." <costello@mitre.org>
- Date: Fri, 19 Jul 2013 11:03:58 +0100
In fairness, your first sentence needs qualifying:
>When an XML instance document is validated against an XML Schema, the validator returns valid or invalid.
That might be the way many schema validators behave, but it's not what the specification says. The specification says that the validator returns a PSVI, in which individual nodes are annotated with a number of properties, including the type against which validation was attempted, and whether or not validation of each node was successful. What you are doing is trying to get a little bit closer to that concept "by hand" in the absence of tools that do what the spec says they should do.
There are a couple of major limitations in your approach:
(a) it only works for validation against simple types (because of the reliance on "castable as")
(b) your stylesheet decides which types to use for validating each element, rather than getting this information from the schema.
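To make the two limitations concrete, here is a rough sketch of what a hand-written "castable as" check tends to look like. It is not taken from the paper; the element name Price and the choice of xs:decimal are assumptions made purely for illustration.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Identity rule: copy everything that has no specific check. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Hypothetical leaf element "Price". The stylesheet author, not the
       schema, chooses xs:decimal here: that is limitation (b). The test
       only works because xs:decimal is a simple type: limitation (a). -->
  <xsl:template match="Price">
    <xsl:copy>
      <xsl:if test="not(. castable as xs:decimal)">
        <xsl:attribute name="invalid" select="true()"/>
      </xsl:if>
      <xsl:copy-of select="@*"/>
      <xsl:value-of select="."/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Every element to be checked needs its own rule of this kind, and each rule has to be kept in step with the schema by hand.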
I think this can be improved in the presence of XSLT 3.0 try/catch. As a first cut, you can do something like:
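<xsl:template match="*">
  <xsl:try>
    <!-- validate the copy against the matching global element declaration, if any -->
    <xsl:copy validation="lax">
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsl:copy>
    <xsl:catch>
      <!-- validation failed: emit an unvalidated copy flagged as invalid -->
      <xsl:copy>
        <xsl:attribute name="invalid" select="true()"/>
        <xsl:copy-of select="@*"/>
        <xsl:apply-templates/>
      </xsl:copy>
    </xsl:catch>
  </xsl:try>
</xsl:template>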
This will produce a copy of your document in which valid elements have the appropriate type annotation, and invalid elements are (a) labelled as xs:untyped, and (b) have the attribute invalid="true".
You could extend this by replacing the xsl:catch block with say xsl:apply-templates mode="repair", and including repair rules for specific elements in this mode.
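A minimal sketch of that extension follows. It assumes a schema-aware XSLT 3.0 processor; the schema file books.xsd, the element name Price, and the default value 0.00 are hypothetical placeholders, not anything from the paper.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: books.xsd is a hypothetical no-namespace schema whose
       global element declarations drive the lax validation below. -->
  <xsl:import-schema schema-location="books.xsd"/>

  <xsl:template match="*">
    <xsl:try>
      <xsl:copy validation="lax">
        <xsl:copy-of select="@*"/>
        <xsl:apply-templates/>
      </xsl:copy>
      <xsl:catch>
        <!-- Validation failed: hand the same element to the repair rules. -->
        <xsl:apply-templates select="." mode="repair"/>
      </xsl:catch>
    </xsl:try>
  </xsl:template>

  <!-- Hypothetical repair rule: replace an invalid Price with a default.
       The replacement is not itself validated here. -->
  <xsl:template match="Price" mode="repair">
    <Price>0.00</Price>
  </xsl:template>

  <!-- Fallback repair rule: keep the element but flag it, and process its
       children in the default mode so each child is validated in turn. -->
  <xsl:template match="*" mode="repair">
    <xsl:copy>
      <xsl:attribute name="invalid" select="true()"/>
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

The fallback rule keeps the behaviour of the first cut for any element that has no specific repair rule.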
One limitation is that all elements are validated against global element declarations in the schema, which doesn't take account of the fact that the schema might include context-sensitive (local) validation rules for some element names.
Michael Kay
Saxonica
On 19 Jul 2013, at 10:39, Costello, Roger L. wrote:
> Hi Folks,
>
> When an XML instance document is validated against an XML Schema, the validator returns valid or invalid. If invalid is returned, the XML document is rejected. If valid is returned, the XML document is accepted and processed. The decision to process inputs is made on a per-document basis. That coarse-grain, all-or-nothing approach to accepting and processing inputs is very limiting. Suppose, for example, that the XML document contains data for 1000 Books and all the Books are valid except one. Rejecting the entire document because of one bad Book is terribly inefficient.
>
> I wrote a paper that shows how to validate -- in an automated fashion -- each item of an XML document and make validation decisions on a per-item basis rather than a per-document basis. The items to be validated may be as fine-grain or as coarse-grain as desired: the item may be a single element (or even just the text within a leaf element) or the item may be an element composed of many descendant elements. Validation may be done from the outside in (the root element to the leaf nodes) or the inside out (the leaf nodes to the root element).
>
> I call the technique presented in the paper: fine-grain validation.
>
> By using the technique, your XML Schemas change from being a large template that XML documents must conform to, to a collection of rules from which we can pick and choose.
>
> Here is the paper: http://www.xfront.com/Fine-Grain-Validation.pdf
>
> /Roger