   Re: [xml-dev] Checking for elements


At 2004-05-25 11:44 -0400, Noah Genner wrote:
> I need to be able to check an XML document for some specific elements, and
> if possible produce a little report on which ones are missing. The document
> has already been validated against a dtd/schema that contains approximately
> 270 elements, but I need to check that the ~40 I need are present. If they
> aren't present I want to produce a little report that says which elements
> are not
> Can I do this with XSLT, or is there an easier way to do this?

You could do it with XSLT or with Schematron as both of these check 
business rules.

But since what you are doing is more like a grammatical check with your own 
constraints than a business-rule check with algorithmic properties, I 
would lean towards writing your own grammar and checking your instance 
twice: once against the "official" grammar in the DTD/Schema and once 
against your "additional constraints" grammar.  My personal choice for 
writing such a "mini-grammar" is ISO/IEC 19757-2 RELAX NG.

In fact, you could even relax the content constraints in your mini-grammar 
if you use it only to supplement the "official" grammar: have the 
mini-grammar check that the required elements and attributes are present, 
treating their values as simple text, and let the "official" grammar check 
that those values conform to particular data types.
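
To make the "presence only" idea concrete, here is a rough sketch in plain 
Python rather than RELAX NG. It is a stand-in for the mini-grammar, not an 
implementation of it: it only reports which required element names never 
occur anywhere in the instance, and it says nothing about where they occur 
or what they contain. The sample document and the list of required names 
are hypothetical, made up for illustration.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def missing_elements(xml_text, required):
    """Return the names in `required` that never occur in the document."""
    root = ET.fromstring(xml_text)
    # Collect the local name of every element in the tree, root included.
    present = {local_name(el.tag) for el in root.iter()}
    return [name for name in required if name not in present]


# Hypothetical instance and required-element list, for illustration only.
SAMPLE = """<order>
  <customer><name>Acme</name></customer>
  <item><sku>123</sku></item>
</order>"""

REQUIRED = ["customer", "name", "sku", "shipDate", "total"]

if __name__ == "__main__":
    for name in missing_elements(SAMPLE, REQUIRED):
        print("missing element:", name)
```

Unlike a grammar, this does not care about nesting or order, so it would 
be run after the instance has already passed the "official" validation.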

I'm not sure which would be more "politically correct", so I'm curious how 
others on this list feel about the difference between these two 
approaches.  Consider that I have a W3C Schema expression with many derived 
and explicit data types, with value constraints up the wazoo, but I want to 
check only that a subset of those constraints has been met (that subset 
containing at least all of the mandatory constructs):

(1) - write a replacement ISO/IEC 19757-2 RELAX NG schema mimicking all of 
the W3C data types for all of the value constraints, thereby negating the 
need for the W3C Schema expression for any purpose


(2) - write a supplemental ISO/IEC 19757-2 RELAX NG schema using only text 
value constraints, thereby requiring "primary validation" by the official 
W3C Schema expression and checking only "business requirement presence 
constraints" with the RELAX NG schema

Personally I would lean to (2) so as not to "compete" with the W3C Schema 
expression or have to play catch-up with any changes to the W3C Schema.

What would others do when needing to validate a subset of a published W3C 
Schema expression maintained by a third party?

........................... Ken

Public courses: Spring 2004 world tour of hands-on XSL instruction
Next: 3-day XSLT/XPath; 2-day XSL-FO - Birmingham, UK June 14,2004

World-wide on-site corporate, govt. & user group XML/XSL training.
G. Ken Holman                 mailto:gkholman@CraneSoftwrights.com
Crane Softwrights Ltd.          http://www.CraneSoftwrights.com/x/
Box 266, Kars, Ontario CANADA K0A-2E0    +1(613)489-0999 (F:-0995)
Male Breast Cancer Awareness  http://www.CraneSoftwrights.com/x/bc
Legal business disclaimers:  http://www.CraneSoftwrights.com/legal

