OASIS Mailing List Archives


Re: [xml-dev] Noob: XML validation capabilities beyond schemas

On Fri, 2011-12-09 at 10:04 -0800, John Christopher wrote:

> So far, I have been validating my XML docs with RELAX NG schemas (I
> started out with XML Schema but switched to RELAX NG, which is much
> easier to read).  I use the XML schema datatypes, enumeration, lists,
> etc. to do most of the validation I want to do.  When I need to
> validate something that is beyond the capabilities of RELAX NG, I have
> been using GNU Awk to scan the XML and do further validation, but this
> is messy and error-prone.

If your additional needs are met by W3C XML Schema (XSD), that would be
the way to go these days.  You can (for example) use regular
expressions, length and value constraints, and so on. Since you mention
awk I'll note that xmllint has some XSD support, as does Xerces.

The advantage of using XSD is that you could then also use XSLT 2 and/or
XQuery more easily, e.g. with Saxon or BaseX.

> I am also investigating schematron.  How powerful/limited is this?

In some ways it's _too_ powerful (like awk). But the ability to say
things like, if the @other attribute is present then there must be a
"reason" subelement, is for sure useful; XSD 1.1 adds some of this, but
I don't know of any plans for xmllint/libxml2 to support XSD 1.1.
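
To make the @other/"reason" rule concrete, here is a minimal sketch of the
same co-occurrence check written in plain Python with the standard library
rather than in Schematron. The function name and the sample documents are
made up for illustration; a real Schematron rule would express this as an
assertion in XPath instead.

```python
import xml.etree.ElementTree as ET


def missing_reason(xml_text):
    """Return the tags of elements that carry @other but have no <reason> child."""
    root = ET.fromstring(xml_text)
    failures = []
    for elem in root.iter():
        # The rule: if @other is present, a direct <reason> child is required.
        if "other" in elem.attrib and elem.find("reason") is None:
            failures.append(elem.tag)
    return failures


# Hypothetical sample documents, not taken from the original thread.
GOOD = '<doc><item other="x"><reason>legacy</reason></item></doc>'
BAD = '<doc><item other="x"/><item/></doc>'

print(missing_reason(GOOD))
print(missing_reason(BAD))
```

An empty list means the rule holds everywhere; each entry names an element
that broke it.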

> What is/are the standard way(s) to validate XML beyond the capabilities of schemas? 

Once you get beyond pure text processing you may need an
application-specific layer that happens after XSD (or RELAX NG)
processing. You could use XProc to specify a "validation pipeline" of
lots of small steps, if you wanted.
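
The idea of a pipeline of small steps can be sketched without XProc at all.
The following is only an illustration in Python, assuming a made-up
<orders>/<order> vocabulary with @id and @qty attributes; each step is a
small function that returns a list of error messages, and the pipeline
runs them in order. In practice the first step would usually be the
XSD or RELAX NG validation itself.

```python
import xml.etree.ElementTree as ET


def check_root(root):
    """Step 1: the document element must be <orders> (hypothetical vocabulary)."""
    return [] if root.tag == "orders" else ["root element must be <orders>"]


def check_quantities(root):
    """Step 2: every @qty must be a positive integer."""
    errors = []
    for order in root.iter("order"):
        qty = order.get("qty", "")
        if not qty.isdecimal() or int(qty) == 0:
            errors.append(f"order {order.get('id')}: bad qty {qty!r}")
    return errors


def check_unique_ids(root):
    """Step 3: @id values must not repeat."""
    seen, errors = set(), []
    for order in root.iter("order"):
        oid = order.get("id")
        if oid in seen:
            errors.append(f"order {oid}: duplicate id")
        seen.add(oid)
    return errors


STEPS = [check_root, check_quantities, check_unique_ids]


def run_pipeline(xml_text, steps=STEPS):
    """Parse once, run every step, and collect all error messages."""
    root = ET.fromstring(xml_text)
    errors = []
    for step in steps:
        errors.extend(step(root))
    return errors


print(run_pipeline('<orders><order id="1" qty="0"/><order id="1" qty="3"/></orders>'))
```

Keeping each check in its own function means a step can be added, removed,
or reordered without touching the others, which is the main attraction of
the pipeline approach.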

An example might be: @ticker attributes must be valid stock exchange
ticker symbols at the time of validation, actively trading on the @exch
exchange; fill in @sellprice, @buyprice, @quotetime and <companyName>
based on this information. This example probably requires some sort of
database lookup, or a Web service, and goes beyond normal schema
validation, but is very much a plausible part of domain-specific
validation. I'd be likely to use either XSLT or XQuery for that one.
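
For readers who want to see the shape of such a lookup-based step, here is
a rough sketch in Python rather than XSLT or XQuery. Everything beyond the
attribute and element names mentioned above is an assumption: the <stock>
element, the function name, and the in-memory QUOTES table, which stands in
for the database or Web service and contains invented values only.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a database or Web service lookup.
# The exchange code, ticker, company and prices are all invented.
QUOTES = {
    ("XNYS", "ACME"): {
        "companyName": "Acme Corp (fictional)",
        "buyprice": "10.00",
        "sellprice": "10.05",
        "quotetime": "2011-12-09T10:04:00",
    },
}


def validate_and_enrich(xml_text, quotes=QUOTES):
    """Check each (@exch, @ticker) pair and fill in quote data where it is known."""
    root = ET.fromstring(xml_text)
    errors = []
    for stock in root.iter("stock"):  # <stock> is an assumed element name
        exch, ticker = stock.get("exch"), stock.get("ticker")
        quote = quotes.get((exch, ticker))
        if quote is None:
            errors.append(f"{ticker} is not trading on {exch}")
            continue
        # Fill in the derived attributes from the lookup result.
        for attr in ("sellprice", "buyprice", "quotetime"):
            stock.set(attr, quote[attr])
        # Create <companyName> if absent, then set its text.
        name = stock.find("companyName")
        if name is None:
            name = ET.SubElement(stock, "companyName")
        name.text = quote["companyName"]
    return root, errors


SAMPLE = ('<portfolio><stock exch="XNYS" ticker="ACME"/>'
          '<stock exch="XNYS" ticker="NOPE"/></portfolio>')
root, errors = validate_and_enrich(SAMPLE)
print(errors)
print(ET.tostring(root, encoding="unicode"))
```

The point is only that this step both validates and augments the document,
which is why it sits after schema validation rather than inside it.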


Liam Quin - XML Activity Lead, W3C, http://www.w3.org/People/Quin/
Pictures from old books: http://fromoldbooks.org/

