   RE: [xml-dev] Bulk XSD validation in Java


Title: Bulk XSD validation in Java
If you choose to tackle this job using Saxon, you could do it as follows. Create a Saxon Configuration object, which acts as a container for a schema cache. Then write some kind of listener that listens for the events indicating a validation request (or do it as an HTTP servlet). When a request arrives, fire off a new thread to do the validation: you can use the standard JAXP validation interface, or Saxon's internal interface if you want more control. If you need to, you can write a URIResolver that interprets the schemaLocation attribute, giving you the option to redirect the URIs. If a document requests a schema that is already loaded in the cache, the cached copy will be used; otherwise the schema will be fetched from disk, compiled, and added to the cache.
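The compile-once, reuse-many-times idea above can be sketched with the JAXP validation classes that ship with the JDK. This is not Saxon's API: with Saxon the Configuration does the caching for you, whereas here a hypothetical `SchemaCache` class keeps compiled `Schema` objects in a map keyed by schema location. It assumes schemas are local files and leaves out the listener and threading parts.

```java
import java.io.File;
import java.io.IOException;
import java.io.StringReader;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical helper: compiles each schema once and reuses it afterwards.
public class SchemaCache {
    // Compiled Schema objects are immutable and safe to share between threads.
    private final ConcurrentMap<String, Schema> cache = new ConcurrentHashMap<>();

    // Returns the cached schema for this file, compiling it on first use.
    public Schema get(File xsd) {
        String key = xsd.getAbsoluteFile().toURI().toString();
        return cache.computeIfAbsent(key, k -> {
            try {
                // SchemaFactory is not thread-safe, so use a fresh one per compilation.
                SchemaFactory factory =
                        SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
                return factory.newSchema(xsd);
            } catch (SAXException e) {
                throw new IllegalArgumentException("Cannot compile schema " + k, e);
            }
        });
    }

    // Validates one document; a Validator is not thread-safe, so create one per call.
    public boolean isValid(File xsd, String xml) throws IOException {
        Validator validator = get(xsd).newValidator();
        try {
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the document is invalid or not well-formed
        }
    }

    // Number of distinct schemas compiled so far.
    public int size() {
        return cache.size();
    }
}
```

Each worker thread would call `isValid` (or `get(...).newValidator()`) with the schema location it extracted from the incoming document; only the first request for a given schema pays the compilation cost.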
 
The only limitations with this approach are that (a) the cache can only contain one schema component with any given name (which may be a problem if you have many unrelated no-namespace schemas), and (b) a newly-loaded schema isn't allowed to "modify" definitions that have already been used. "Modifications" here include using xs:redefine, adding to a substitution group, or extending a complex type. You can usually get around the "modifying" restrictions by preloading schemas into the cache rather than adding them incrementally as they are first encountered.
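As a rough illustration of preloading, the sketch below compiles a known set of schema documents in one step at startup, again using plain JAXP rather than Saxon's own interface. The class name `PreloadedSchemas` is hypothetical, and the sketch assumes the schemas use distinct target namespaces, since the one-component-per-name limitation described above applies in a similar way to a combined schema.

```java
import java.io.File;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical helper: compiles all known schemas together before any document arrives.
public class PreloadedSchemas {
    public static Schema compile(List<File> xsdFiles) throws SAXException {
        Source[] sources = new Source[xsdFiles.size()];
        for (int i = 0; i < sources.length; i++) {
            sources[i] = new StreamSource(xsdFiles.get(i));
        }
        SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        // newSchema(Source[]) combines the schema documents into a single Schema object.
        return factory.newSchema(sources);
    }
}
```

The resulting `Schema` can then be shared by all validation threads, each creating its own `Validator` from it.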
 
Michael Kay


From: Chris Wilper [mailto:cwilper@cs.cornell.edu]
Sent: 28 February 2006 01:54
To: xml dev
Subject: [xml-dev] Bulk XSD validation in Java

Hi all,

I've got a Java process that needs to continuously validate XML documents against the W3C schemas they indicate in their xsi:schemaLocation attributes.  The documents arrive at a high rate and must be processed as quickly as possible.  The exact schemas they employ are not known ahead of time, and several of them may be required to validate each document.

My question is, what library/libraries are appropriate in this situation, and how do I tell them to load each required schema only once?  Any advice?

Thanks,
Chris





 
