- To: Morten Grøtan <Morten.Grotan@electricfarm.no>, <xml-dev@lists.xml.org>
- Subject: RE: [xml-dev] Validating documents against XML Schema with different namespace using MSXML
- From: "Dare Obasanjo" <dareo@microsoft.com>
- Date: Mon, 30 Sep 2002 02:03:38 -0700
- Thread-index: AcJoW466I4zWkyLbT4e5w3vmK2NZbgAA4f4E
- Thread-topic: [xml-dev] Validating documents against XML Schema with different namespace using MSXML
Calling the validate() method on the document is the preferred way to validate with MSXML: it performs strict validation, whereas loading with validateOnParse set to true performs lax validation.
I could not reproduce the results of your scenarios using MSXML 4.0 SP1 when I called validate() directly. Below are the script, the schema (example.xsd) and the XML instance (example.xml) I used.
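var xmlDoc = new ActiveXObject("Msxml2.DOMDocument.4.0");
var xsdCache = new ActiveXObject("Msxml2.XMLSchemaCache.4.0");
// Register example.xsd for the empty namespace (the schema has no target namespace).
xsdCache.add("", "example.xsd");
// Load synchronously so the document is fully parsed before validate() runs.
xmlDoc.async = false;
xmlDoc.schemas = xsdCache;
xmlDoc.load("example.xml");
// Explicit, strict validation against the schemas in the cache.
var err = xmlDoc.validate();
if (err.errorCode == 0) {
    WScript.Echo("Document is valid");
} else {
    WScript.Echo("Validation error: " + err.reason);
}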
<?xml version="1.0" ?>
<xs:schema
    xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="A">
    <xs:complexType mixed="true">
      <xs:choice>
        <xs:element name="B" />
        <xs:element name="C" />
      </xs:choice>
    </xs:complexType>
  </xs:element>
</xs:schema>
<?xml version="1.0"?>
<A xmlns="http://www.example.com" >
  foo <C /> foo
</A>
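
If load-time validation has to be used anyway, one possible extra guard is to compare the namespace of the root element with the namespaces the schemas were registered under, and to reject the document when none matches. The following is only a minimal sketch of that comparison. `hasSchemaFor` is a hypothetical helper, not part of MSXML, and the assumption that `xmlDoc.documentElement.namespaceURI` yields an empty string for an element in no namespace should be checked before relying on it.

```javascript
// Returns true when rootNs is one of the namespaces a schema was registered for.
// Use "" for "no namespace", matching the first argument of xsdCache.add("", ...).
function hasSchemaFor(rootNs, registeredNamespaces) {
    for (var i = 0; i < registeredNamespaces.length; i++) {
        if (registeredNamespaces[i] === rootNs) {
            return true;
        }
    }
    return false;
}

// With the files above: the schema is registered for "", while the root
// element <A> is in http://www.example.com, so no schema covers it.
var covered = hasSchemaFor("http://www.example.com", [""]);

// Under Windows Script Host the first argument would come from
// xmlDoc.documentElement.namespaceURI (assumed to be "" when there is no namespace).
```

A document for which this check fails can be rejected up front, regardless of what the parser reports.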
-----Original Message-----
From: Morten Grøtan [mailto:Morten.Grotan@electricfarm.no]
Sent: Mon 9/30/2002 1:29 AM
To: XML Dev List (xml-dev@lists.xml.org)
Cc:
Subject: [xml-dev] Validating documents against XML Schema with different namespace using MSXML
Hi,
I've only just started using XML Schemas with the MSXML parser (4.0 SP1), and have trouble grasping a few baseline things.
The real-life task I set out to solve with XML Schemas was to be able to validate documents prior to importing to a production database. The documents are received as part of an automatic collector service, hence I don't have 100% control of what kind of files are received. I want to be able to only import 100% valid documents, according to my schema, and avoid having tons of custom validation logic within my import program.
Making the schema itself for a given instance of a typical valid document has proved to be no problem, and I am also able to enforce the constraints specified in my schema. So far so good. The trouble starts when I purposely place an invalid document among the valid ones to see what happens.
The real problem, I've found, is in the use of namespaces (what a surprise, having read lots of different articles about namespaces lately....). I have defined both a default and a target namespace in my schema and everything works just fine as long as I have the same namespace in my instance document. But as soon as there is the slightest mismatch in namespaces, the document is NOT validated against the schema, hence rendering my validation mechanism utterly useless.
I've sketched out 4 scenarios below, using the following terminology:
xsd: An XML Schema
xml: An XML instance document
xmlns: Default namespace
targetns: XML Schema target namespace
Scenario 1:
  xsd: no xmlns and no targetns
  xml: no xmlns
  Result: uses xsd and correctly says invalid xml is invalid
Scenario 2:
  xsd: no xmlns and no targetns
  xml: xmlns set to something other than targetns
  Result: does NOT use xsd and incorrectly says invalid xml is valid
Scenario 3:
  xsd: xmlns and targetns
  xml: xmlns
  Result: uses xsd and correctly says invalid xml is invalid
Scenario 4:
  xsd: xmlns and targetns
  xml: no xmlns or xmlns set to something other than targetns
  Result: does NOT use xsd and incorrectly says invalid xml is valid
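
The four outcomes above are consistent with a validator that only consults a schema whose target namespace equals the namespace of the document's root element, and that treats "no matching schema" as "nothing to check". The following toy model in plain JavaScript reproduces that pattern. It is not MSXML code; the function name `laxLoadResult`, its arguments, and the namespace `urn:example:orders` are made up for illustration.

```javascript
// Toy model of the behaviour reported in the four scenarios; not MSXML code.
// schemaNs: target namespace of the cached schema ("" = no target namespace)
// docNs: namespace of the document's root element ("" = no namespace)
// docMatchesSchema: whether the content would satisfy the schema if it were applied
function laxLoadResult(schemaNs, docNs, docMatchesSchema) {
    if (schemaNs !== docNs) {
        // No schema is registered for the document's namespace, so nothing is checked.
        return "not validated, reported as valid";
    }
    return docMatchesSchema ? "validated, valid" : "validated, invalid";
}

var ns = "urn:example:orders"; // hypothetical namespace
var results = [
    laxLoadResult("", "", false), // scenario 1
    laxLoadResult("", ns, false), // scenario 2
    laxLoadResult(ns, ns, false), // scenario 3
    laxLoadResult(ns, "", false)  // scenario 4
];
```

In this model an invalid document is only caught in scenarios 1 and 3, where the two namespaces agree.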
The way I've implemented this is using the MSXML2.XMLSchemaCache.4.0 object to add a schema, associating it with the namespace defined in the schema, and then setting the "schemas" property of the XMLDom object (MSXML2.DOMDocument.4.0) to point at the XMLSchemaCache object. Then I just call the load method of the XMLDom object, and check the parseError object of the XMLDom object for any errors.
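
As a rough sketch, the load-time approach just described might look like the following. The function name `loadWithSchema`, the file names and the namespace in the usage comment are placeholders, not taken from the real import program. The object factory is passed in as a parameter only so that the sketch can be exercised without MSXML installed; under Windows Script Host it would wrap `new ActiveXObject(...)`.

```javascript
// Sketch of load-time validation: add a schema to the cache, attach the cache
// to the document, load, then inspect parseError. All names are placeholders.
function loadWithSchema(createObject, schemaNs, xsdPath, xmlPath) {
    var cache = createObject("Msxml2.XMLSchemaCache.4.0");
    cache.add(schemaNs, xsdPath);      // associate the schema with its namespace

    var doc = createObject("Msxml2.DOMDocument.4.0");
    doc.async = false;                 // finish loading before reading parseError
    doc.validateOnParse = true;        // validate while loading
    doc.schemas = cache;
    doc.load(xmlPath);

    return {
        errorCode: doc.parseError.errorCode,
        reason: doc.parseError.reason
    };
}

// Under Windows Script Host (hypothetical file names and namespace):
// var r = loadWithSchema(function (id) { return new ActiveXObject(id); },
//                        "urn:example:orders", "orders.xsd", "incoming.xml");
// WScript.Echo(r.errorCode == 0 ? "No error reported" : r.reason);
```

An errorCode of 0 here only means that the load reported no error; as the scenarios show, it does not by itself prove that a schema was applied.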
So, why does this happen? Shouldn't MSXML be able to find out that my instance document does not share the same namespace as my schema, and hence invalidate the document? What is the purpose of the whole XML Schema shebang otherwise? If you can't be sure that invalid documents are in fact invalidated without writing tons of validation code yourself, why even bother using a schema? Needless to say, I'm slightly disillusioned right now, and may soon have a problem with my boss, trying to explain why I've wasted so much time on this schema thing without producing a useful validation mechanism...
As for the XMLSchemaCache object, I'm aware that you can add several schemas to it before attaching it to the XMLDom object, so that all of the schemas are used to validate the instance document. Admittedly it can be hard to see which schema (if any) should be responsible for invalidating the document, but in my opinion ANY of them should be able to do that, as long as the document can't be validated against it. So, if validation against at least one of the added schemas fails, the whole validation should fail. Or am I totally naïve?
Mvh / Kind regards,
Morten Grøtan
morten.grotan@electricfarm.no
Utvikler, MCSD
Electric Farm ASA, Rosenkrantz gt. 21, NO-0160 OSLO, Norway, http://www.electricfarm.no/
mob: +47 92 88 59 72, tel: +47 48 00 09 99, fax: +47 23 10 30 10