RE: [xml-dev] [Summary] Big hierarchy of XML Schema documents ... which XML Schema validators bring in the documents on demand?

Sorry for the imprecision. I am trying to distinguish real from perceived issues with XML document validation, specifically the memory and processing time needed to validate against an XSD that has many schema dependencies. On the one hand, there is heavy emphasis on using standards such as OGC's GML, and on reusing XML types defined in a specific set of schemas when developing a new schema. On the other hand, there is a need to publish conforming XML documents as quickly and leanly as possible.

Confidence in how XML validation works under these conditions is not high. For instance, there have been cases where one XML development tool reported a schema to be invalid while another tool deemed it valid. It was difficult to understand why the perceived error (yes, the tool had a bug) was even encountered while validating the XML document, because none of the components of that schema were used by the root schema. One can imagine instituting procedures that would prevent this issue from recurring. Still, is there a way to avoid the additional processing time and memory use of loading and validating the extra schemas when you just want to validate XML instance documents?

Thanks,
Nora
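
One common way to keep the schema-loading cost from recurring on every document is to compile the schema once and reuse the compiled object for each instance. The sketch below uses the JAXP validation API bundled with the JDK, which is an assumption on my part and not something the thread prescribes. The class name, method names, and the one-element schema are hypothetical stand-ins for a real schema hierarchy.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

final class CompiledSchemaReuse {
    // Hypothetical one-element schema standing in for a large schema hierarchy.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='Title' type='xs:string'/>"
      + "</xs:schema>";

    /** Reads and compiles the schema; this is the expensive step, done once. */
    static Schema compile() {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            return factory.newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("schema did not compile", e);
        }
    }

    /** Validates one instance against the already compiled schema. */
    static boolean isValid(Schema schema, String xml) {
        try {
            // A Validator is cheap to create; the Schema behind it is shared.
            Validator validator = schema.newValidator();
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Schema schema = compile(); // schema documents are processed here only
        System.out.println(isValid(schema, "<Title>My Life and Times</Title>"));
        System.out.println(isValid(schema, "<Author>Paul McCartney</Author>"));
    }
}
```

This does not stop the processor from loading imported schema documents when the schema is compiled; it only moves that work out of the per-document path.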

-----Original Message-----
From: Michael Kay [mailto:mike@saxonica.com] 
Sent: Thursday, August 05, 2010 6:15 PM
To: xml-dev@lists.xml.org
Subject: Re: [xml-dev] [Summary] Big hierarchy of XML Schema documents ... which XML Schema validators bring in the documents on demand?

I'm not sure I really understand the question. That's partly because 
your terminology isn't precise. The processor constructs a schema by 
loading schema documents. The schema is a set of schema components 
derived from the schema documents. There are constraints that schema 
documents must satisfy, and there are constraints that schema components 
must satisfy, and there are constraints on the schema as a whole (e.g. 
no two element declarations with the same name). Loosely, you can talk 
about a schema document being valid if it satisfies the constraints on 
schema documents, and you can talk about schemas being valid if all the 
components satisfy the constraints on components (though some people 
prefer not to entertain the notion that the set of all schemas includes 
some that are valid and some that are invalid).

I don't understand your idea that a "schema" might be considered valid 
by one validator and not by another. Perhaps I don't even understand 
what you mean by a "validator". It's true of course that a schema can be 
invalid because of inconsistency between its parts (for example, two 
element declarations with the same name and different definitions) where 
each of the components is valid when considered alone. Is that what you 
had in mind?

Michael Kay
Saxonica



On 05/08/2010 17:21, Dowling, Nora M. wrote:
> Hi, all,
>
> During the validation of an XML document, is there a difference between the validator "loading a schema" and "validating the schema as it is loaded"?
>
> Of particular concern is the case where the linked-in schema is considered valid by another validator but not by the one currently validating the XML document (the schema for which includes a schema that includes the schema that the current validator thinks is invalid).  How do the various validators handle this when:
>
> 1) the XML document being validated never makes use of the schema;
> 2) it makes use only of the valid parts of the linked-in schema; and
> 3) it uses the "invalid" definition in the schema?
>
> Thanks,
>
> Nora Dowling
> The MITRE Corporation
>
> -----Original Message-----
> From: Costello, Roger L. [mailto:costello@mitre.org]
> Sent: Thursday, August 05, 2010 7:55 AM
> To: xml-dev@lists.xml.org
> Subject: RE: [xml-dev] [Summary] Big hierarchy of XML Schema documents ... which XML Schema validators bring in the documents on demand?
>
> Good. Thanks Michael. I've got it.
>
> Do the other schema validators behave in the same way?
>
> Michael Kay: does SAXON behave the same way?
>
>
> /Roger
>
>
> From: Michael Glavassevich [mailto:mrglavas@ca.ibm.com]
> Sent: Wednesday, August 04, 2010 5:06 PM
> To: Costello, Roger L.
> Cc: xml-dev@lists.xml.org
> Subject: RE: [xml-dev] [Summary] Big hierarchy of XML Schema documents ... which XML Schema validators bring in the documents on demand?
>
> It depends on what you put in Library.xsd. If you import Book.xsd then it will get loaded with it:
>
> <xs:import namespace="http://www.book.org" schemaLocation="Book.xsd"/>
> <xs:complexType name="BooksType">
>   <xs:sequence>
>    <xs:element xmlns:x="http://www.book.org" ref="x:Book"/>
>   </xs:sequence>
> </xs:complexType>
>
> If the schema documents aren't directly connected:
>
> <xs:complexType name="BooksType">
>   <xs:sequence>
>    <xs:any namespace="http://www.book.org"/>
>   </xs:sequence>
> </xs:complexType>
>
> then Book.xsd won't be loaded until Xerces-J's validator hits <Book>.
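
The difference between the two Library.xsd variants above can be observed by recording which schema locations the processor asks for while compiling Library.xsd. This is a rough sketch using the JDK's JAXP `SchemaFactory`, which compiles a schema up front; it is not the Xerces-J instance-validation path described in the message, so it only shows what compiling Library.xsd pulls in. The class name, the helper method, and the simplified Book.xsd content are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

final class ImportLoadingDemo {
    static final String XS = "xmlns:xs='http://www.w3.org/2001/XMLSchema'";

    // Book.xsd, simplified for this sketch: Book is just a string.
    static final String BOOK_XSD =
        "<xs:schema " + XS + " targetNamespace='http://www.book.org'>"
      + "<xs:element name='Book' type='xs:string'/></xs:schema>";

    // Library.xsd, variant 1: explicit xs:import plus an element reference.
    static final String LIBRARY_WITH_IMPORT =
        "<xs:schema " + XS + " targetNamespace='http://www.library.org'"
      + " xmlns:x='http://www.book.org'>"
      + "<xs:import namespace='http://www.book.org' schemaLocation='Book.xsd'/>"
      + "<xs:complexType name='BooksType'><xs:sequence>"
      + "<xs:element ref='x:Book'/>"
      + "</xs:sequence></xs:complexType></xs:schema>";

    // Library.xsd, variant 2: a wildcard, no direct link to Book.xsd.
    static final String LIBRARY_WITH_WILDCARD =
        "<xs:schema " + XS + " targetNamespace='http://www.library.org'>"
      + "<xs:complexType name='BooksType'><xs:sequence>"
      + "<xs:any namespace='http://www.book.org'/>"
      + "</xs:sequence></xs:complexType></xs:schema>";

    /** Compiles the given Library.xsd and returns the schema locations requested. */
    static List<String> locationsRequested(String libraryXsd) {
        List<String> requested = new ArrayList<>();
        try {
            Path dir = Files.createTempDirectory("xsd-demo");
            Files.write(dir.resolve("Book.xsd"), BOOK_XSD.getBytes(StandardCharsets.UTF_8));
            Path library = dir.resolve("Library.xsd");
            Files.write(library, libraryXsd.getBytes(StandardCharsets.UTF_8));

            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            // Record each request, then return null to keep the default lookup.
            factory.setResourceResolver((type, namespace, publicId, systemId, baseUri) -> {
                requested.add(String.valueOf(systemId));
                return null;
            });
            factory.newSchema(new StreamSource(library.toFile()));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return requested;
    }

    public static void main(String[] args) {
        System.out.println("with xs:import: " + locationsRequested(LIBRARY_WITH_IMPORT));
        System.out.println("with xs:any:    " + locationsRequested(LIBRARY_WITH_WILDCARD));
    }
}
```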
>
> Thanks.
>
> Michael Glavassevich
> XML Parser Development
> IBM Toronto Lab
> E-mail: mrglavas@ca.ibm.com
> E-mail: mrglavas@apache.org
>
> "Costello, Roger L."<costello@mitre.org>  wrote on 08/04/2010 04:33:41 PM:
>
>    
>> Hi Michael,
>>
>> Sorry for my misunderstanding. Let me be sure that I now understand
>> correctly.  
>>
>> Consider this XML document:
>>
>> <?xml version="1.0"?>
>> <Library xmlns="http://www.library.org"
>>           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>>           xsi:schemaLocation=
>>                      "http://www.library.org
>>                       Library.xsd">
>>      <Books>
>>          <Book xmlns="http://www.book.org"
>>                xsi:schemaLocation=
>>                             "http://www.book.org
>>                              Book.xsd">
>>                  <Title>My Life and Times</Title>
>>                  <Author>Paul McCartney</Author>
>>                  <Date>1998</Date>
>>                  <ISBN>1-56592-235-2</ISBN>
>>                  <Publisher>Macmillan Publishing</Publisher>
>>          </Book>
>>          ...
>>      </Books>
>> </Library>
>>
>> Xerces-J will not read Book.xsd until it gets to the <Book> element.
>> It will read Library.xsd immediately. Is that correct?
>>
>> /Roger
>>
>>
>> From: Michael Glavassevich [mailto:mrglavas@ca.ibm.com]
>> Sent: Wednesday, August 04, 2010 3:43 PM
>> To: xml-dev@lists.xml.org
>> Subject: Re: [xml-dev] [Summary] Big hierarchy of XML Schema
>> documents ... which XML Schema validators bring in the documents on demand?
>>
>> Roger,
>>
>> "Costello, Roger L."<costello@mitre.org>  wrote on 07/31/2010 01:50:37 PM:
>>
>>      
>>> Hi Folks,
>>>
>>> Thanks to Michael Glavassevich, Michael Kay, and Boris Kolpackov for
>>> your excellent inputs.
>>>
>>> Here's what I learned (please correct any errors):
>>>
>>> Suppose that your XML Schema imports/includes some XML Schemas, and
>>> they import/include some XML Schemas, and so on. Thus, there is a
>>> big hierarchy of XML Schema documents.
>>>
>>> When does a validator read the XML Schema documents? Here are two
>>> ways that XML Schema validators could be implemented:
>>>
>>> 1. Just-in-time loading (a.k.a. on-demand loading): the validator
>>> reads an XML Schema document during instance validation, when a
>>> component from the relevant namespace is first encountered.
>>>
>>> 2. Eager loading: all XML Schema documents are (recursively) read
>>> prior to validating the XML instance document.
>>>        
>> This is quite a different statement than what you originally had and
>> is no longer an accurate description of how / when Xerces-J
>> dynamically loads schemas. There could be multiple schema location
>> hints (i.e. xsi:schemaLocation) in a document and Xerces-J won't
>> load those schema documents unless the validator hits an element,
>> attribute or type which has the target namespace of those schemas.
>> If they don't import each other loading one of them won't cause the
>> others to be loaded. The others might be loaded later if they're needed.
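
The on-demand behaviour described above can be probed by logging which schema documents the parser requests while it validates an instance that carries two xsi:schemaLocation hints but uses only one of the namespaces. This is a sketch under assumptions: it uses the JAXP SAX parser bundled with the JDK (derived from Xerces-J), and the class name, the `http://www.unused.org` namespace, and the file names are hypothetical. Other parsers or versions may behave differently.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class SchemaHintDemo {
    // Builds a tiny schema declaring one string-typed element in the namespace.
    static String schemaFor(String namespace, String element) {
        return "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
             + " targetNamespace='" + namespace + "'>"
             + "<xs:element name='" + element + "' type='xs:string'/></xs:schema>";
    }

    // Two location hints, but only the library namespace is actually used.
    static final String INSTANCE =
        "<Library xmlns='http://www.library.org'"
      + " xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'"
      + " xsi:schemaLocation='http://www.library.org Library.xsd"
      + " http://www.unused.org Unused.xsd'>City Library</Library>";

    static void write(Path file, String text) throws IOException {
        Files.write(file, text.getBytes(StandardCharsets.UTF_8));
    }

    /** Validates the instance and returns the system IDs the parser asked for. */
    static List<String> schemasRequested() {
        List<String> requested = new ArrayList<>();
        try {
            Path dir = Files.createTempDirectory("hint-demo");
            write(dir.resolve("Library.xsd"), schemaFor("http://www.library.org", "Library"));
            write(dir.resolve("Unused.xsd"), schemaFor("http://www.unused.org", "Unused"));
            Path instance = dir.resolve("library.xml");
            write(instance, INSTANCE);

            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setValidating(true);
            SAXParser parser = factory.newSAXParser();
            // Standard JAXP property: validate against W3C XML Schema.
            parser.setProperty("http://java.sun.com/xml/jaxp/properties/schemaLanguage",
                               XMLConstants.W3C_XML_SCHEMA_NS_URI);
            parser.parse(instance.toFile(), new DefaultHandler() {
                @Override
                public InputSource resolveEntity(String publicId, String systemId) {
                    requested.add(String.valueOf(systemId));
                    return null; // keep the parser's default lookup
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException(e);
        }
        return requested;
    }

    public static void main(String[] args) {
        System.out.println(schemasRequested());
    }
}
```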
>>
>>      
>>> The following XML Schema validators all use eager loading:
>>>
>>>      SAXON (Java)
>>>
>>>      SAXON (.NET)
>>>
>>>      XERCES (Java)
>>>
>>>      XERCES (C++)
>>>
>>>      XERCES (Perl)
>>>
>>>      LIBXML (Gnome's libxml2)
>>>
>>>      MSXML
>>>
>>>      XSV
>>>
>>>
>>> /Roger
>>>
>>> _______________________________________________________________________
>>>
>>> XML-DEV is a publicly archived, unmoderated list hosted by OASIS
>>> to support XML implementation and development. To minimize
>>> spam in the archives, you must subscribe before posting.
>>>
>>> [Un]Subscribe/change address: http://www.oasis-open.org/mlmanage/
>>> Or unsubscribe: xml-dev-unsubscribe@lists.xml.org
>>> subscribe: xml-dev-subscribe@lists.xml.org
>>> List archive: http://lists.xml.org/archives/xml-dev/
>>> List Guidelines: http://www.oasis-open.org/maillists/guidelines.php
>>>        
>> Thanks.
>>
>> Michael Glavassevich
>> XML Parser Development
>> IBM Toronto Lab
>> E-mail: mrglavas@ca.ibm.com
>> E-mail: mrglavas@apache.org
>>

