Re: [xml-dev] Xerces, schema caching, and namespaces

To: Neeraj.Bajaj@Sun.COM
Subject: Re: [xml-dev] Xerces, schema caching, and namespaces
From: Neeraj Bajaj <Neeraj.Bajaj@Sun.COM>
Date: Wed, 22 Dec 2004 10:53:40 +0530
Cc: Daniel McLean <daniel@mds.rmit.edu.au>, xml-dev@lists.xml.org
In-reply-to: <41C8FFA8.4040304@sun.com>
References: <20041222044241.GL13796@io.mds.rmit.edu.au><41C8FFA8.4040304@sun.com>
Reply-to: Neeraj.Bajaj@Sun.COM
User-agent: Mozilla Thunderbird 0.8 (Windows/20040913)

Ah... What I wrote applies to Xerces-J, but it seems you are talking about
Xerces-C++. I don't know how Xerces-C++ behaves here, but I don't think it
would be different from Xerces-J.

Neeraj

Neeraj Bajaj wrote:

>
>
> Daniel McLean wrote:
>
>> The Xerces-C++ parser has the capability of caching grammars for
>> subsequent reuse.  Depending on the complexity of the grammar and the
>> instance documents, doing so can give a significant performance boost.
>> However, the way W3C Schema grammars are cached seems a bit strange to me.
>>
>> All "no-namespace" schemas are considered equivalent: a no-namespace
>> schema is stored in the pool of cached grammars using the key "".
>> This has ... problematic effects.
>>
>>  
>>
> I worked on caching a long time back. IIRC this is the default behavior,
> and it can be changed: by default Xerces stores each grammar using its
> targetNamespace as the key, but you can write your own logic to specify
> the caching criteria, e.g. targetNamespace + schemaLocation.
>
> You can also have multiple grammar pools holding different types of
> grammars, but then you have to write logic to set the appropriate pool
> for each instance document parsed, which may be cumbersome. You might
> like to check the JAXP 1.3 Schema Validation Framework [1][2], which
> looks at caching in an entirely different way.
>
> Neeraj
>
> [1] http://java.sun.com/j2se/1.5.0/docs/api/javax/xml/validation/package-summary.html
> [2] https://jaxp.dev.java.net
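To make the JAXP 1.3 suggestion concrete, here is a minimal Java sketch, not
anything taken from Xerces itself. In javax.xml.validation the application
compiles each schema into a Schema object and decides for itself how to keep
track of those objects, so two no-namespace schemas never compete for a shared
"" key. The class name SchemaCacheSketch, the names "A" and "B", and the two
inline schemas are assumptions made up for this illustration; only the
javax.xml.validation calls are standard API. It applies to Java only, not to
Xerces-C++.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

public class SchemaCacheSketch {
    // Two hypothetical no-namespace schemas, mirroring schemas A and B in the thread.
    static final String XSD_A =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='A_root' type='xs:string'/></xs:schema>";
    static final String XSD_B =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='B_root' type='xs:string'/></xs:schema>";

    private final SchemaFactory factory =
        SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);

    // Compiled schemas, keyed by a name the application chooses (not by namespace).
    private final Map<String, Schema> compiled = new HashMap<>();

    /** Compiles the schema text once and stores it under the given name. */
    void register(String name, String xsdText) {
        try {
            compiled.put(name,
                factory.newSchema(new StreamSource(new StringReader(xsdText))));
        } catch (SAXException e) {
            throw new IllegalArgumentException("schema did not compile: " + name, e);
        }
    }

    /** Validates the document against the named schema; false on a validation error. */
    boolean isValid(String name, String xml) {
        Schema schema = compiled.get(name);
        if (schema == null) {
            throw new IllegalArgumentException("unknown schema: " + name);
        }
        try {
            // With no ErrorHandler set, validation errors surface as SAXException.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        SchemaCacheSketch cache = new SchemaCacheSketch();
        cache.register("A", XSD_A);
        cache.register("B", XSD_B);
        System.out.println(cache.isValid("A", "<A_root>x</A_root>")); // true
        System.out.println(cache.isValid("B", "<B_root>y</B_root>")); // true
        System.out.println(cache.isValid("A", "<B_root>y</B_root>")); // false
    }
}
```

The price is the one already mentioned for multiple grammar pools: the
application has to know which schema each instance document is meant to be
validated against.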
>
>> Rather than invent a new example, I'll pinch one from the Xerces mailing
>> list:
>>  
>>
>>> First we parse a document based on schema A with root element A_root.
>>> The schema is cached on "". Everything is fine.
>>> Then we parse another document based on schema A. The cache finds the
>>> schema for "" and validates. Everything is fine.
>>> THEN we parse a document based on schema B with root element B_root.
>>> The parser looks in the cache, finds the schema for "" (type A) and
>>> validates.
>>> This of course results in a shitload of errors and a failed parse.
>>>   
>>
>> [from http://marc.theaimsgroup.com/?l=xerces-c-dev&m=107598912614145&w=2]
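The failure in the quoted example can be reproduced without any parser at all.
What follows is a minimal Java sketch and not Xerces code: Grammar,
GrammarCache and the two key functions are hypothetical names invented for
illustration, and a real grammar pool is considerably more involved. It models
a cache keyed on target namespace, with "" standing for "no namespace", next
to a variant keyed on target namespace plus schema location.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;

public class GrammarCacheSketch {

    /** Hypothetical stand-in for a compiled schema grammar. */
    static final class Grammar {
        final String targetNamespace; // "" models a no-namespace schema
        final String location;
        final String rootElement;

        Grammar(String targetNamespace, String location, String rootElement) {
            this.targetNamespace = targetNamespace;
            this.location = location;
            this.rootElement = rootElement;
        }
    }

    /** Cache whose key is computed by a pluggable function (an assumption of this sketch). */
    static final class GrammarCache {
        private final Map<String, Grammar> pool = new HashMap<>();
        private final BiFunction<String, String, String> keyOf;

        GrammarCache(BiFunction<String, String, String> keyOf) {
            this.keyOf = keyOf;
        }

        /** Returns the grammar cached under the computed key, "loading" it only on a miss. */
        Grammar lookup(String targetNamespace, String location, String rootElement) {
            String key = keyOf.apply(targetNamespace, location);
            return pool.computeIfAbsent(
                key, k -> new Grammar(targetNamespace, location, rootElement));
        }
    }

    /** Key is the target namespace alone, as in the default behaviour described in the thread. */
    static GrammarCache keyedByNamespace() {
        return new GrammarCache((ns, loc) -> ns);
    }

    /** Key is namespace plus location; a space separates them since it cannot occur in a namespace URI. */
    static GrammarCache keyedByNamespaceAndLocation() {
        return new GrammarCache((ns, loc) -> ns + " " + loc);
    }

    public static void main(String[] args) {
        GrammarCache byNs = keyedByNamespace();
        byNs.lookup("", "A.xsd", "A_root");
        // Schema B also has no namespace, so the lookup hits the entry cached for A.
        System.out.println(byNs.lookup("", "B.xsd", "B_root").rootElement); // A_root

        GrammarCache byBoth = keyedByNamespaceAndLocation();
        byBoth.lookup("", "A.xsd", "A_root");
        // With the location in the key, B gets its own entry.
        System.out.println(byBoth.lookup("", "B.xsd", "B_root").rootElement); // B_root
    }
}
```

Keying on location has its own cost: the same schema reached through two
different locations would be compiled and cached twice.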
>>
>> To me, this behaviour seems wrong.  However, the Xerces folk think that
>> it's right:
>>  
>>
>>> you shouldn't use schema caching if you have different schemas 
>>> sharing the same namespace (being this the empty one or not). A 
>>> namespace URI is a "domain", is like saying "when I am talking about 
>>> music a record is something that has songs in it; when talking about 
>>> sports a record is the best performance". You are using two schemas, 
>>> sharing the same "domain" label: nothing wrong with that, provided 
>>> that you don't mix them.
>>>   
>>
>> [from http://marc.theaimsgroup.com/?l=xerces-c-dev&m=107599141217514&w=2]
>>
>> What's the right answer?
>>
>> An additional but related question: is Xerces right to cache W3C Schemas
>> that _do_ have a target namespace using that target namespace as the key?
>> For that to be correct, the target namespace of the schema must be
>> considered to play an equivalent role to a DTD's PUBLIC identifier ...
>> which doesn't seem unreasonable, but may not be true.
>>
>> Daniel
>>
>> -----------------------------------------------------------------
>> The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
>> initiative of OASIS <http://www.oasis-open.org>
>>
>> The list archives are at http://lists.xml.org/archives/xml-dev/
>>
>> To subscribe or unsubscribe from this list use the subscription
>> manager: <http://www.oasis-open.org/mlmanage/index.php>
>>
>>  
>>
