Ah... What I wrote applies to Xerces-J, but it seems you are talking about
Xerces-C++.
I don't know how Xerces-C++ behaves, but I don't think it would be
different from Xerces-J.
Neeraj
Neeraj Bajaj wrote:
>
>
> Daniel McLean wrote:
>
>> The Xerces-C++ parser has the capability of caching grammars for
>> subsequent reuse. Depending on the complexity of the grammar and the
>> instance documents, doing so can give a significant performance boost.
>> However, the way W3C Schema grammars are cached seems a bit strange
>> to me.
>>
>> All "no-namespace" schemas are considered equivalent: a no-namespace
>> schema is stored in the pool of cached grammars using the key "".
>> This has ... problematic effects.
>>
>>
>>
> I worked on caching a long time ago. IIRC this is the default behavior,
> which can be changed.
> You can write your own logic to specify the caching criteria, for
> example targetNamespace + schemaLocation.
> The default behavior of Xerces is to store the grammar using
> targetNamespace as the key.
> You can also have multiple grammar pools holding different types of
> grammars, but then you have to write logic to set the appropriate pool
> for each instance document parsed, which may be cumbersome. You might
> like to check the JAXP 1.3 Schema Validation Framework,
> which approaches caching in an entirely different way.
>
> Neeraj
>
> [1]
> http://java.sun.com/j2se/1.5.0/docs/api/javax/xml/validation/package-summary.html
>
> [2] https://jaxp.dev.java.net
>
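
To make the JAXP 1.3 suggestion above concrete, here is a minimal sketch
using the javax.xml.validation API. The class name, the helper methods and
the two inline no-namespace schemas are made up for illustration; they are
not from the thread. The point is only that the application holds each
compiled Schema object itself, so at the API level there is no shared pool
in which two no-namespace schemas could be confused.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class JaxpSchemaSketch {
    // Hypothetical no-namespace schemas standing in for "schema A" and "schema B".
    static final String SCHEMA_A =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='A_root' type='xs:string'/></xs:schema>";
    static final String SCHEMA_B =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='B_root' type='xs:string'/></xs:schema>";

    // Compile a schema once; the returned Schema can be kept and reused.
    static Schema compile(String xsd) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            return factory.newSchema(new StreamSource(new StringReader(xsd)));
        } catch (SAXException e) {
            throw new IllegalStateException("schema did not compile", e);
        }
    }

    // Validate one document against one explicitly chosen schema.
    static boolean isValid(Schema schema, String xml) {
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the default error handler throws on validation errors
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Schema a = compile(SCHEMA_A);
        Schema b = compile(SCHEMA_B);
        System.out.println(isValid(a, "<A_root>x</A_root>"));
        System.out.println(isValid(b, "<B_root>x</B_root>"));
        System.out.println(isValid(a, "<B_root>x</B_root>"));
    }
}
```

The trade-off is that the application, not the parser, has to decide which
Schema goes with which instance document.
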
>> Rather than invent a new example, I'll pinch one from the Xerces mailing
>> list:
>>
>>
>>> First we parse a document based on schema A with root element A_root.
>>> The schema is cached on "". Everything is fine.
>>> Then we parse another document based on schema A. The cache finds the
>>> schema for "" and validates. Everything is fine.
>>> THEN we parse a document based on schema B with root element B_root.
>>> The parser looks in the cache, finds the schema for "" (type A) and
>>> validates.
>>> This of course results in a shitload of errors and a failed parse.
>>>
>>
>> [from
>> http://marc.theaimsgroup.com/?l=xerces-c-dev&m=107598912614145&w=2]
>>
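
The scenario quoted above can be reproduced with a toy model. This is not
the Xerces API: Grammar, NamespaceKeyedPool and CompositeKeyedPool are
hypothetical names, and the "grammar" only remembers which root element it
declares. It is just a sketch of what happens when the cache key is the
target namespace alone, compared with the targetNamespace + schemaLocation
key mentioned earlier in the thread.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class GrammarPoolSketch {
    // Stand-in for a compiled grammar (hypothetical, not a Xerces class).
    static final class Grammar {
        final String targetNamespace; // "" means no namespace
        final String location;
        final String rootElement;

        Grammar(String targetNamespace, String location, String rootElement) {
            this.targetNamespace = targetNamespace;
            this.location = location;
            this.rootElement = rootElement;
        }
    }

    // Cache keyed by target namespace only, as described in the thread.
    static final class NamespaceKeyedPool {
        private final Map<String, Grammar> cache = new HashMap<>();

        // Returns the grammar already cached under this namespace, if any;
        // otherwise caches the candidate and returns it.
        Grammar resolve(Grammar candidate) {
            return cache.computeIfAbsent(candidate.targetNamespace, k -> candidate);
        }
    }

    // Cache keyed by (target namespace, schema location).
    static final class CompositeKeyedPool {
        private final Map<List<String>, Grammar> cache = new HashMap<>();

        Grammar resolve(Grammar candidate) {
            List<String> key = Arrays.asList(candidate.targetNamespace, candidate.location);
            return cache.computeIfAbsent(key, k -> candidate);
        }
    }

    public static void main(String[] args) {
        Grammar a = new Grammar("", "a.xsd", "A_root");
        Grammar b = new Grammar("", "b.xsd", "B_root");

        NamespaceKeyedPool byNamespace = new NamespaceKeyedPool();
        byNamespace.resolve(a);
        // Schema B is asked for, but the entry for "" already holds schema A.
        System.out.println(byNamespace.resolve(b).rootElement); // prints A_root

        CompositeKeyedPool byBoth = new CompositeKeyedPool();
        byBoth.resolve(a);
        // The location is part of the key, so schema B gets its own entry.
        System.out.println(byBoth.resolve(b).rootElement); // prints B_root
    }
}
```

One caveat with the composite key: the same schema reached through two
different locations would be cached twice.
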
>> To me, this behaviour seems wrong. However, the Xerces folk think that
>> it's right:
>>
>>
>>> you shouldn't use schema caching if you have different schemas
>>> sharing the same namespace (being this the empty one or not). A
>>> namespace URI is a "domain", is like saying "when I am talking about
>>> music a record is something that has songs in it; when talking about
>>> sports a record is the best performance". You are using two schemas,
>>> sharing the same "domain" label: nothing wrong with that, provided
>>> that you don't mix them.
>>>
>>
>> [from
>> http://marc.theaimsgroup.com/?l=xerces-c-dev&m=107599141217514&w=2]
>>
>> What's the right answer?
>>
>> An additional but related question: is Xerces right to cache W3C Schemas
>> that _do_ target namespaces based on the target namespace of the
>> schemas?
>> For that to be correct, the target namespace of the schema must be
>> considered to play an equivalent role to a DTD's PUBLIC identifier ...
>> which doesn't seem unreasonable, but may not be true.
>>
>> Daniel
>>
>> -----------------------------------------------------------------
>> The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
>> initiative of OASIS <http://www.oasis-open.org>
>>
>> The list archives are at http://lists.xml.org/archives/xml-dev/
>>
>> To subscribe or unsubscribe from this list use the subscription
>> manager: <http://www.oasis-open.org/mlmanage/index.php>
>>
>>
>>
>