Ah... What I wrote is for Xerces-J, but it seems you are talking about Xerces-C.
I don't know what the behavior of Xerces-C is, but I don't think it would
be different.
Neeraj Bajaj wrote:
> Daniel McLean wrote:
>> The Xerces-C++ parser has the capability of caching grammars for
>> subsequent reuse. Depending on the complexity of the grammar and the
>> instance documents, doing so can give a significant performance boost.
>> However, the way W3C Schema grammars are cached seems a bit strange
>> to me.
>> All "no-namespace" schemas are considered equivalent: a no-namespace
>> schema is stored in the pool of cached grammars using the key "".
>> This has ... problematic effects.
> I worked on caching a long time back. IIRC this is the default behavior,
> which can be changed.
> You can write your own logic to specify the criteria for caching, for
> example targetNamespace + SchemaLocation.
> The default behavior of Xerces is to store the grammar using
> targetNamespace as the key.
> You can also have multiple grammar pools holding different types of
> grammars, but
> then you have to write logic to set the appropriate pool for each instance
> document parsed, which
> may be cumbersome. You might like to check JAXP 1.3 Schema Validation,
> which looks at the caching behavior in an entirely different way.
>  https://jaxp.dev.java.net
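
To make the "targetNamespace + SchemaLocation" suggestion concrete, here is a minimal sketch in Python. It is a toy model, not the Xerces-J or Xerces-C API: the class GrammarPool, the two key functions, and the "first cached grammar wins" rule are all assumptions made for illustration.

```python
# Toy model only: this is NOT the Xerces API. All names are hypothetical.
from typing import Callable, Dict, Hashable, Optional

KeyFn = Callable[[str, str], Hashable]


def by_namespace(target_namespace: str, schema_location: str) -> Hashable:
    # Models the default described above: the target namespace alone is the key,
    # so every no-namespace schema maps to "".
    return target_namespace


def by_namespace_and_location(target_namespace: str, schema_location: str) -> Hashable:
    # Models the suggested alternative: namespace plus schema location.
    return (target_namespace, schema_location)


class GrammarPool:
    """A dictionary of cached grammars with a pluggable key function."""

    def __init__(self, key_fn: KeyFn = by_namespace) -> None:
        self._key_fn = key_fn
        self._grammars: Dict[Hashable, str] = {}

    def cache(self, target_namespace: str, schema_location: str, grammar: str) -> None:
        # Assumption: an existing entry is kept, so the first grammar cached wins.
        key = self._key_fn(target_namespace, schema_location)
        self._grammars.setdefault(key, grammar)

    def lookup(self, target_namespace: str, schema_location: str) -> Optional[str]:
        return self._grammars.get(self._key_fn(target_namespace, schema_location))


default_pool = GrammarPool()  # keyed by namespace only
default_pool.cache("", "a.xsd", "grammar A")
default_pool.cache("", "b.xsd", "grammar B")

composite_pool = GrammarPool(by_namespace_and_location)
composite_pool.cache("", "a.xsd", "grammar A")
composite_pool.cache("", "b.xsd", "grammar B")

print(default_pool.lookup("", "b.xsd"))    # grammar A (collision on "")
print(composite_pool.lookup("", "b.xsd"))  # grammar B
```

With the composite key, two no-namespace schemas no longer share one cache slot. The cost is that the same schema reached through two different locations would be cached twice.
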
>> Rather than invent a new example, I'll pinch one from the Xerces mailing list:
>>> First we parse a document based on schema A with root element A_root.
>>> The schema is cached on "". Everything is fine.
>>> Then we parse another document based on schema A. The cache finds the
>>> schema for "" and validates. Everything is fine.
>>> THEN we parse a document based on schema B with root element B_root.
>>> The parser looks in the cache, finds the schema for "" (type A) and
>>> This of course results in a shitload of errors and a failed parse.
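
The three parses described above can be replayed with a small toy model. This is only a sketch under stated assumptions, not Xerces code: the SCHEMAS table, the parse function, and the file names A.xsd and B.xsd are hypothetical, and "validation" is reduced to checking whether the root element is declared.

```python
# Toy model of the scenario above; NOT Xerces code. All names are hypothetical.
from typing import Dict, List, Set

# Each no-namespace schema is reduced to the set of root elements it declares.
SCHEMAS: Dict[str, Set[str]] = {
    "A.xsd": {"A_root"},
    "B.xsd": {"B_root"},
}


def parse(cache: Dict[str, Set[str]], root: str, schema_file: str) -> List[str]:
    """Validate a no-namespace document, caching its grammar under the key ""."""
    namespace = ""  # neither schema has a target namespace
    if namespace not in cache:
        # First schema seen for this namespace is loaded and cached.
        cache[namespace] = SCHEMAS[schema_file]
    declared = cache[namespace]  # later parses reuse whatever is cached
    return [] if root in declared else [f"unknown element '{root}'"]


cache: Dict[str, Set[str]] = {}
results = [
    parse(cache, "A_root", "A.xsd"),  # schema A cached on ""
    parse(cache, "A_root", "A.xsd"),  # cache hit, validates
    parse(cache, "B_root", "B.xsd"),  # cache hit on "", but it is schema A
]
for errors in results:
    print(errors)
```

The third parse never loads B.xsd, because the lookup on "" already succeeds, so the document is checked against the wrong grammar.
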
>> To me, this behaviour seems wrong. However, the Xerces folk think that
>> it's right:
>>> you shouldn't use schema caching if you have different schemas
>>> sharing the same namespace (being this the empty one or not). A
>>> namespace URI is a "domain", is like saying "when I am talking about
>>> music a record is something that has songs in it; when talking about
>>> sports a record is the best performance". You are using two schemas,
>>> sharing the same "domain" label: nothing wrong with that, provided
>>> that you don't mix them.
>> What's the right answer?
>> An additional but related question: is Xerces right to cache W3C Schemas
>> that _do_ target namespaces based on the target namespace of the
>> For that to be correct, the target namespace of the schema must be
>> considered to play an equivalent role to a DTD's PUBLIC identifier ...
>> which doesn't seem unreasonable, but may not be true.
>> The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
>> initiative of OASIS <http://www.oasis-open.org>
>> The list archives are at http://lists.xml.org/archives/xml-dev/
>> To subscribe or unsubscribe from this list use the subscription
>> manager: <http://www.oasis-open.org/mlmanage/index.php>