xml-dev - RE: [xml-dev] Xerces, schema caching, and namespaces


To: "'Daniel McLean'" <daniel@mds.rmit.edu.au>,<xml-dev@lists.xml.org>
Subject: RE: [xml-dev] Xerces, schema caching, and namespaces
From: "Michael Kay" <mike@saxonica.com>
Date: Wed, 22 Dec 2004 11:41:14 -0000
In-reply-to: <20041222044241.GL13796@io.mds.rmit.edu.au>
Thread-index: AcTn4PCwYbMtG5dzRgmErbmJGoBxwQAOVPUA

It's very much a design assumption in XML Schema that a namespace has only
one schema.

It's a slightly odd assumption really, because it's at variance with another
design principle of XML Schema, which is that the same document can be
validated against different rules depending on the user's preferences - for
example the sender of a document might apply stronger validation than the
recipient. But the assumption is there.

The assumption seems to be weaker in the no-namespace case (the
"not-a-namespace"), otherwise facilities like chameleon schemas wouldn't be
provided. But it's still there. You get into trouble, for example, if you try
to do a schema-aware transformation or query from one no-namespace schema to
a different no-namespace schema.

I've adopted the same caching principles for schemas in Saxon. I'm not
entirely happy with the effects this causes, but I don't see much of an
alternative.
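
To make the caching effect concrete, here is a minimal Python sketch of a schema cache keyed only by target namespace, replaying the three-parse scenario from the quoted message below. It is a toy model under stated assumptions, not Xerces or Saxon code: `Schema`, `NamespaceKeyedCache` and `validate` are hypothetical names, and a "schema" is reduced to a set of permitted root element names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Schema:
    """Toy stand-in for a compiled schema (hypothetical, for illustration)."""
    name: str
    target_namespace: str  # "" stands for "no target namespace"
    roots: frozenset       # root element names this schema accepts


class NamespaceKeyedCache:
    """Holds at most one schema per target namespace."""

    def __init__(self):
        self._pool = {}

    def get_or_load(self, schema):
        # The first schema seen for a namespace wins; a later schema
        # with the same namespace is silently ignored.
        return self._pool.setdefault(schema.target_namespace, schema)


def validate(root_name, requested_schema, cache):
    """Return (name of the schema actually used, root element accepted?)."""
    schema = cache.get_or_load(requested_schema)
    return schema.name, root_name in schema.roots


schema_a = Schema("A", "", frozenset({"A_root"}))
schema_b = Schema("B", "", frozenset({"B_root"}))

cache = NamespaceKeyedCache()
results = [
    validate("A_root", schema_a, cache),  # schema A cached under ""
    validate("A_root", schema_a, cache),  # cache hit, still schema A
    validate("B_root", schema_b, cache),  # asks for B, gets cached A
]
for used, ok in results:
    print(used, ok)
```

In this sketch the third parse requests schema B but is checked against the cached schema A, because both schemas share the key "".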

Michael Kay
http://www.saxonica.com/

> -----Original Message-----
> From: Daniel McLean [mailto:daniel@mds.rmit.edu.au] 
> Sent: 22 December 2004 04:43
> To: xml-dev@lists.xml.org
> Subject: [xml-dev] Xerces, schema caching, and namespaces
> 
> The Xerces-C++ parser has the capability of caching grammars for
> subsequent reuse.  Depending on the complexity of the grammar and the
> instance documents, doing so can give a significant performance boost.
> However, the way W3C Schema grammars are cached seems a bit strange to me.
> 
> All "no-namespace" schemas are considered equivalent: a no-namespace
> schema is stored in the pool of cached grammars using the key "".
> This has ... problematic effects.
> 
> Rather than invent a new example, I'll pinch one from the Xerces mailing
> list:
> > First we parse a document based on schema A with root element A_root.
> > The schema is cached on "". Everything is fine.
> > Then we parse another document based on schema A. The cache finds the
> > schema for "" and validates. Everything is fine.
> > THEN we parse a document based on schema B with root element B_root.
> > The parser looks in the cache, finds the schema for "" (type A) and validates.
> > This of course results in a shitload of errors and a failed parse.
>  [from http://marc.theaimsgroup.com/?l=xerces-c-dev&m=107598912614145&w=2]
> 
> To me, this behaviour seems wrong.  However, the Xerces folk think that
> it's right:
> > you shouldn't use schema caching if you have different schemas sharing the
> > same namespace (being this the empty one or not). A namespace URI is a
> > "domain", is like saying "when I am talking about music a record is
> > something that has songs in it; when talking about sports a record is the
> > best performance". You are using two schemas, sharing the same "domain"
> > label: nothing wrong with that, provided that you don't mix them.
>  [from http://marc.theaimsgroup.com/?l=xerces-c-dev&m=107599141217514&w=2]
> 
> What's the right answer?
> 
> An additional but related question: is Xerces right to cache W3C Schemas
> that _do_ target namespaces based on the target namespace of the schemas?
> For that to be correct, the target namespace of the schema must be
> considered to play an equivalent role to a DTD's PUBLIC identifier ...
> which doesn't seem unreasonable, but may not be true.
> 
> Daniel
> 
