  • From: ht@cogsci.ed.ac.uk (Henry S. Thompson)
  • To: John Aldridge <john.aldridge@informatix.co.uk>
  • Date: 06 Jan 2000 13:49:02 +0000

John Aldridge <john.aldridge@informatix.co.uk> writes:

> At 23:18 05/01/00 +0000, ht@cogsci.ed.ac.uk (Henry S. Thompson) wrote:
> >John Aldridge <john.aldridge@informatix.co.uk> writes:
> >> I'd hoped to find a statement such as "a general-purpose schema-aware
> >> processor must provide some catalogue facility which allows the
> >> specification of a location from which to fetch the schema corresponding to
> >> an NS URI.  Only in the absence of such a catalogue entry may the processor
> >> attempt to dereference the URI given by the schemaLocation attribute".
> >
> >As I've tried to convey in other messages in this and related threads, 
> >the XML Schema design is VERY concerned with precisely the issue you
> >raise above, namely, schema validation should not be a hostage to
> >connectivity and/or URL stability.  Our approach was, however, NOT to
> >design YACM (Yet Another Catalog Mechanism), but allow for ANY
> >alternative schema location mechanism which people come up with.  I
> >hope a careful reading of chapter 4 of the PWD [1] will clarify this
> >for you.
> I did carefully read Chapter 4, honest, but still struggled to understand
> the way the flexibility it includes should be used.  Note that I did not
> suggest above that the document should include a specific catalogue design;
> just that I'd hoped it would mandate the existence of _some_ catalogue.
> >For myself, I envisage schema validators working in a similar way
> >to XT, James Clark's XSLT implementation: you will be able to invoke a
> >schema validator with explicit specification of the schema(s) you wish
> >applied,
> By which you mean (I think) "explicit specification of _how to locate_ the
> schema(s) you wish applied,".  Presumably you are not intended to be able
> to request that elements be validated against a schema with a
> targetNamespace which does not match the namespace from which the elements
> to be validated are drawn?

Both points correct:  how to _locate_, and targetNamespaces must
always match (except in the case where there is none, but that's
another can of worms).

> >         or you can leave it to the validator (Not an option XT
> >provides).  The XML Schema PWD allows for one, the other, or both, but
> >observes that only the schemaLocation approach gives interoperability
> >(at the price of fragility).
> OK, that's very helpful.  So, when writing an XML file, I should start it:
> <?xml version="1.0"?>
> <stuff
>    xmlns="http://www.informatix.co.uk/Stuff"
>    xmlns:xsi="http://www.w3.org/1999/XMLSchema/instance"
>    xsi:schemaLocation="http://www.informatix.co.uk/Stuff
>       http://www.informatix.co.uk/Stuff/Stuff.xsd"
> >
> :
> </stuff>
> And then say to the customers for this data: 
>    You must process this data either
>    (a) in an environment with reliable access to
>        http://www.informatix.co.uk/Stuff/Stuff.xsd (in which case you
>        may use any "general-purpose schema-aware" XML processor), or,
>    (b) you are constrained to use only those XML processors which
>        allow you to specify that the schema for the namespace
>        http://www.informatix.co.uk/Stuff is to be found in some other
>        location accessible to you.


> In the context of the obligation "...unless directed otherwise
> general-purpose schema-aware processors must attempt to dereference each
> schema URI...", the existence of a catalogue or other mechanism for
> locating a schema counts as "directed otherwise".

Well, not the existence alone, but the existence plus some indication, 
from user or application choice, to use what exists.
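
A minimal sketch of that distinction, in Python.  It is not taken from
the PWD: the function names, the `catalogue` mapping, the
`use_catalogue` flag and the file:///local/Stuff.xsd location are all
hypothetical.  The schemaLocation hint is used unless the caller both
supplies an alternative and asks for it to be used.

```python
import xml.etree.ElementTree as ET

# Namespace of the xsi: attributes as used in the example instance above.
XSI = "http://www.w3.org/1999/XMLSchema/instance"


def schema_location_hints(root):
    """Return {namespace URI: schema URL} read from xsi:schemaLocation."""
    tokens = root.get(f"{{{XSI}}}schemaLocation", "").split()
    # The attribute value is a whitespace-separated list of URI pairs.
    return dict(zip(tokens[0::2], tokens[1::2]))


def choose_location(namespace, hints, catalogue=None, use_catalogue=False):
    """Pick where to fetch the schema for `namespace` (hypothetical policy).

    A catalogue entry wins only when the caller has asked for the
    catalogue to be used; its mere existence changes nothing.
    """
    if use_catalogue and catalogue and namespace in catalogue:
        return catalogue[namespace]
    return hints.get(namespace)


NS = "http://www.informatix.co.uk/Stuff"
DOC = """<?xml version="1.0"?>
<stuff
   xmlns="http://www.informatix.co.uk/Stuff"
   xmlns:xsi="http://www.w3.org/1999/XMLSchema/instance"
   xsi:schemaLocation="http://www.informatix.co.uk/Stuff
      http://www.informatix.co.uk/Stuff/Stuff.xsd"/>
"""

hints = schema_location_hints(ET.fromstring(DOC))
local = {NS: "file:///local/Stuff.xsd"}  # assumed local copy

default_choice = choose_location(NS, hints, catalogue=local)
directed_choice = choose_location(NS, hints, catalogue=local,
                                  use_catalogue=True)
```

Here default_choice is the schemaLocation URL and directed_choice is
the local copy, since only the second call says "use what exists".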

> I guess I'm just suspicious that, in the absence of specific requirements,
> processors will not bother to implement any such alternative mechanism.
> After all, the language quoted in the previous paragraph is very similar to
> that describing DTD links:  "An XML processor ... may use the public
> identifier to try to generate an alternative URI.  If the processor is
> unable to do so, it must use the URI specified in the system literal".

You can't make people provide interoperable solutions, only encourage
them to do so, you're right.

> . . .
> I guess I was really confused about the relation between schemas and
> namespaces.
> I understand your answer to mean that by using a name from a namespace, and
> then using a schema-aware processor, you are automatically claiming that
> the element conforms to the schema for that namespace.
> There is no such thing, to a schema-aware processor, as a namespace without
> an associated schema.

That's close, but there are undoubtedly some grey areas.  In the
simplest case: a schema-validator is validating the content of some
element with a schema for its namespace and encounters an element name
from a different namespace.   What happens?  If neither schemaLocation 
nor built-in information nor namespace-URI-based search yield a
schema, there is a problem.  Let's look a little harder at how this
could happen.

1) The instance looks like this

  <a:root xmlns:a='uri:a' xmlns:b='uri:b'>
   <a:a ...>...</a:a>
   <b:b ...>...</b:b>
  </a:root>

  The content model the validator is working with, within a schema for
  the uri:a namespace, looks like this:

  <element ref='a' . . ./>
  <element ref='o:b'/> 

  Now this latter reference is not allowed unless there's an <import>
  statement for it.  But that <import> may not contain a
  'schemaLocation' attribute, or the URI specified there may not be
  accessible, etc.  At that point an error should be raised.

2) The instance is the same, but the relevant content model looks like 

  <element ref='a' . . ./>
  <any namespace='##other'/>

  This, and related cases, are the grey area mentioned above.  The WG
  has not yet decided exactly what the detailed schema-validation story
  is wrt validation within material which in the first instance is
  allowed by a wildcard particle in a content model.
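
The lookup behind case 1 can be sketched as follows, again in Python
and again only as an illustration.  The source order, the
`locate_schema` function, the `SchemaNotFound` exception and the
stand-in lookups are assumptions, not anything the PWD prescribes.
Case 2 is deliberately not modelled, since its behaviour is undecided.

```python
class SchemaNotFound(Exception):
    """Raised when no source yields a schema for a referenced namespace."""


def locate_schema(namespace, sources):
    """Try each (label, lookup) pair in order.

    Each lookup takes a namespace URI and returns a schema or None.
    """
    for label, lookup in sources:
        schema = lookup(namespace)
        if schema is not None:
            return label, schema
    raise SchemaNotFound(f"no schema found for namespace {namespace!r}")


# Stand-ins for the three sources named in the text.
location_hints = {}                           # <import> gave no schemaLocation
built_in = {"uri:a": "<schema for uri:a>"}    # what the validator already holds


def dereference(namespace):
    """Pretend namespace-URI-based search; here it never succeeds."""
    return None


sources = [
    ("schemaLocation", location_hints.get),
    ("built-in", built_in.get),
    ("namespace URI", dereference),
]

found = locate_schema("uri:a", sources)

try:
    locate_schema("uri:b", sources)
    error = None
except SchemaNotFound as exc:
    error = str(exc)
```

With these stand-ins the schema for uri:a comes from the built-in
source, while the explicit reference into uri:b ends in an error.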

> Thanks for your help, both here and on other topics to which I've not
> contributed but have followed with interest.

You're welcome:  you, and the rest of xml-dev, are our launch
customers. . . :-)

  Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
     2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
	    Fax: (44) 131 650-4587, e-mail: ht@cogsci.ed.ac.uk
		     URL: http://www.ltg.ed.ac.uk/~ht/

