xml-dev - Re: FW: [xml-dev] Relax NG annoyances


To: jimbolist@hotmail.com
Subject: Re: FW: [xml-dev] Relax NG annoyances
From: James Clark <jjc@jclark.com>
Date: Thu, 03 Jul 2003 12:29:35 +0700
Cc: xml-dev@lists.xml.org
In-reply-to: <000101c340ea$1802d6b0$0100a8c0@picard>
References: <000101c340ea$1802d6b0$0100a8c0@picard>
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20030225


>>>Why aren't defined patterns named with QNames? ...
>>
>>How would this aid portability?
> 
> 
> I'll give a (trivial) example that demonstrates.  Say I want to
> validate against a custom version of XHTML that supports XLink.  I
> include James Clark's Relax NG Schema for XHTML [1] and John Cowan's RNG
> for XLink [2].  Both schemas are OK by themselves, however, including
> them would give a conflict with start definitions and "title"
> definitions.  The start definition would be expected; however, the title
> definition is not obvious - it's defined in a small module [3] among
> *many* definitions [4].  If I was combining two large languages (such as
> XHTML and MathML), then it would be hard to find all of the conflicting
> names.

Combining independently created schemas is an important requirement, but 
I don't think using QNames for definitions is the solution.

For a start, note that using QNames for definitions wouldn't solve your 
XHTML+XLink problem for two reasons.  First, the XHTML schema is written 
in a closed way and doesn't allow attributes from foreign namespaces. 
Second, an instance needs to be validated concurrently against both the 
XLink schema and the XHTML schema.  What you need is a language that can 
say: to validate this instance, use the XLink schema to validate the 
complete instance and also use the XHTML schema to validate the instance 
after stripping out attributes from the XLink namespace. NRL [1] is 
designed to do this sort of thing.  RELAX NG isn't intended to solve 
every validation problem by itself.

For a case like combining XHTML and MathML, RELAX NG already provides a 
solution to avoid conflicting definition names.  The solution is nested 
grammars.  In RELAX NG, the grammar element is allowed anywhere a 
pattern is allowed, not just at the top level.  When a grammar element 
occurs inside another grammar 
element, the definitions in the nested grammar element are local to that 
grammar and don't conflict with the definitions in the containing 
grammar.  Thus to make an XHTML+MathML schema, you can just do something 
like:

<grammar xmlns="http://relaxng.org/ns/structure/1.0">
<include href="xhtml.rng"/>
<define name="Inline.class" combine="choice">
   <externalRef href="mathml.rng"/>
</define>
</grammar>

In RELAX NG, <include> merges grammars, whereas <externalRef> is 
replaced by the pattern in the referenced file.  So in this case, the 
result will be that the MathML grammar will be nested inside the XHTML 
grammar, so the names in the MathML grammar will not conflict with the 
names in the XHTML grammar.  RELAX NG also provides a <parentRef> 
element that allows nested grammars to selectively reference definitions 
from their parent grammars.  It's a little like the block scoping found 
in many programming languages.
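
To make the scoping concrete, here is a minimal sketch of a nested 
grammar that reuses one definition from its parent.  The element names 
(doc, em, math, mi, mtext) are made up for illustration and are not 
taken from the real XHTML or MathML schemas.  Both grammars define 
"inline" without conflict, and the inner grammar reaches the outer 
definition through <parentRef>:

<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <start>
    <element name="doc">
      <zeroOrMore><ref name="inline"/></zeroOrMore>
    </element>
  </start>
  <!-- outer "inline": text, emphasis, or an embedded math island -->
  <define name="inline">
    <choice>
      <text/>
      <element name="em"><text/></element>
      <grammar>
        <start><ref name="inline"/></start>
        <!-- inner "inline" is local to this nested grammar -->
        <define name="inline">
          <element name="math">
            <zeroOrMore>
              <choice>
                <element name="mi"><text/></element>
                <element name="mtext">
                  <!-- refers to the OUTER "inline", not the inner one -->
                  <zeroOrMore><parentRef name="inline"/></zeroOrMore>
                </element>
              </choice>
            </zeroOrMore>
          </element>
        </define>
      </grammar>
    </choice>
  </define>
</grammar>

Writing the nested grammar inline like this should be equivalent to 
pulling it in from a separate file with <externalRef>, since 
<externalRef> is replaced by the pattern in the referenced file.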

There are a couple of reasons I don't think QNames for definitions are a 
good idea:

- URIs and QNames make sense for global objects, but definitions are not 
global; they're local to a particular grammar.  It would be a bit like 
using package names for local variables.

- Some XML users make heavy use of XML namespaces; other XML users think 
they are an unnecessary complication and don't want to have anything to do 
with them.  RELAX NG is designed to work well for both kinds of user. 
(In contrast, NRL doesn't have this goal.)  However, the need to combine 
schemas is not unique to XML namespace users.  I might want to use a 
math schema inside a document schema and not use XML namespaces for 
either.  Using QNames doesn't seem like a natural solution in a 
non-namespace environment.

I think there is scope for improvement in RELAX NG in this area. At the 
moment, the only mechanisms for referencing grammars are <include> and 
<externalRef>.  These mechanisms both work by syntactically including the 
referenced grammar.  This doesn't work for mutually recursive grammars. 
By contrast, the <ref> element can handle mutually recursive 
definitions, because it doesn't work at a purely syntactic level: it 
doesn't just replace the <ref> element by the content of the referenced 
definition. I think there's scope for a grammar reference mechanism that 
doesn't work just by syntactic inclusion; but how exactly it should work 
is something I haven't yet figured out.
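
For comparison, this is the kind of mutual recursion that <ref> handles 
within a single grammar.  It is only a small sketch with invented 
element names (section, para): "section" refers to "block" and "block" 
refers back to "section", which is fine because each cycle passes 
through an element pattern.

<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <start><ref name="section"/></start>
  <define name="section">
    <element name="section">
      <zeroOrMore><ref name="block"/></zeroOrMore>
    </element>
  </define>
  <define name="block">
    <choice>
      <element name="para"><text/></element>
      <!-- back-reference: sections may nest inside sections -->
      <ref name="section"/>
    </choice>
  </define>
</grammar>

If "section" and "block" lived in two separate files that each tried to 
pull in the other with <include> or <externalRef>, the syntactic 
inclusion would never terminate, which is the limitation described 
above.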

James

[1] http://www.thaiopensource.com/relaxng/nrl.html

References:
- FW: [xml-dev] Relax NG annoyances
  - From: "Jimmy Cerra" <jimbolist@hotmail.com>
