
Re: Are we losing out because of grammars?

From: James Clark <jjc@jclark.com>
To: "K.Kawaguchi" <k-kawa@bigfoot.com>
Date: Tue, 30 Jan 2001 14:32:31 +0700

"K.Kawaguchi" wrote:

> And I read your tutorial that states
> 
> > The role of TREX is ... not to assist in interpretation of the
> > documents belonging to the class
> 
> (1) I think "assistance of the interpretation" is important requirement
>     of schema language (or at least I think there are solid needs).
>     But it seems to me that you don't share this view, do you. Why?

I think that

+ validation, and

+ "assistance of interpretation" or, more generally, augmenting the
infoset

are separate functions and that mushing the two together is a bad idea:
I may want to validate without augmenting the infoset and I may want to
augment the infoset without validating.

In SGML, validation was not cleanly separated from parsing.  I think
this was a significant problem with SGML: you couldn't do anything to an
SGML document without running it through a complex, validating parser. 
XML changed this with the introduction of the concept of
well-formedness, which separated out validation from parsing.  It didn't
in my view go quite far enough in this separation: certain tasks were
lumped in with validation that were logically separable. For example,
validating parsers are required to handle default attributes declared in
the external DTD, whereas non-validating parsers are not.  This has
caused nothing but trouble.  Users naturally want to get the same
results whether they are validating or not, which means they need
non-validating parsers to do all the things required of validating parsers
that affect the infoset.

The lesson I draw from this is that it's better to keep these things as
well separated as possible.
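
To make the separation concrete, here is a minimal sketch in Python.
It is not taken from TREX or from any real parser: the DEFAULTS and
ALLOWED tables and the augment() and validate() functions are all
hypothetical names invented for the illustration.  The point is only
that attribute defaulting and validation can be two independent passes,
so that either one can be run without the other.

```python
import xml.etree.ElementTree as ET

# Hypothetical defaults table: element name -> {attribute: default value}.
DEFAULTS = {"doc": {"status": "draft"}}

# Hypothetical, deliberately tiny "schema": element name -> allowed attributes.
ALLOWED = {"doc": {"status", "id"}}


def augment(root, defaults):
    """Fill in missing attributes. Never reports validity errors."""
    for elem in root.iter():
        for name, value in defaults.get(elem.tag, {}).items():
            elem.attrib.setdefault(name, value)
    return root


def validate(root, allowed):
    """Return a list of problems. Never modifies the tree."""
    problems = []
    for elem in root.iter():
        if elem.tag not in allowed:
            problems.append(f"unexpected element <{elem.tag}>")
            continue
        for name in elem.attrib:
            if name not in allowed[elem.tag]:
                problems.append(
                    f"unexpected attribute {name!r} on <{elem.tag}>"
                )
    return problems


if __name__ == "__main__":
    doc = ET.fromstring('<doc id="a"/>')
    print(validate(doc, ALLOWED))   # validate without augmenting
    augment(doc, DEFAULTS)          # augment without validating
    print(doc.attrib)
```

Because augment() does not depend on validate(), the result of
defaulting is the same whether or not the document is ever validated,
which is the property users were missing with external-DTD defaults.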

> (2) If we set aside ambiguity issue, it is easy to add "assistance"
>     capability to TREX, by (for example) introducing "type" attribute to
>     each <element> and <attribute> element in the pattern.
>     Do you have any reason to discourage this?

No reason.  I see no problem with annotating TREX patterns with
additional information, and having separate processes leverage that
information to do things other than validation.  So long as things are
kept cleanly separated, this sort of reuse seems a good thing to me. 
Processes that reuse TREX patterns for purposes other than validation
may choose not to allow all TREX patterns: for example, they might
exclude certain operators (such as concur) or they might impose certain
additional constraints relating to ambiguity.

TREX has all the support for this that I think it needs: it allows you
to add new attributes and child elements to TREX elements in the
pattern, so long as they are in a separate namespace.
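
As a rough illustration of how a separate process might reuse an
annotated pattern, the Python sketch below reads a small pattern, pulls
out the annotations, and then strips them to leave what a validator
would look at.  Several things here are assumptions made for the
example: the TREX namespace URI is quoted from memory, the annotation
namespace "http://example.com/annotations" and its "type" attribute are
made up, and collect_types() and strip_foreign() are hypothetical
helpers, not part of any TREX implementation.

```python
import xml.etree.ElementTree as ET

TREX_NS = "http://www.thaiopensource.com/trex"  # assumed TREX namespace URI
ANN_NS = "http://example.com/annotations"       # hypothetical annotation namespace

PATTERN = f"""
<element name="price" xmlns="{TREX_NS}" xmlns:ann="{ANN_NS}" ann:type="decimal">
  <attribute name="currency" ann:type="token">
    <anyString/>
  </attribute>
  <anyString/>
</element>
"""


def is_foreign(qname):
    """True for names in a namespace other than TREX.

    ElementTree spells qualified names as "{namespace}local";
    unqualified attribute names such as "name" are not foreign.
    """
    return qname.startswith("{") and not qname.startswith("{" + TREX_NS + "}")


def collect_types(root):
    """The non-validating process: map (kind, name) -> annotated type."""
    types = {}
    for node in root.iter():
        annotated = node.get(f"{{{ANN_NS}}}type")
        if annotated is not None and node.get("name") is not None:
            kind = node.tag.split("}")[1]  # "element" or "attribute"
            types[(kind, node.get("name"))] = annotated
    return types


def strip_foreign(root):
    """Remove foreign attributes and child elements, leaving the bare pattern."""
    for node in list(root.iter()):
        for key in [k for k in node.attrib if is_foreign(k)]:
            del node.attrib[key]
        for child in [c for c in node if is_foreign(c.tag)]:
            node.remove(child)
    return root


if __name__ == "__main__":
    pattern = ET.fromstring(PATTERN.strip())
    print(collect_types(pattern))
    strip_foreign(pattern)
    print(collect_types(pattern))
```

The two functions never call each other: a validator can ignore the
annotations entirely, and a process that only wants the type
information never needs to validate anything.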

James
