
Re: Are we losing out because of grammars?

Norman Walsh wrote:

> / James Clark <jjc@jclark.com> was heard to say:
> | It's not in general easy, unless you restrict the grammar.  For example,
> | consider the following TREX pattern:
> I'm confused in a couple of ways.
> | <element name="x">
> |   <zeroOrMore>
> |     <element name="y">
> |       <attribute name="z">
> |         <data type="xsd:string"/>
> |       </attribute>
> |     </element>
> |   </zeroOrMore>
> |   <element name="y">
> |     <data type="xsd:integer"/>
> |   </element>
> | </element>
> |
> | If I'm in an "x" element and I get a "y" element with a "z" attribute
> | that is a legal lexical representation of an integer, I can't tell
> | whether to type that attribute as an "xsd:integer" or an "xsd:string"
> There's only one z attribute in your example, did you mean for both y's
> to have z attributes with different types?

Yes, sorry.
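
For reference, the pattern as intended, with a "z" attribute on both "y"
elements, would presumably look something like the following. This is a
reconstruction from the exchange above, not text from the original
message; in particular, typing the second "z" as xsd:integer is an
assumption carried over from the final "y" of the original example.

```xml
<element name="x">
  <zeroOrMore>
    <element name="y">
      <attribute name="z">
        <data type="xsd:string"/>
      </attribute>
    </element>
  </zeroOrMore>
  <!-- Assumed correction: the final "y" also has a "z" attribute,
       typed as an integer rather than a string. -->
  <element name="y">
    <attribute name="z">
      <data type="xsd:integer"/>
    </attribute>
  </element>
</element>
```

With this version, a "z" value such as "42" fits either branch, so only
the position of the "y" within the "x" decides its type.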

> | unless I look ahead and see whether it's the last "y" element in
> | the "x".  The TREX implementation works on a stream of SAX events, so
> | this is a big complication.
> I'm a little confused by this example. I would have thought that the
> validator had to look ahead anyway in this case.

Actually, it doesn't.  This makes implementation a little bit more
interesting than DTD content models.

> I thought that the model was to find a matching TREX element
> definition for each element in the instance. If you don't look ahead
> to see if you've got the last y, how can you pick the matching
> definition?

It doesn't have to find the matching TREX element pattern; it merely has
to determine that at least one TREX element pattern matches.  If more
than one matches, it doesn't have to determine which.
When it sees the "y" it will find that there are two ways it could have
matched; it remembers that and proceeds accordingly.  If it sees another
"y", it determines that one of those two ways is no longer a