- From: Richard Goerwitz <email@example.com>
- To: firstname.lastname@example.org
- Date: Sun, 13 Sep 1998 16:15:09 -0400
Philippe Le Hégaret wrote:
> > Is (paragraph*)* a deterministic content model ?
> > If yes, so I think (a+ | b)* is a deterministic content model too.
> > >
> > > it is an error if an element in the document can match more
> > > than one occurrence of an element type in the content model.
> I don't totally agree with you, because if you write the
> sequence like this:
> (a, a*)*
> is it still deterministic? For me, no, because there are
> two states in this content model. (a+)* is the same case, and
> (a+ | b)* too.
Looks like everybody is more or less correct.
The whole point of flagging nondeterministic content models (which
is what SGML did, and XML may optionally do) is that
nondeterministic content models often indicate logic errors by the
writer.
Put somewhat differently, if a DTD writer composes a content model
that allows a given sequence of elements to be processed in more
than one way, this often indicates an error.
So, for example, with (a, a*)*, it's hard to imagine what is
intended, because a single <a/><a/> could match two instances of
(a, a*), or one instance of (a, a*), depending on how you go
through the automaton. Processors may, incidentally, flag (a+)*
as "ambiguous", since a+ is usually implemented as (a, a*).
Such ambiguities create unintended differences in how the same
input might be processed by different software. Or they simply
lead to the input being processed in a way that surprises the user
(or worse yet, the programmer).
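To make the (a, a*)* case concrete, here is a small Python sketch (my own illustration, not taken from any validator) that lists every way a run of n consecutive <a/> elements can be carved up into iterations of (a, a*). The function name splits is hypothetical. Each number in a result is how many <a/> elements one iteration of (a, a*) consumes.

```python
def splits(n):
    """List every way to divide n consecutive <a/> elements into
    iterations of (a, a*), where each iteration consumes at least one."""
    if n == 0:
        return [[]]  # nothing left: exactly one way, with no more iterations
    result = []
    for k in range(1, n + 1):          # first iteration consumes k elements
        for rest in splits(n - k):     # remaining elements, split recursively
            result.append([k] + rest)
    return result


print(splits(2))       # [[1, 1], [2]]
print(len(splits(3)))  # 4
```

For <a/><a/> there are two readings: two iterations of one element each, or a single iteration that takes both. The number of readings doubles with every additional <a/>.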
That's why I think it's a good idea for validators, in particular,
to flag "ambiguous" content models aggressively.
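For the curious, the check itself can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular validator does it: content models are written as nested tuples of my own invention ("sym", "seq", "alt", "star", "plus", "opt"), every element occurrence gets its own position number, and a model is called deterministic when no two positions with the same element name compete at the start or after any given position. Following the remark above, a+ is deliberately expanded to (a, a*) first, which is an assumption about the implementation and is the reason (a+)* gets flagged here.

```python
from itertools import count


def desugar(m):
    """Rewrite e+ as (e, e*); everything else is copied unchanged."""
    if m[0] == "sym":
        return m
    kids = [desugar(k) for k in m[1:]]
    if m[0] == "plus":
        return ("seq", kids[0], ("star", kids[0]))
    return (m[0], *kids)


def analyse(m, ids, follow, names):
    """Return (nullable, first, last) for m, filling follow and names."""
    kind = m[0]
    if kind == "sym":
        p = next(ids)                 # fresh position for this occurrence
        names[p] = m[1]
        follow[p] = set()
        return False, {p}, {p}
    parts = [analyse(k, ids, follow, names) for k in m[1:]]
    if kind == "alt":
        return (any(n for n, _, _ in parts),
                set().union(*(f for _, f, _ in parts)),
                set().union(*(l for _, _, l in parts)))
    if kind == "seq":
        nullable, first, last = True, set(), set()
        for n, f, l in parts:
            for p in last:            # end of the prefix can continue here
                follow[p] |= f
            if nullable:              # prefix may be empty
                first |= f
            last = (last | l) if n else set(l)
            nullable = nullable and n
        return nullable, first, last
    n, f, l = parts[0]                # "star" or "opt"
    if kind == "star":
        for p in l:                   # loop back from the end to the start
            follow[p] |= f
    return True, f, l


def is_deterministic(model):
    """True if no two same-named positions ever compete for one element."""
    ids, follow, names = count(1), {}, {}
    _, first, _ = analyse(desugar(model), ids, follow, names)
    for group in [first, *follow.values()]:
        labels = [names[p] for p in group]
        if len(labels) != len(set(labels)):
            return False
    return True


A, B = ("sym", "a"), ("sym", "b")
models = {
    "(a*)*":        ("star", ("star", A)),
    "(a+ | b)*":    ("star", ("alt", ("plus", A), B)),
    "(a, a*)*":     ("star", ("seq", A, ("star", A))),
    "(a+, b?, a+)": ("seq", ("plus", A), ("opt", B), ("plus", A)),
}
for text, model in models.items():
    print(text, "ok" if is_deterministic(model) else "ambiguous")
```

Under these assumptions only (a*)* comes out "ok"; the other three are reported as "ambiguous". A checker that treats a+ as a primitive loop instead of expanding it would accept (a+)* and (a+ | b)*, which is one way to see why people disagree about those two.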
Testing these sorts of things is easy enough. Just make up a toy
DTD and run it through a good validator. Take, for example, the
following (where elements x, y, and z should get flagged as
"ambiguous"):
<!DOCTYPE test [
<!ELEMENT test ANY>
<!ELEMENT a EMPTY>
<!ELEMENT b EMPTY>
<!ELEMENT w (a*)*>
<!ELEMENT x (a+ | b)*>
<!ELEMENT y (a, a*)*>
<!ELEMENT z (a+, b?, a+)>
]>
Yes, as always, you can try this out with the validator at:
PGP key fingerprint: C1 3E F4 23 7C 33 51 8D 3B 88 53 57 56 0D 38 A0
For more info (mail, phone, fax no.): finger email@example.com
xml-dev: A list for W3C XML Developers. To post, mailto:firstname.lastname@example.org
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
List coordinator, Henry Rzepa (mailto:email@example.com)