OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.


Help: OASIS Mailing Lists Help | MarkMail Help

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

RE: Non-deterministic content model

You are correct that the rewritten version is deterministic.

It is provable that _every_ non-deterministic content model can be
converted to a deterministic content model.  The trick is basically to
manually 'unroll' the content model.  A non-deterministic content model
is non-deterministic _because_ a direct conversion of the content model
to a state diagram produces a state diagram with a single state and
multiple outgoing edges for the same match.  To make it determinisitic,
collapse the paths together.  (A formal discussion of this exists in
almost any computation theory book, and in the book "Compilers,
Principles, Techniques, and Tools" by Aho, Sethi, and Ullman)

MSXML allows non-deterministic content models in DTDs, partially for
that reason, but IBM's parser throws an error, so you shuld avoid them
for strict parser compatibility.

Derek Denny-Brown

> -----Original Message-----
> From: Steve Rosenberry
> [mailto:Steve.Rosenberry@ElectronicSolutionsCo.com]
> Sent: Wednesday, June 13, 2001 10:55 AM
> To: xml-dev@lists.xml.org
> Subject: Re: Non-deterministic content model
> Marcus Carr wrote:
> > 
> > It would be non-deterministic if it the processor was faced
> > with element x in two contexts (for example), such as:
> > 
> > <!ELEMENT a ((x, y) | (x, x))?>
> > 
> Ok, now that I think I understand deterministic vs. non-deterministic
> that leads to a couple of other questions:
> 1) Would rewriting the above as <!ELEMENT a (x,(x|y))?> convert the
> original content model to a deterministic one?  As I now 
> understand the
> rules, I believe the rewritten content model to be deterministic.  At
> 'a' I can either have 'x' or nothing.  If I have 'x', it must then be
> followed by a single 'x' or 'y'. 
> 2) If the rewritten form in (1) is indeed deterministic, will 
> parsers do
> the boolean algebra for me to get from the original content 
> model to the
> rewritten one, or is this going to be a point of 
> incompatibility between
> parsers?  XML-Spy seems very happy to validate using either form.
> -- 
> Steve Rosenberry
> Sr. Partner
> Electronic Solutions Company -- For the Home of Integration
> http://ElectronicSolutionsCo.com
> http://BetterGoBids.com -- The Premier GoTo Bid Management Tool
> (610) 670-1710
> ------------------------------------------------------------------
> The xml-dev list is sponsored by XML.org, an initiative of OASIS
> <http://www.oasis-open.org>
> The list archives are at http://lists.xml.org/archives/xml-dev/
> To unsubscribe from this elist send a message with the single word
> "unsubscribe" in the body to: xml-dev-request@lists.xml.org