[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
RE: Non-deterministic content model
- From: Murali Mani <mani@CS.UCLA.EDU>
- To: Derek Denny-Brown <firstname.lastname@example.org>
- Date: Wed, 13 Jun 2001 11:59:40 -0700 (PDT)
Derek, I am sorry I would like to say that the following statement is
*not* true to the best of my knowledge --
"Every non-deterministic content model can be written as a deterministic
shall I say the above is also called 1-unambiguous content models studied
by Anne-Bruggemann Klein in her thesis.
I would also like to re-introduce the term model group -- a model group
has lot of operators, a regular expression is also a model group -- it
used only the operators -- *, | (choice), and , (concatenation).
Consider regular expressions -- the regular expression (a, b | b, a) can
*never* be written as a 1-unambiguous regular expression -- it can however
be written as (a & b) -- when it will be 1-unambiguous.
Consider another regular expression -- (a*, b*, a*) -- it is ambiguous,
and it has no 1-unambiguous content model (consider any of the existing
operators) -- (actually please double check this example -- i am quite
sure it is not 1-unambiguous).
Note -- Anne's thesis includes a portion where given a regular language,
she identifies whether there is a 1-unambiguous regular expression for it.
she leaves it as an open problem: given a regular language, whether there
is a 1-unambiguoud model group (with the SGML DTD operator & in addition
to the regular expression operators).
In short, i think there exist inherently 1-ambiguous regular languages.
<warning>speaking for himself only</warning>
cheers and regards - murali.
On Wed, 13 Jun 2001, Derek Denny-Brown wrote:
> You are correct that the rewritten version is deterministic.
> It is provable that _every_ non-deterministic content model can be
> converted to a deterministic content model. The trick is basically to
> manually 'unroll' the content model. A non-deterministic content model
> is non-deterministic _because_ a direct conversion of the content model
> to a state diagram produces a state diagram with a single state and
> multiple outgoing edges for the same match. To make it determinisitic,
> collapse the paths together. (A formal discussion of this exists in
> almost any computation theory book, and in the book "Compilers,
> Principles, Techniques, and Tools" by Aho, Sethi, and Ullman)
> MSXML allows non-deterministic content models in DTDs, partially for
> that reason, but IBM's parser throws an error, so you shuld avoid them
> for strict parser compatibility.
> Derek Denny-Brown
> > -----Original Message-----
> > From: Steve Rosenberry
> > [mailto:Steve.Rosenberry@ElectronicSolutionsCo.com]
> > Sent: Wednesday, June 13, 2001 10:55 AM
> > To: email@example.com
> > Subject: Re: Non-deterministic content model
> > Marcus Carr wrote:
> > >
> > > It would be non-deterministic if it the processor was faced
> > > with element x in two contexts (for example), such as:
> > >
> > > <!ELEMENT a ((x, y) | (x, x))?>
> > >
> > Ok, now that I think I understand deterministic vs. non-deterministic
> > that leads to a couple of other questions:
> > 1) Would rewriting the above as <!ELEMENT a (x,(x|y))?> convert the
> > original content model to a deterministic one? As I now
> > understand the
> > rules, I believe the rewritten content model to be deterministic. At
> > 'a' I can either have 'x' or nothing. If I have 'x', it must then be
> > followed by a single 'x' or 'y'.
> > 2) If the rewritten form in (1) is indeed deterministic, will
> > parsers do
> > the boolean algebra for me to get from the original content
> > model to the
> > rewritten one, or is this going to be a point of
> > incompatibility between
> > parsers? XML-Spy seems very happy to validate using either form.
> > --
> > Steve Rosenberry
> > Sr. Partner
> > Electronic Solutions Company -- For the Home of Integration
> > http://ElectronicSolutionsCo.com
> > http://BetterGoBids.com -- The Premier GoTo Bid Management Tool
> > (610) 670-1710
> > ------------------------------------------------------------------
> > The xml-dev list is sponsored by XML.org, an initiative of OASIS
> > <http://www.oasis-open.org>
> > The list archives are at http://lists.xml.org/archives/xml-dev/
> > To unsubscribe from this elist send a message with the single word
> > "unsubscribe" in the body to: firstname.lastname@example.org
> The xml-dev list is sponsored by XML.org, an initiative of OASIS
> The list archives are at http://lists.xml.org/archives/xml-dev/
> To unsubscribe from this elist send a message with the single word
> "unsubscribe" in the body to: email@example.com