OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

RE: Non-deterministic content model




Derek, I am sorry I would like to say that the following statement is
*not* true to the best of my knowledge --

"Every non-deterministic content model can be written as a deterministic
content model"

shall I say the above is also called 1-unambiguous content models studied
by Anne-Bruggemann Klein in her thesis.

I would also like to re-introduce the term model group -- a model group
has lot of operators, a regular expression is also a model group -- it
used only the operators -- *, | (choice), and , (concatenation).

Consider regular expressions -- the regular expression (a, b | b, a) can
*never* be written as a 1-unambiguous regular expression -- it can however
be written as (a & b) -- when it will be 1-unambiguous.

Consider another regular expression -- (a*, b*, a*) -- it is ambiguous,
and it has no 1-unambiguous content model (consider any of the existing
operators) -- (actually please double check this example -- i am quite
sure it is not 1-unambiguous).

Note -- Anne's thesis includes a portion where given a regular language,
she identifies whether there is a 1-unambiguous regular expression for it.

she leaves it as an open problem: given a regular language, whether there
is a 1-unambiguoud model group (with the SGML DTD operator & in addition
to the regular expression operators).

In short, i think there exist inherently 1-ambiguous regular languages.

<warning>speaking for himself only</warning>

cheers and regards - murali.

On Wed, 13 Jun 2001, Derek Denny-Brown wrote:

> You are correct that the rewritten version is deterministic.
>
> It is provable that _every_ non-deterministic content model can be
> converted to a deterministic content model.  The trick is basically to
> manually 'unroll' the content model.  A non-deterministic content model
> is non-deterministic _because_ a direct conversion of the content model
> to a state diagram produces a state diagram with a single state and
> multiple outgoing edges for the same match.  To make it determinisitic,
> collapse the paths together.  (A formal discussion of this exists in
> almost any computation theory book, and in the book "Compilers,
> Principles, Techniques, and Tools" by Aho, Sethi, and Ullman)
>
> MSXML allows non-deterministic content models in DTDs, partially for
> that reason, but IBM's parser throws an error, so you shuld avoid them
> for strict parser compatibility.
>
> Derek Denny-Brown
>
> > -----Original Message-----
> > From: Steve Rosenberry
> > [mailto:Steve.Rosenberry@ElectronicSolutionsCo.com]
> > Sent: Wednesday, June 13, 2001 10:55 AM
> > To: xml-dev@lists.xml.org
> > Subject: Re: Non-deterministic content model
> >
> >
> >
> >
> > Marcus Carr wrote:
> > >
> > > It would be non-deterministic if it the processor was faced
> > > with element x in two contexts (for example), such as:
> > >
> > > <!ELEMENT a ((x, y) | (x, x))?>
> > >
> >
> >
> > Ok, now that I think I understand deterministic vs. non-deterministic
> > that leads to a couple of other questions:
> >
> > 1) Would rewriting the above as <!ELEMENT a (x,(x|y))?> convert the
> > original content model to a deterministic one?  As I now
> > understand the
> > rules, I believe the rewritten content model to be deterministic.  At
> > 'a' I can either have 'x' or nothing.  If I have 'x', it must then be
> > followed by a single 'x' or 'y'.
> >
> > 2) If the rewritten form in (1) is indeed deterministic, will
> > parsers do
> > the boolean algebra for me to get from the original content
> > model to the
> > rewritten one, or is this going to be a point of
> > incompatibility between
> > parsers?  XML-Spy seems very happy to validate using either form.
> >
> >
> >
> > --
> > Steve Rosenberry
> > Sr. Partner
> >
> > Electronic Solutions Company -- For the Home of Integration
> > http://ElectronicSolutionsCo.com
> >
> > http://BetterGoBids.com -- The Premier GoTo Bid Management Tool
> >
> > (610) 670-1710
> >
> > ------------------------------------------------------------------
> > The xml-dev list is sponsored by XML.org, an initiative of OASIS
> > <http://www.oasis-open.org>
> >
> > The list archives are at http://lists.xml.org/archives/xml-dev/
> >
> > To unsubscribe from this elist send a message with the single word
> > "unsubscribe" in the body to: xml-dev-request@lists.xml.org
> >
>
> ------------------------------------------------------------------
> The xml-dev list is sponsored by XML.org, an initiative of OASIS
> <http://www.oasis-open.org>
>
> The list archives are at http://lists.xml.org/archives/xml-dev/
>
> To unsubscribe from this elist send a message with the single word
> "unsubscribe" in the body to: xml-dev-request@lists.xml.org
>