[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

*From*:**Derek Denny-Brown <derekdb@microsoft.com>***To*: Murali Mani <mani@CS.UCLA.EDU>*Date*: Wed, 13 Jun 2001 12:39:24 -0700

Actually ( (a,b) | (b,a) ) is not ambiguous. ( I assume this is what you meant by "( a, b | b, a)".. ) I'm not sure about the term '1-ambiguous' though, so you may be including other stipulations, which I am not aware of. A content model is just a regular expression. There exists a proof that regular expressions are isomorphic to NDFAs. There also exists a proof that NDFAs are isomorphic to DFAs, which is what I am talking about when I say, every non-deterministic content model can be (re)written as a deterministic content model. non-deterministic content models are content models whose direct conversion to a finite automata, results in a NDFA. A deterministic content model is one whose direct conversion is a DFA. I say NDFAs and DFAs are isopophic because any given DFA is a valid NDFA, and a equivalent DFA can be constructed for any NDFA. You seem to have a misconcieved understanding of 'ambiguous'. > Consider another regular expression -- (a*, b*, a*) -- it is > ambiguous, > and it has no 1-unambiguous content model (consider any of > the existing > operators) -- (actually please double check this example -- i am quite > sure it is not 1-unambiguous). ( a*, b*, c*) is _not_ ambiguous. At no point it processing a matching sequence, is the parser unsure what to do next given the current token. > Consider regular expressions -- the regular expression (a, b > | b, a) can > *never* be written as a 1-unambiguous regular expression -- > it can however > be written as (a & b) -- when it will be 1-unambiguous. I'm not sure exactly how you meant (a, b | b, a) to be parsed. Either ( ( a, b) | (b, a) ) or ( a, (b | b), a) The doesn't equate with (a & b), so I will assume you refer to the first. The first _is_ a 1-unambiguous regular expressions. Consider processing the 2 valid input streams: "ab" and "ba" - "ab": 'a' matches first-set in first group of the or group. All that is left is to match the following 'b' in (a,b). - "ba": 'b' matches first-set in second group of the or group. All that is left is to match the following 'a' in (b,a). No ambiguities. > she leaves it as an open problem: given a regular language, > whether there > is a 1-unambiguoud model group (with the SGML DTD operator & > in addition > to the regular expression operators). Note that she is talking about "1-unambiguous model groups", not "1-unambiguous regular expressions". There is a very significant difference. The example you give above of: ( ( a, b) | (b, a) ), can not be expressed as a 1-unambiguous model group, w/o the '&' operator, but it can be expressed as a 1-unambiguous regular expression. My original statement still holds. You can convert an arbitrary ambigous content model to an non-ambiguous... but it might involve the introduction of additional groups... thus it's avoidance of the open problem to which you refered. Derek Denny-Brown

- Prev by Date:
**Re: SAX 2.0 enhancement proposal** - Next by Date:
**Re: SAX 2.0 enhancement proposal** - Previous by thread:
**Common Structures (was Re: Non-deterministic content model)** - Next by thread:
**RE: Non-deterministic content model** - Index(es):