[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

*From*:**Murata Makoto <mura034@attglobal.net>***To*: "K.Kawaguchi" <k-kawa@bigfoot.com>*Date*: Fri, 02 Feb 2001 13:36:29 +0900

K.Kawaguchi wrote: > > > It's not in general easy, unless you restrict the grammar. For example, > > > consider the following TREX pattern: > > > > Right. Kawaguchi-san's another algorithm determines if a RELAX module is > > already such a restricted grammar. > > I don't think so. It is true that a restricted grammar results in > FASTER algorithm, but it's also true that type-assignment is still > tractable and in fact quite simple even for unrestricted grammars. Well, this is a subjective issue. You modified my algorithm so as to record possible interpretations. You think the algorithm is "quite simple", and James thinks it is not easy. I do not think that it is difficult, but also do not think it is very easy. 1. Ambiguity If the grammar is guaranteed to be unambiguous, the algorithm does not become simpler but the type-assignment is always unique. Any ambiguous RELAX module can be disambiguated, I guess. At least, any tree regular language can be captured by an unambiguous tree regular grammar. (Hmm, subset construction certainly creates deterministic and thus unambiguous hedge automata. But are there any algorithms which do not significantly rewrite RELAX modules? Interesting but might not be really useful.) There are many ambiguity in computer science. For example, a regular expression a*a* is ambiguous. So is (a | a). Perl and other tools happily accept such regular expressions. They also have \1, \2 , etc. so as to reference to matches to sub exprssions. For example, if we compare "aab" against \(a*\)b, the value of \1 is "aa". What happens if we compare "aaa" against (\(a*\)*)? This expression is not complicated and different implementations probably behave the same. But if we create more complicated ambiguous regular expressions, different implementations choose different interpretations (i.e., type assignment) and behave differently. Hmm, do we have to care? 2. Effortless If the grammar is guaranteed to be "effortless" (in your terminology), the algorithm can be simplified and the type-assignment is always unique. I have called this algorithm "semi-deterministic top-down with one lookahead". Simply put, you can determine *the* non-terminal for a node when you visit the node by using the type and content model of the parent element. Some RELAX modules cannot be converted to "effortless" RELAX modules. Given two such modules, its intersection can be always captured by an "effortless" module but its union cannot be always captured. That is, this class is not closed under union (and difference). K.Kawaguchi wrote: > See my previous post to Norm, which describes it step by step. It works > with TREX. I could not find it. Which message are you talking about? But we have common understanding. Cheers,

**Follow-Ups**:**Re: Type-assignment***From:*"K.Kawaguchi" <k-kawa@bigfoot.com>

**References**:**Re: Type-assignment***From:*"K.Kawaguchi" <k-kawa@bigfoot.com>

- Prev by Date:
**Re: Are we losing out because of grammars?** - Next by Date:
**Re: Type-assignment (was Re: Are we losing out because of grammars?)** - Previous by thread:
**Re: Type-assignment** - Next by thread:
**Re: Type-assignment** - Index(es):