[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [xml-dev] interleave algorithm?
- From: Gary Stephenson <garys@ihug.com.au>
- To: xml-dev@lists.xml.org
- Date: Wed, 17 Oct 2001 13:48:40 +1000
Hi Kohsuke,
(Humblest apologies for indavertently first posting this directly to you.
Note that this version has been changed a bit from that sent previously).
> I'm guessing that the strategy you employed is known as "the back track
> algorithm"
Maybe so, but I can't see any backtracking happening. The tree structure of
the element nodes seems to obviate the need for backtracking. Either a given
element's content validates against it's model - or the whole document is
invalid - n'est-ce pas? Or is it that I have implicitly assumed that the
content models are deterministic, and that therefore there no backtracking is
required? (As I said - a little knowledge... a _very_ little knowledge! <g>)
> There is an implementation of TREX (which is very similar to RELAX NG)
> in Python at http://pytrex.sourceforge.net/ which uses the same
> backtrack algorithm, so I think this would help you.
Yes, I'm sure it will - thanks for the link!
> However, still I'm not sure how you implement it.
>
> You wrote you pass a DOM element to the content model class (I guess
> these are like Choice, Group, Interleave, ... )
Yes - except that I don't yet have a working "Interleave" class! <g> I _do_
have a "Mixed" model, but that of course was a ridiculously simple special
case to deal with.
> But these content model
> classes should accept (or reject) a sequence of DOM Nodes, instead of
> one DOM element.
??? The DOM specifies the :firstChild() and :nextSibling() methods. So the
content Model takes the Element to be matched, uses its firstChild() method,
and subsequently iterates through the child nodes using the nextSibling()
method, to validate the content. Am I missing something here? Is there some
reason _not_ to do this? My DOM implementation uses a linked list to model
the sequence of nodes, so the nextSibling() call is very efficient. I am not
meaning to be rhetorical or sarcastic here. I am genuinely puzzled by your
question.
cheers,
gary
(PS. I also did an (entirely separate) implementation of Regexps as per the
XML Schemas Appendix F, in pure Xbase++, which definitely does use the
backtrack algorithm. Consequently it is unusably slow for matching strings
larger than 20K or so. I have been searching for some way of speeding it up -
without loss of functionality, but all the C implementations I have studed at
so far have been just too difficult for me to understand - let alone hack to
conform to the XML Schemas spec. )