OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [xml-dev] interleave algorithm?



Hi Kohsuke,

(Humblest apologies for indavertently first posting this directly to you.
Note that this version has been changed a bit from that sent previously).

> I'm guessing that the strategy you employed is known as "the back track
> algorithm"

Maybe so, but I can't see any backtracking happening.  The tree structure of
the element nodes seems to obviate the need for backtracking.  Either a given
element's content validates against  it's model - or the whole document is
invalid -  n'est-ce pas?  Or is it that I have implicitly assumed that the
content models are deterministic, and that therefore there no backtracking is
required?  (As I said - a little knowledge...  a _very_ little knowledge! <g>)

> There is an implementation of TREX (which is very similar to RELAX NG)
> in Python at http://pytrex.sourceforge.net/ which uses the same
> backtrack algorithm, so I think this would help you.

Yes, I'm sure it will - thanks for the link!

> However, still I'm not sure how you implement it.
>
> You wrote you pass a DOM element to the content model class (I guess
> these are like Choice, Group, Interleave, ... )

Yes - except that I don't yet have a working "Interleave" class! <g>   I _do_
have a "Mixed" model, but that of course was a ridiculously simple special
case to deal with.

> But these content model
> classes should accept (or reject) a sequence of DOM Nodes, instead of
> one DOM element.

??? The DOM specifies the :firstChild() and :nextSibling() methods. So the
content Model takes the Element to be matched, uses its firstChild() method,
and subsequently iterates through the child nodes using the nextSibling()
method,  to validate the content.  Am I missing something here?  Is there some
reason _not_ to do this?  My DOM implementation uses a linked list to model
the sequence of nodes, so the nextSibling() call is very efficient.  I am not
meaning to be rhetorical or sarcastic here.  I am genuinely puzzled by your
question.

cheers,

gary

(PS.  I also did an (entirely separate) implementation of  Regexps as per the
XML Schemas Appendix F, in pure Xbase++, which definitely does use the
backtrack algorithm. Consequently it is unusably slow for matching strings
larger than 20K or so.  I have been searching for some way of speeding it up -
without loss of functionality, but all the C implementations I have studed at
so far have been just too difficult for me to understand - let alone hack to
conform to the XML Schemas spec. )