> While on the topic of SAX taming features in Amara, there is also
> amara.saxtools.xpattern_sax_state_machine, which I didn't even bother
> mentioning in the announcement (too much to cram in).
Can you expand on your expansion? As I was reading this I was thinking
that in the Java/C# world an interesting approach would be to keep a
pseudo-DOM stack for the event hierarchy. Maybe something where you keep
everything at an ancestral level intact while parsing:
<foo>
  <bar1>
    <baz1/>
    <baz2/>
  </bar1>
  <bar2>
    <baz1>
      <sub/>
    </baz1>
    <baz2>text</baz2>
  </bar2>
</foo>
So when the event stream reached /foo/bar2/baz2/text() you would have
the following in a DOM-like structure:
foo
 \
  bar1 (... no children)
  bar2
   \
    baz1 (... no children, just the previous sibling and attrs)
    baz2 (only the StartTag)
I am not sure that the preceding siblings would be very useful, and
keeping them opens the door to more pathological cases, but when I
construct mini-trees this is the subset I find handy. It is useful when
working with an editor to understand the immediate context.
Unfortunately, requiring the previous siblings means maintaining quite a
bit more: the whole preceding branch of the tree.
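To make the idea concrete, here is a rough sketch in Python using the standard xml.sax module. It is only one way to do it, and every name in it (Node, ContextHandler, outline, snapshot_contexts) is made up for illustration. Open ancestors stay intact, and a closed element is kept as a shell (name and attributes) with its children dropped:

```python
import xml.sax


class Node:
    """One element in the partial tree (hypothetical helper class)."""

    def __init__(self, name, attrs, parent=None):
        self.name = name
        self.attrs = dict(attrs.items())
        self.parent = parent
        self.children = []


class ContextHandler(xml.sax.ContentHandler):
    """Keeps open ancestors plus shells of their preceding siblings."""

    def __init__(self, on_text):
        super().__init__()
        self.root = None
        self.current = None
        self.on_text = on_text

    def startElement(self, name, attrs):
        node = Node(name, attrs, self.current)
        if self.current is None:
            self.root = node
        else:
            self.current.children.append(node)
        self.current = node

    def endElement(self, name):
        # The element is finished: drop its descendants, keep the shell.
        self.current.children = []
        self.current = self.current.parent

    def characters(self, content):
        if content.strip():
            self.on_text(self.root, self.current, content)


def outline(node, depth=0):
    """Render the current partial tree as indented element names."""
    lines = ["  " * depth + node.name]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


def snapshot_contexts(xml_bytes):
    """Return one outline per non-whitespace text event."""
    snapshots = []
    handler = ContextHandler(lambda root, cur, text: snapshots.append(outline(root)))
    xml.sax.parseString(xml_bytes, handler)
    return snapshots
```

Memory then grows with the depth of the document plus the number of siblings at each open level, not with the size of the whole document.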
> This module takes an XPattern (e.g. "/xbel/folder/bookmark") and
> generates a state machine which can be plugged into any regular SAX
> handler. In this way, you can automatically look for certain XPatterns
> which have interesting bits of code for you to process, and ignore the
> rest. This is sort of the opposite of Tenorsax: embrace the state
> machine, but automate it, rather than sweeping it under a fancy
> framework.
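For anyone following along, the general technique can be sketched by hand with the standard xml.sax module. This is not Amara's API, which I have not reproduced here. It is a toy matcher that assumes the pattern is an absolute path of plain child steps, and the names PathMatcher and find_matches are invented for the example:

```python
import xml.sax


class PathMatcher(xml.sax.ContentHandler):
    """Fires on_match for elements matching an absolute path of child steps."""

    def __init__(self, pattern, on_match):
        super().__init__()
        self.steps = pattern.strip("/").split("/")
        self.on_match = on_match
        self.depth = 0    # number of currently open elements
        self.matched = 0  # how many leading steps the open elements match

    def startElement(self, name, attrs):
        on_track = self.matched == self.depth and self.depth < len(self.steps)
        if on_track and name == self.steps[self.depth]:
            self.matched += 1
            if self.matched == len(self.steps):
                self.on_match(name, dict(attrs.items()))
        self.depth += 1

    def endElement(self, name):
        self.depth -= 1
        # Leaving a matched element moves the state machine back one step.
        if self.matched > self.depth:
            self.matched = self.depth


def find_matches(xml_bytes, pattern):
    """Collect the attributes of every element the pattern matches."""
    found = []
    handler = PathMatcher(pattern, lambda name, attrs: found.append(attrs))
    xml.sax.parseString(xml_bytes, handler)
    return found
```

The whole state is two integers, so the handler ignores everything off the path without building any tree. Real XPatterns allow much more (descendant steps, predicates, alternatives), which is where a generated state machine earns its keep.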
Karl Waclawek has done some work in this area in both Delphi and C# in
his toolkit XPEA. But I am sure he will take some ideas from this thread
as well... it is all very interesting.
Cheers,
Jeff Rafter