* Uche Ogbuji <uche.ogbuji@fourthought.com> [2004-12-26 13:12]:
> Alan Gutierrez wrote:
>
> >* Jeff Rafter <lists@jeffrafter.com> [2004-12-23 13:43]:
> >
> >
> >>>While on the topic of SAX taming features in Amara, there is also
> >>>amara.saxtools.xpattern_sax_state_machine, which I didn't even bother
> >>>mentioning in the announcement (too much to cram in).
> >>>
> >>>
> >>Can you expand on your expansion? As I was reading this I was thinking
> >>that in the Java/C# world an interesting approach would be to keep a
> >>pseudo DOM stack for the event hierarchy. Maybe something where you keep
> >>everything at an ancestral level intact while parsing
> >>
> >>
> >><foo>
> >>  <bar1>
> >>    <baz1/>
> >>    <baz2/>
> >>  </bar1>
> >>  <bar2>
> >>    <baz1>
> >>      <sub/>
> >>    </baz1>
> >>    <baz2>text</baz2>
> >>  </bar2>
> >></foo>
> >>
> >>So when the event stream reached /foo/bar2/baz2/text() you would have
> >>the following in a DOM like structure:
> >>
> >>  foo
> >>   \
> >>    bar1 (... no children)
> >>    bar2
> >>     \
> >>      baz1 (... no children, just the previous sibling and attrs)
> >>      baz2 (only the StartTag)
> >>
> >>I am not sure that the preceding siblings would be very useful and have
> >>more chances for pathological cases but when I construct mini-trees this
> >>is the subset I find handy. It is useful when working with an editor to
> >>understand the immediate context. Unfortunately by requiring the
> >>previous siblings you have to maintain quite a bit more... the whole
> >>preceding branch of the tree.
> >>
> >>
> >
> > I have a SAX library (in Java) that keeps the stack around, but
> > not the preceding siblings. It is quite useful.
> >
> > It is, actually, very useful to keep a stack around that has a
> > hash table for each level of the stack. It allows for the
> > development of strategies that are themselves stateless.
> >
> > Adding the implied stack goes a long way to make SAX event
> > processing a more practical solution for a lot of problems.
> >
> >
>
> Yes. This is a useful technique I covered for Python in my article
> "Location, Location, Location":
> http://www.xml.com/pub/a/2004/11/24/py-xml.html
> I think that while useful this technique can still leave a lot of state
> wrangling to the programmer, which is why Amara has several modules that
> go further.

Yes. A lot is still left to the programmer with my tool set, but
it does pick up a lot of common SAX tasks.
I've wondered about what more I could do.

Hmm. I read the article. I was talking about how I keep a stack
of the elements around, and how a silly thing I did turns out to
be very useful. For each entry in the stack, I keep a
java.util.Map and tuck all sorts of things in there.
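
A minimal sketch of the stack-with-a-Map idea, not the actual library.
The names ScopedStackHandler, Frame, report and run are hypothetical,
and counting child elements is only a stand-in for whatever state a
real handler would tuck into the Map:

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler for illustration only.
class ScopedStackHandler extends DefaultHandler {
    // One frame per open element: its name plus a scratch Map for state.
    static final class Frame {
        final String name;
        final Map<String, Object> state = new HashMap<>();
        Frame(String name) { this.name = name; }
    }

    private final Deque<Frame> stack = new ArrayDeque<>();
    final List<String> report = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes atts) {
        Frame parent = stack.peek();
        if (parent != null) {
            // Tuck a running child count into the parent's Map.
            parent.state.merge("children", 1,
                (a, b) -> (Integer) a + (Integer) b);
        }
        stack.push(new Frame(qName));
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // The state is discarded with its frame; no explicit cleanup.
        Frame done = stack.pop();
        report.add(done.name + "=" + done.state.getOrDefault("children", 0));
    }

    // Convenience for the example: parse a string, return the report.
    static List<String> run(String xml) {
        try {
            ScopedStackHandler handler = new ScopedStackHandler();
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
            return handler.report;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

Parsing `<foo><bar1><baz1/><baz2/></bar1><bar2/></foo>` with
ScopedStackHandler.run should report baz1=0, baz2=0, bar1=2, bar2=0,
foo=2. The state lives and dies with its stack level, which is what
lets the code that handles each element stay stateless.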

Twice now I've created a little language in XML and used SAX to
parse it. Once I understood what I could and could not do, it
got pretty easy to express a chore as an XML event stream. It
was easy to keep track of the chore by tucking state into the
java.util.Map. Kinda Perlish, but that's me.

I was wondering if I couldn't specify some of those individual
chores within an XML Schema document. When a certain object is
found in the event stream, according to XML Schema, Java source
could be executed, perhaps as a generated class with member
variables mapped to attributes or the values of children.

I've thought about using an XPath tracker for error reporting in
my library, which would be very simple to add at this point. It's
necessary, I think, because the document locator loses
meaning when I chain together a bunch of SAX filters.
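
One way such a tracker could look, as a rough sketch only. PathTracker,
Level, seen and pathsOf are hypothetical names, and the positional
steps (name[n]) are an assumption about what would be useful in an
error message:

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical tracker for illustration only.
class PathTracker extends DefaultHandler {
    // One level per open element: its path step plus per-name
    // counts of the children seen so far.
    private static final class Level {
        final String step;
        final Map<String, Integer> childCounts = new HashMap<>();
        Level(String step) { this.step = step; }
    }

    private final Deque<Level> stack = new ArrayDeque<>();
    private final Map<String, Integer> rootCounts = new HashMap<>();
    final List<String> seen = new ArrayList<>();

    // Builds a path such as /a[1]/b[2] from the open elements.
    String currentPath() {
        StringBuilder sb = new StringBuilder();
        // push() adds at the head, so walk from the tail (root) down.
        Iterator<Level> it = stack.descendingIterator();
        while (it.hasNext()) {
            sb.append('/').append(it.next().step);
        }
        return sb.length() == 0 ? "/" : sb.toString();
    }

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes atts) {
        Map<String, Integer> counts =
            stack.isEmpty() ? rootCounts : stack.peek().childCounts;
        int position = counts.merge(qName, 1, Integer::sum);
        stack.push(new Level(qName + "[" + position + "]"));
        seen.add(currentPath());
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        stack.pop();
    }

    // Convenience for the example: parse a string, return every path.
    static List<String> pathsOf(String xml) {
        try {
            PathTracker tracker = new PathTracker();
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), tracker);
            return tracker.seen;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

Because the path is rebuilt from element events alone, it would stay
meaningful after any number of filters, and currentPath() could be
placed in the message of a SAXException where the locator's line and
column no longer help.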

In any case, I'm reading through some of the other articles
you've been posting. This is a very interesting discussion.

Cheers.

--
Alan Gutierrez - alan@engrm.com