   Re: [xml-dev] parser models


Aleksander Slominski <aslom@cs.indiana.edu> wrote:
| Arjun Ray wrote:

| i believe that XML databinding (such as in your Element/Content framework)

Although it could be used for "databinding", that isn't the point of the
framework at all, as a matter of fact.  You must have missed my "names are
instrumental" rants. :-) 

| can be expressed easier using pull parsing and is both easier to understand
| and to maintain (debug) 

In the "databinding" applications I've seen, the structures rarely go more
than three significant levels deep, have no recursive structures in the
data, and tend to consist of sequences in fixed/predictable order.  It's
all very flat and a pull approach can work well.

| why should i a priori limit my options only to OO

I have no idea.  I did say, whatever floats your boat ;-)

| i do not think it is that simple. take for example SAX parsing code
| generated by JaxMe (see *Handler.java)
| 
|   http://www.extreme.indiana.edu/~aslom/xml/databinding/jaxme-xmlpull/jaxme_generated/phone/

You're right, it's very cluttered.  But I wasn't talking about SAX,
either.

| and compare it with code that is using xml pull parsing
| 
|   http://www.extreme.indiana.edu/~aslom/xml/databinding/jaxme-xmlpull/phone/
| 
| they are both equivalent in functionality but which is easier to understand?

Neither one is particularly easy or difficult.  Think "databinding", watch
for field assignments, and the rest is framework overhead.  If you know
the framework - always a big if! ;-) - the overhead becomes scrutable as
you've learned to recognize the boilerplate ("yeah, that's how it's
supposed to go").

|> So, nextTag ignores everything until the next starttag event?  Shouldn't
|> you have a switch-block here for the general case (whitespace?  processing
|> instruction?)
| 
| i think that by using default SAX2 content handler you also ignore it ...

Yes. I caught up on the XmlPullParser docs and found out that nextTag()
and nextText() are filters around the "real" event puller.  My question
was already answered.  Sorry about that.
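
To make the "filters around the real event puller" point concrete, here is a
minimal sketch using a toy parser over pre-tokenized events. It is not the
XmlPullParser class: ToyPullParser, Event, Kind and value() are hypothetical
names, and the behaviour is only an approximation of what the XmlPullParser
docs describe for nextTag() and nextText().

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Toy stand-in for a pull parser. This is NOT the XmlPullParser class;
// every name here is hypothetical and the input is already tokenized.
class ToyPullParser {
    enum Kind { START_TAG, END_TAG, TEXT, END_DOCUMENT }

    static final class Event {
        final Kind kind;
        final String value; // tag name, or character data for TEXT
        Event(Kind kind, String value) { this.kind = kind; this.value = value; }
    }

    private final Iterator<Event> events;
    private Event current;

    ToyPullParser(List<Event> events) { this.events = events.iterator(); }

    // The "real" event puller: returns whatever event comes next.
    Kind next() {
        current = events.hasNext() ? events.next() : new Event(Kind.END_DOCUMENT, "");
        return current.kind;
    }

    String value() { return current.value; }

    // Filter over next(): skip a whitespace-only TEXT event, then insist on a tag.
    Kind nextTag() {
        Kind kind = next();
        if (kind == Kind.TEXT && current.value.trim().isEmpty()) {
            kind = next();
        }
        if (kind != Kind.START_TAG && kind != Kind.END_TAG) {
            throw new IllegalStateException("expected a tag, got " + kind);
        }
        return kind;
    }

    // Filter over next(): called on a START_TAG, returns the element's
    // text and leaves the parser on the following END_TAG.
    String nextText() {
        if (current == null || current.kind != Kind.START_TAG) {
            throw new IllegalStateException("nextText() needs a START_TAG");
        }
        if (next() == Kind.END_TAG) {
            return ""; // empty element
        }
        if (current.kind != Kind.TEXT) {
            throw new IllegalStateException("expected text, got " + current.kind);
        }
        String text = current.value;
        if (next() != Kind.END_TAG) {
            throw new IllegalStateException("expected an end tag after text");
        }
        return text;
    }

    public static void main(String[] args) {
        ToyPullParser p = new ToyPullParser(Arrays.asList(
            new Event(Kind.TEXT, "\n  "),
            new Event(Kind.START_TAG, "name"),
            new Event(Kind.TEXT, "Ann"),
            new Event(Kind.END_TAG, "name")));
        p.nextTag(); // skips the leading whitespace
        System.out.println(p.value() + " = " + p.nextText());
    }
}
```

Both convenience methods are written purely in terms of next(), which is why
the switch-block for "the general case" only matters to code that calls
next() directly.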

| (and it is also missing from the Element interface)

They don't belong there anyway (rather, in the Content interface), but I
also left a bushel of details out, for simplicity.  In full detail the
complete set of interfaces caters to a number of SGML-isms too.  This is
an as-yet half-cooked redevelopment in Java of an older system in Perl. 
 
|> Here, you're locked into addElement()-ing into Vector rows.
| 
| i have only added Vector as an example of doing something useful
| with "tr" elements (there was nothing in your example - it seems that
| HtmlTr object instances were constructed and then discarded ...)

All the magic is supposed to happen in the endChild(child) call.  I did
say my example was cheesy - the point was to focus on how the endChild()
method fits into the framework.  Writing an app exactly this way has a
number of avoidable problems (such as the one I was complaining about!):
the better approach is to separate the implementations of Element and
Content interfaces.  
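
A minimal sketch of what that separation might look like, under heavy
assumptions: only the names Element, Content and endChild(child) come from
this thread. The method signatures, the value() hook, and the Td, Tr,
TextBuffer and Discard classes are all hypothetical, invented to show the
shape of the idea rather than the actual framework.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical shapes: only the names Element, Content and endChild(child)
// appear in the thread; everything else is assumed for illustration.
interface Content {
    void addText(String text);
}

interface Element {
    Content content();             // where character data for this element goes
    void endChild(Element child);  // called once a child element is complete
    Object value();                // what this element hands to its parent
}

// Content implementation that keeps character data.
final class TextBuffer implements Content {
    private final StringBuilder buf = new StringBuilder();
    public void addText(String text) { buf.append(text); }
    @Override public String toString() { return buf.toString(); }
}

// Content implementation that drops character data (e.g. inter-tag whitespace).
final class Discard implements Content {
    public void addText(String text) { }
}

final class Td implements Element {
    private final Content content = new TextBuffer();
    public Content content() { return content; }
    public void endChild(Element child) { } // no element children in this sketch
    public Object value() { return content.toString(); }
}

final class Tr implements Element {
    private final List<Object> cells = new ArrayList<>();
    private final Content content = new Discard();
    public Content content() { return content; }
    // The row is not locked into any container type chosen by the parser:
    // it decides here what to do with each finished child.
    public void endChild(Element child) { cells.add(child.value()); }
    public Object value() { return cells; }
}

final class EndChildDemo {
    // Plays the role of the parser driving the handlers for
    // <tr><td>a</td> <td>b</td></tr>.
    static Object run() {
        Tr tr = new Tr();
        Td a = new Td();
        a.content().addText("a");
        tr.endChild(a);
        tr.content().addText(" "); // whitespace between cells is discarded
        Td b = new Td();
        b.content().addText("b");
        tr.endChild(b);
        return tr.value();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Swapping TextBuffer for Discard (or any other Content) changes how character
data is treated without touching the endChild() logic, and vice versa.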

|> I just find a push API more amenable to separation of functions.
| 
| i am not sure what that means. what are the functions you have in
| mind?

By "functions" I didn't mean function/method calls.  I meant "things to be
done and variations thereof".  Lots of classes/objects, polymorphism (for
dispatch - no switch(){} logic!), and subclassing for customization.  It
isn't everyone's ticket.

Have you seen Oleg Kiselyov's foldts recursion scheme?

 http://pobox.com/~oleg/ftp/papers/XML-parsing.ps.gz

Passing "seeds" up and down a tree is similar to the patterns I'm trying
to develop. 
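
For readers who want the seed-passing spelled out, here is a rough Java
rendering of the foldts idea. The paper's version is in Scheme; the Tree
type, the Up interface and the serialize() helper below are hypothetical and
only illustrate the scheme: fdown produces the seed handed to the children,
fhere folds a leaf into the seed, and fup combines the parent's seed with
the seed returned by the last child.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.BiFunction;

// Hypothetical tree type: an element with children, or a text leaf.
final class Tree {
    final String label;     // element name, or the characters of a text leaf
    final boolean isText;
    final List<Tree> kids;  // always empty for a text leaf

    private Tree(String label, boolean isText, List<Tree> kids) {
        this.label = label;
        this.isText = isText;
        this.kids = kids;
    }

    static Tree elem(String name, Tree... kids) {
        return new Tree(name, false, Arrays.asList(kids));
    }

    static Tree text(String chars) {
        return new Tree(chars, true, Collections.<Tree>emptyList());
    }
}

// fup needs three arguments, so it gets its own functional interface.
interface Up<S> {
    S apply(S parentSeed, S lastKidSeed, Tree node);
}

final class Foldts {
    static <S> S foldts(BiFunction<S, Tree, S> fdown, Up<S> fup,
                        BiFunction<S, Tree, S> fhere, S seed, Tree tree) {
        if (tree.isText) {
            return fhere.apply(seed, tree);        // leaf: fold it into the seed
        }
        S kidSeed = fdown.apply(seed, tree);       // seed passed down to the children
        for (Tree kid : tree.kids) {
            kidSeed = foldts(fdown, fup, fhere, kidSeed, kid); // thread it left to right
        }
        return fup.apply(seed, kidSeed, tree);     // seed passed back up
    }

    // Example use: the seed is the markup written so far.
    static String serialize(Tree tree) {
        return Foldts.<String>foldts(
            (seed, node) -> seed + "<" + node.label + ">",
            (parentSeed, kidSeed, node) -> kidSeed + "</" + node.label + ">",
            (seed, leaf) -> seed + leaf.label,
            "", tree);
    }

    public static void main(String[] args) {
        Tree row = Tree.elem("tr",
            Tree.elem("td", Tree.text("a")),
            Tree.elem("td", Tree.text("b")));
        System.out.println(serialize(row));
    }
}
```

The three functions play roughly the roles of start-element, end-element and
character-data callbacks in a push API, except that state travels in the seed
instead of in mutable handler fields.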





 
