OASIS Mailing List Archives

 



 


 

   Another XML parsing idea? (Was: Re: [xml-dev] XML Hangover)

  • To: Michael Kay <mike@saxonica.com>, 'Pete Cordell' <petexmldev@tech-know-ware.com>, xml-dev@lists.xml.org
  • Subject: Another XML parsing idea? (Was: Re: [xml-dev] XML Hangover)
  • From: Mukul Gandhi <mukul_gandhi@yahoo.com>
  • Date: Wed, 13 Jul 2005 12:01:01 -0700 (PDT)

Today, the usual paradigm in XML parsing is to use APIs
like SAX or DOM. I was thinking of another approach to
parsing XML documents. 

Can we have a protocol (instead of an API) that an
application and an XML parser use to talk to each other?
This would make an XML parser interoperable with any
calling application. For example, we could have a
Microsoft XML parser serving a Java program's XML
parsing requests.
 
Right now we have APIs like SAX and DOM, and proprietary
Microsoft APIs. If we had a protocol, similar to HTTP,
between an application and a parser, it might help
interoperability.
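
To make the idea concrete, here is a minimal sketch in Python (the
language is my choice, nothing in the proposal depends on it). The
parser side turns SAX callbacks into text messages, and the application
side only ever reads those messages. The message format (one JSON
object per line with "event", "name", "attrs" and "data" fields) and
the names EventEncoder, parse_request and read_response are all
hypothetical, invented for illustration. The transport is left out: the
string returned by parse_request is what would travel over a socket or
an HTTP response.

```python
import json
import xml.sax


class EventEncoder(xml.sax.ContentHandler):
    """Parser side: turn SAX callbacks into language-neutral JSON messages."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def startElement(self, name, attrs):
        self.messages.append(json.dumps(
            {"event": "start", "name": name, "attrs": dict(attrs.items())}))

    def endElement(self, name):
        self.messages.append(json.dumps({"event": "end", "name": name}))

    def characters(self, content):
        # Skip whitespace-only text to keep the sketch short.
        if content.strip():
            self.messages.append(json.dumps({"event": "text", "data": content}))


def parse_request(xml_bytes):
    """Parser side: accept a document, answer with one JSON message per line."""
    encoder = EventEncoder()
    xml.sax.parseString(xml_bytes, encoder)
    return "\n".join(encoder.messages)


def read_response(wire_text):
    """Application side: decode the messages without calling any parser API."""
    return [json.loads(line) for line in wire_text.splitlines()]


events = read_response(parse_request(b"<order id='7'><item>pen</item></order>"))
for event in events:
    print(event)
```

Because the application sees only the JSON text, the parser side could
in principle be replaced by one written in any other language, which is
the interoperability the protocol idea is after.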

Is this sensible thinking? Is this idea conceptually
similar to the StAX or .NET XmlReader parsing approach?
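
For comparison, the pull style that StAX and XmlReader share can be
sketched with Python's xml.dom.pulldom, used here only as a stand-in
for those APIs. The helper name element_names is made up for the
example. The application asks for the next event in a loop instead of
receiving callbacks, but each step is still an in-process API call, not
a message exchanged over a protocol.

```python
from xml.dom import pulldom


def element_names(xml_text):
    """Pull events one at a time and collect the element names seen."""
    names = []
    for event, node in pulldom.parseString(xml_text):
        if event == pulldom.START_ELEMENT:
            names.append(node.tagName)
    return names


print(element_names("<a><b/><c>text</c></a>"))
```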

Regards,
Mukul

--- Michael Kay <mike@saxonica.com> wrote:

> The URL got truncated:
> 
> http://www.idealliance.org/proceedings/xml04/papers/111/mhk-paper.html
> 
> with ".html" at the end.
> 
> Michael Kay 
> 
> > -----Original Message-----
> > From: Mukul Gandhi [mailto:mukul_gandhi@yahoo.com]
> > Sent: 13 July 2005 10:02
> > To: Michael Kay; 'Pete Cordell'; xml-dev@lists.xml.org
> > Subject: RE: [xml-dev] XSL for non-XML input (Was: Re: [xml-dev] XML Hangover)
> > 
> > Hi Mike,
> >   I get the error
> > HTTP 404 - File not found
> > 
> > --- Michael Kay <mike@saxonica.com> wrote:
> > >
> > > http://www.idealliance.org/proceedings/xml04/papers/111/mhk-paper.htm
> > 
> > Regards,
> > Mukul
> > 
> > >  
> > > Michael Kay 
> > > 
> > >  
> > > Going further, observing the idea of using out-of-band data (e.g.
> > > schema) to provide extra information to complete 'binary XML',
> > > could XSL (with suitable front ends) work on, say, an ASN.1 encoded
> > > X.509 certificate (and ASN.1 message definition) and produce, say,
> > > a PDF output?
> > >  
> > > Not that I have a need to do that right now! I'm just interested to
> > > know whether XSL can be used as a kind of universal data
> > > translator.
> > >  
> > > Thanks,
> > >  
> > > Pete.
> > > --
> > > =============================================
> > > Pete Cordell
> > > Tech-Know-Ware Ltd
> > > -----------------------------------------------------------------
> > >                          for XML to C++ data binding visit
> > >                          http://www.tech-know-ware.com/lmx
> > >                          (or http://www.xml2cpp.com)
> > > =============================================
> > > 
> > > 
> > > ----- Original Message ----- 
> > > From: Michael Kay <mailto:mike@saxonica.com>
> > > To: 'Joe Schaffner' <mailto:schaffner.joe@gmail.com>;
> > > xml-dev@lists.xml.org 
> > > Sent: Monday, July 11, 2005 9:00 PM
> > > Subject: RE: [xml-dev] XML Hangover
> > > 
> > >  
> > > 
> > > I've been reading the XML literature. It's great. Just a few
> > > comments: 
> > >  
> > > Welcome on board. It's refreshing to get thoughtful comments from
> > > someone who's new to the game. 
> > >  
> > > XSL - XML Stylesheets is divided into two parts, XSL-T and XSL-FO.
> > >  
> > > The T part deals with templates and translation. Since HTML is
> > > valid XML, I guess I can parse my HTML using XSL-T to produce XML
> > > and vice versa. I don't understand why XSL-T refers to "nodes in an
> > > output tree". This suggests some kind of internal representation,
> > > but XML is a perfectly good representation language. Don't
> > > <templates> merely write XML text to stdout?  
> > >  
> > > No, the result tree is completely abstract; there is no suggestion
> > > of an internal representation. In fact, for many XSLT processors,
> > > the "result tree" is represented internally as a stream of events,
> > > not as a linked collection of objects in memory. This concept of
> > > writing a tree, rather than writing text, however, is extremely
> > > important. Firstly, it defines a separation of the information
> > > content of an XML document from the accidental aspects of its
> > > lexical representation - something that is sadly missing from the
> > > XML spec itself. In turn, this gives you a basis for defining a
> > > concise set of operators that are in some sense complete,
> > > composable and exhibit closure. In practical terms, it gives you
> > > the ability to write a series of transformations - a pipeline - in
> > > which the expensive steps of serializing and parsing intermediate
> > > results can be eliminated. 
> > >  
> > > Roughly, the process seems to work like this: the T processor does
> > > a recursive descent of the source XML. At each node it evaluates
> > > the set of templates. Those templates which match the name of the
> > > "current" tag are processed, in some order. The template writes
> > > text, that's why it's called a "template". The recursive descent is
> > > continued with an <apply-templates> tag inside the template. This
> > > allows you to balance output.  
> > >  
> > > It doesn't have to do a recursive descent of the source XML: that's
> > > up to the application, though a recursive descent is the most
> > > common design pattern. And it definitely doesn't write text: people
> > > who create a mental model of writing text eventually get a rude
> > > awakening, usually when they first try to tackle grouping problems.
> > >  
> > > If no matches are found, the T processor continues the descent.
> > >  
> > > There is a <template> tag (I forget what) which will select
> > > arbitrary paths in the source tree, and there are tags which
> > > iterate through the result.  
> > >  
> > > Again, it's best to think of the stylesheet as containing nodes
> > > (representing instructions) rather than tags. Consider
> > >  
> > > <xsl:element name="x"><xsl:value-of select="."/></xsl:element>
> > >  
> > > There are three tags there, but four nodes, and only two
> > > instructions. The semantics of the language are described in terms
> > > of the two instructions, not the three tags.
> > >  
> > > This will allow me to build up a result "tree" which is not a
> > > mirror image of the source, something I need to do if I'm
> > > rearranging sections of the input document. Rather than buffering
> > > intermediate structures, the T processor does multiple passes based
> > > on these tags, and creates the output on-the-fly. Cool. 
> > >  
> > >  ... .
> > >  
> > > I assume there is nothing stopping me from using XSL-T to transform
> > > my HTML to PDF, but it seems best to output XSL-FO then create a
> > > PDF using some kind of tool. What is that tool? 
> > >  
> > > It's an XSL-FO processor. Examples are FOP, RenderX, Antenna House. 
> > >  
> > > Are there FO plug-ins available for my browsers?
> > >  
> > > No, people are by-and-large using (X)HTML/CSS for the browser,
> > > XSL-FO/PDF for the printed page. 
> > >  
> > > Does this technology work? 
> > >  
> > > Absolutely yes. 
> > >  
> > > Michael Kay
> > > http://www.saxonica.com/
> > > 
> > > 
> > 
> > 
> > 
> 
> 


__________________________________________________
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around 
http://mail.yahoo.com 




 


Copyright 2001 XML.org. This site is hosted by OASIS