I think HT said that there's always an overhead if you have to cross a
thread boundary, and that they try to avoid it whenever possible. Crossing a
process or machine boundary would be far worse.
Michael Kay
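
The contrast between calling a parser through an API and reaching it through a message exchange can be sketched as follows. This is only an illustration under stated assumptions: the function names (parse_direct, parser_worker, parse_via_worker) and the queue-based "protocol" are hypothetical, not anything described in the thread. The point is that the second path does the same parsing work plus a hand-off to another thread and back; a process or machine boundary would add serialization and transport on top.

```python
import queue
import threading
import xml.etree.ElementTree as ET


def parse_direct(text):
    """API style: an ordinary in-thread function call."""
    return ET.fromstring(text).tag


def parser_worker(requests):
    """Protocol style: a parser that serves requests from a queue."""
    while True:
        item = requests.get()
        if item is None:  # hypothetical shutdown message
            break
        text, reply = item
        reply.put(ET.fromstring(text).tag)


def parse_via_worker(text, requests):
    """Send a request across the thread boundary and wait for the reply."""
    reply = queue.Queue(maxsize=1)
    requests.put((text, reply))
    return reply.get()


requests = queue.Queue()
worker = threading.Thread(target=parser_worker, args=(requests,), daemon=True)
worker.start()

doc = "<order><line/></order>"
# Both routes give the same answer; only the cost of getting it differs.
same = parse_direct(doc) == parse_via_worker(doc, requests)

requests.put(None)
worker.join()
```

No timings are given here because they depend heavily on the platform and the document size.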
> -----Original Message-----
> From: Bob Foster [mailto:bob@objfac.com]
> Sent: 13 July 2005 22:18
> To: Michael Kay
> Cc: xml-dev@lists.xml.org
> Subject: Re: [xml-dev] Another XML parsing idea? Was: Re:
> [xml-dev] XML Hangover)
>
> I don't know the internals (maybe someone can comment) but I believe
> Markup Technology has a protocol for passing PSVI around. It seems
> pretty darn fast.
>
> Bob Foster
>
> Michael Kay wrote:
> > A protocol implies sending and receiving messages typically across a
> > process boundary or even a machine boundary. This would raise the
> > cost of XML parsing by a couple of orders of magnitude.
> >
> > Michael Kay
> >
> >
> >>-----Original Message-----
> >>From: Mukul Gandhi [mailto:mukul_gandhi@yahoo.com]
> >>Sent: 13 July 2005 20:01
> >>To: Michael Kay; 'Pete Cordell'; xml-dev@lists.xml.org
> >>Subject: [xml-dev] Another XML parsing idea? Was: Re:
> >>[xml-dev] XML Hangover)
> >>
> >>Today, we have a paradigm in XML parsing of using APIs
> >>like SAX or DOM. I was thinking of another approach to
> >>parse XML documents.
> >>
> >>Can we have a protocol (instead of an API) that will talk
> >>between an application and the XML parser? This would
> >>make an XML parser interoperable with the calling
> >>application. For example, we could have a
> >>Microsoft XML parser serving a Java program's XML
> >>parsing request.
> >>
> >>Right now we have APIs like SAX and DOM and proprietary
> >>Microsoft APIs. Had we had some protocol similar to
> >>HTTP that talked between an application and a parser, it
> >>might help interoperability.
> >>
> >>Is this sensible thinking? Is this idea conceptually
> >>similar to StAX or .NET XmlReader parsing approach?
> >>
> >>Regards,
> >>Mukul
> >>
> >>--- Michael Kay <mike@saxonica.com> wrote:
> >>
> >>
> >>>The URL got truncated
> >>>
> >>>http://www.idealliance.org/proceedings/xml04/papers/111/mhk-paper.html
> >>>
> >>>with ".html" at the end.
> >>>
> >>>Michael Kay
> >>>
> >>>
> >>>>-----Original Message-----
> >>>>From: Mukul Gandhi [mailto:mukul_gandhi@yahoo.com]
> >>>>Sent: 13 July 2005 10:02
> >>>>To: Michael Kay; 'Pete Cordell'; xml-dev@lists.xml.org
> >>>>Subject: RE: [xml-dev] XSL for non-XML input (Was: Re:
> >>>>[xml-dev] XML Hangover)
> >>>>
> >>>>Hi Mike,
> >>>> I get error
> >>>>HTTP 404 - File not found
> >>>>
> >>>>--- Michael Kay <mike@saxonica.com> wrote:
> >>>>
> >>>>>
> >>>>>http://www.idealliance.org/proceedings/xml04/papers/111/mhk-paper.htm
> >>>>>
> >>>>Regards,
> >>>>Mukul
> >>>>
> >>>>
> >>>>>
> >>>>>Michael Kay
> >>>>>
> >>>>>
> >>>>>Going further, observing the idea of using out of
> >>>>>band data (e.g. schema) to provide extra information
> >>>>>to complete 'binary XML', could XSL (with suitable
> >>>>>front ends) work on say an ASN.1 encoded X.509
> >>>>>certificate (and ASN.1 message definition) and
> >>>>>produce, say, a PDF output?
> >>>
> >>>>>
> >>>>>Not that I have a need to do that right now! I'm
> >>>>>just interested to know whether XSL can be used as
> >>>>>a kind of universal data translator.
> >>>>>
> >>>>>Thanks,
> >>>>>
> >>>>>Pete.
> >>>>>--
> >>>>>=============================================
> >>>>>Pete Cordell
> >>>>>Tech-Know-Ware Ltd
> >>>>>
> >>>>
> >>-----------------------------------------------------------------
> >>
> >>>>> for XML to C++ data binding visit
> >>>>> http://www.tech-know-ware.com/lmx
> >>>>> (or http://www.xml2cpp.com)
> >>>
> >>>>>=============================================
> >>>>>
> >>>>>
> >>>>>----- Original Message -----
> >>>>>From: Michael Kay <mailto:mike@saxonica.com>
> >>>>>To: 'Joe Schaffner' <mailto:schaffner.joe@gmail.com> ;
> >>>>>xml-dev@lists.xml.org
> >>>>>Sent: Monday, July 11, 2005 9:00 PM
> >>>>>Subject: RE: [xml-dev] XML Hangover
> >>>>>
> >>>>>
> >>>>>
> >>>>>I've been reading the XML literature. It's great.
> >>>>>Just a few comments:
> >>>>>
> >>>>>Welcome on board. It's refreshing to get thoughtful
> >>>>>comments from someone who's new to the game.
> >>>>>
> >>>>>XSL - XML Stylesheets is divided into two parts,
> >>>>>XSL-T and XSL-FO.
> >>>>>
> >>>>>The T part deals with templates and translation.
> >>>>>Since HTML is valid XML, I guess I can parse my HTML
> >>>>>using XSL-T to produce XML and vice versa. I don't
> >>>>>understand why XSL-T refers to "nodes in an output
> >>>>>tree". This suggests some kind of internal
> >>>>>representation, but XML is a perfectly good
> >>>>>representation language. Don't <templates> merely
> >>>>>write XML text to stdout?
> >>>>>
> >>>>>No, the result tree is completely abstract, there is
> >>>>>no suggestion of an internal representation. In fact,
> >>>>>for many XSLT processors, the "result tree" is
> >>>>>represented internally as a stream of events, not as
> >>>>>a linked collection of objects in memory. This concept
> >>>>>of writing a tree, rather than writing text, however
> >>>>>is extremely important. Firstly, it defines a
> >>>>>separation of the information content of an XML
> >>>>>document from the accidental aspects of its lexical
> >>>>>representation - something that is sadly missing from
> >>>>>the XML spec itself. In turn, this gives you a basis
> >>>>>for defining a concise set of operators that are in
> >>>>>some sense complete, composable and exhibit closure.
> >>>>>In practical terms, it gives you the ability to write
> >>>>>a series of transformations - a pipeline - in which
> >>>>>the expensive steps of serializing and parsing
> >>>>>intermediate results can be eliminated.
> >>>>>
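
The pipeline point above can be illustrated with a small sketch. It is not XSLT and not any particular processor's design: the two stages (rename_items, add_count) are hypothetical transformations written with Python's xml.etree.ElementTree, chosen only to show that when each stage consumes and produces a tree, the stages can be composed directly, and the serialize-then-reparse step between them becomes optional.

```python
import xml.etree.ElementTree as ET


def rename_items(tree):
    """Stage 1 (hypothetical): new tree with <item> renamed to <entry>."""
    out = ET.Element(tree.tag)
    for child in tree:
        name = "entry" if child.tag == "item" else child.tag
        entry = ET.SubElement(out, name)
        entry.text = child.text
    return out


def add_count(tree):
    """Stage 2 (hypothetical): new tree carrying a count attribute."""
    out = ET.Element(tree.tag, {"count": str(len(tree))})
    out.extend(list(tree))
    return out


def pipeline_trees(source):
    """Stages joined tree-to-tree: nothing is serialized in between."""
    return add_count(rename_items(source))


def pipeline_text(source):
    """Same stages joined by text: serialize, then parse again."""
    text = ET.tostring(rename_items(source), encoding="unicode")
    return add_count(ET.fromstring(text))


source = ET.fromstring("<list><item>a</item><item>b</item></list>")
direct = ET.tostring(pipeline_trees(source), encoding="unicode")
via_text = ET.tostring(pipeline_text(source), encoding="unicode")
print(direct == via_text)
```

Both routes yield the same final document; the tree-to-tree route simply skips work that carries no information.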
> >>>>>Roughly, the process seems to work like this: the T
> >>>>>processor does a recursive descent of the source XML.
> >>>>>At each node it evaluates the set of templates. Those
> >>>>>templates which match the name of the "current" tag
> >>>>>are processed, in some order. The template writes
> >>>>>text, that's why it's called a "template". The
> >>>>>recursive descent is continued with an
> >>>>><apply-templates> tag inside the template. This allows
> >>>>>you to balance output.
> >>>>>
> >>>>>It doesn't have to do a recursive descent of the
> >>>>>source XML: that's up to the application, though a
> >>>>>recursive descent is the most common design pattern.
> >>>>>And it definitely doesn't write text: people who
> >>>>>create a mental model of writing text eventually get
> >>>>>a rude awakening, usually when they first try to
> >>>>>tackle grouping problems.
> >>>>>
> >>>>>If no matches are found, the T processor continues
> >>>>>the descent.
> >>>>>
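
A rough sketch of the processing model discussed above, written in Python rather than XSLT and greatly simplified: rules are looked up by element name, a matching rule returns result nodes (not text), and when nothing matches the descent simply continues into the children. The names apply_templates, section_rule, title_rule and RULES are hypothetical, and the fallback here ignores text nodes, which real XSLT built-in rules do not.

```python
import xml.etree.ElementTree as ET


def apply_templates(node, rules):
    """Return the list of result nodes produced for one source node."""
    rule = rules.get(node.tag)
    if rule is not None:
        return rule(node, rules)
    # No rule matched: keep descending (simplified built-in behaviour).
    results = []
    for child in node:
        results.extend(apply_templates(child, rules))
    return results


def section_rule(node, rules):
    """Build a <div> node and fill it by continuing the descent."""
    div = ET.Element("div")
    for child in node:
        div.extend(apply_templates(child, rules))
    return [div]


def title_rule(node, rules):
    """Build an <h1> node holding the title's text."""
    h1 = ET.Element("h1")
    h1.text = node.text
    return [h1]


RULES = {"section": section_rule, "title": title_rule}

source = ET.fromstring(
    "<doc><wrapper><section><title>Hi</title></section></wrapper></doc>"
)
result = apply_templates(source, RULES)
# Serialization happens once, at the very end, if it happens at all.
print(ET.tostring(result[0], encoding="unicode"))
```

Because each rule hands back nodes, a caller is free to sort, group or rearrange them before anything is written out, which is where a text-writing mental model tends to break down.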
> >>>>>There is a <template> tag (I forget what) which will
> >>>>>select arbitrary paths in the source tree, and there
> >>>>>are tags which iterate through the result.
> >>>>>
> >>>>>Again, it's best to think of the stylesheet as
> >>>>>containing nodes (representing instructions) rather
> >>>>>than tags. Consider
> >>>>>
> >>>>><xsl:element name="x"><xsl:value-of select="."/></xsl:element>
> >>>>>
> >>>>>There are three tags there, but four nodes, and only
> >>>>>two instructions. The semantics of the language are
> >>>>>described in terms of the two instructions, not the
> >>>>>three tags.
> >>>>>
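
The three counts can be checked mechanically. The sketch below assumes one plausible reading of "four nodes", namely two element nodes plus two attribute nodes; the xmlns:xsl declaration is added only so the fragment parses on its own and is not counted, since ElementTree does not report namespace declarations as attributes. The helper names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

XSLT_NS = "http://www.w3.org/1999/XSL/Transform"

# The fragment from the message, with a namespace declaration added
# so that it is well-formed when parsed in isolation.
SNIPPET = (
    f'<xsl:element xmlns:xsl="{XSLT_NS}" name="x">'
    '<xsl:value-of select="."/></xsl:element>'
)


def count_tags(text):
    """Lexical view: start tags, end tags and empty-element tags."""
    return len(re.findall(r"<[^>]+>", text))


def count_nodes(root):
    """Tree view (assumed reading): element nodes plus attribute nodes."""
    elements = list(root.iter())
    return len(elements) + sum(len(e.attrib) for e in elements)


def count_instructions(root):
    """Language view: elements in the XSLT namespace."""
    prefix = "{" + XSLT_NS + "}"
    return sum(1 for e in root.iter() if e.tag.startswith(prefix))


root = ET.fromstring(SNIPPET)
print(count_tags(SNIPPET), count_nodes(root), count_instructions(root))
# prints: 3 4 2
```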
> >>>>>This will allow me to build up a result "tree" which
> >>>>>is not a mirror image of the source, something I need
> >>>>>to do if I'm rearranging sections of the input
> >>>>>document. Rather than buffering intermediate
> >>>>>structures, the T processor does multiple passes
> >>>>>based on these tags, and creates the output
> >>>>>on-the-fly. Cool.
> >>>>>
> >>>>> ... .
> >>>>>
> >>>>>I assume there is nothing stopping me from using
> >>>>>XSL-T to transform my HTML to PDF, but it seems best
> >>>>>to output XSL-FO then create a PDF using some kind
> >>>>>of tool. What is that tool?
> >>>>>
> >>>>>It's an XSL-FO processor. Examples are FOP, RenderX,
> >>>>>Antenna House.
> >>>>>
> >>>>>Are there FO plug-ins available for my browsers?
> >>>>>
> >>>>>No, people are by-and-large using (X)HTML/CSS for
> >>>>>the browser, XSL-FO/PDF for the printed page.
> >>>>>
> >>>>>Does this technology work?
> >>>>>
> >>>>>Absolutely yes.
> >>>>>
> >>>>>Michael Kay
> >>>>>http://www.saxonica.com/
>
>