xml-dev - Re: [xml-dev] JITTs and DOM

To: LMNL-DEV <lmnl-dev@lmnl.org>
Subject: Re: [xml-dev] JITTs and DOM
From: Gavin Thomas Nicol <gtn@rbii.com>
Date: Fri, 11 Oct 2002 10:28:51 -0400
Cc: <xml-dev@lists.xml.org>
In-reply-to: <6196993018.20021011125630@jenitennison.com>
Organization: Red Bridge Interactive, Inc.
References: <200210101305.JAA29379@mail2.reutershealth.com> <3DA69F39.3040302@emory.edu> <6196993018.20021011125630@jenitennison.com>
Reply-to: gtn@rbii.com

On Friday 11 October 2002 07:56 am, Jeni Tennison wrote:
> I understood that the *output* would be plain text, but I thought that
> the *input* would be marked-up text. This wasn't the case in the
> samples that you were using for your observations. I did see that you
> characterised them as "observations" and said that you would do more
> investigation, I just didn't want you or anyone else to get too
> hopeful about 30x speedup on the basis of these particular
> observations.

The samples are a demonstration only in that they create smallish DOM trees 
from potentially largish ones (i.e. certain markup is suppressed). This would 
be like putting a SAX filter inline with structural stop lists (if you will).
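
To make the analogy concrete, here is a minimal sketch in Python of a SAX 
filter with a stop list. It is not the code behind the samples; the names 
StopListFilter, NameCollector and element_names are made up for illustration, 
and the "builder" only records element names where a real one would construct 
DOM nodes.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.saxutils import XMLFilterBase


class StopListFilter(XMLFilterBase):
    """Drop every element named in the stop list, plus its whole subtree."""

    def __init__(self, parent, stop_names):
        super().__init__(parent)
        self._stop = set(stop_names)
        self._depth = 0  # > 0 while inside a suppressed subtree

    def startElement(self, name, attrs):
        if self._depth or name in self._stop:
            self._depth += 1
            return
        super().startElement(name, attrs)

    def endElement(self, name):
        if self._depth:
            self._depth -= 1
            return
        super().endElement(name)

    def characters(self, content):
        if not self._depth:
            super().characters(content)


class NameCollector(ContentHandler):
    """Stand-in for a DOM builder: records each element it would construct."""

    def __init__(self):
        super().__init__()
        self.elements = []

    def startElement(self, name, attrs):
        self.elements.append(name)


def element_names(xml_bytes, stop_names=()):
    """Parse xml_bytes through the filter and return the surviving element names."""
    builder = NameCollector()
    reader = StopListFilter(xml.sax.make_parser(), stop_names)
    reader.setContentHandler(builder)
    reader.parse(io.BytesIO(xml_bytes))
    return builder.elements
```

With a stop list of {"note"}, the downstream handler never sees the note 
element or anything inside it, so no objects are built for that part of the 
document even though the parser still reads it.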

As I noted, the speedup is most likely because the cost of building a DOM is 
usually in object construction, rather than in parsing.

> The other thing that I think is promising about the JITTs approach is
> the ability to parse just the bits of the document that you're
> interested in, on the fly, during processing. A DOM implementation
> that did this behind the scenes could be very effective. (I'm sure
> that native XML databases / content management systems do this kind of
> thing all the time; I don't know if any in-memory DOM implementations
> do, or if it's been tried and for some reason rejected?)

This has been done (lazy evaluation) in the past. From what I saw, the 
performance gain wasn't worth the complexity.
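
For readers unfamiliar with the idea, a rough Python sketch of lazy evaluation 
follows. It assumes the simplest possible case, a single fragment whose source 
text is kept around and parsed only on first access; LazyFragment is a 
hypothetical name, and a real lazy DOM would have to defer work per subtree, 
which is where the complexity comes from.

```python
import xml.etree.ElementTree as ET


class LazyFragment:
    """Hold raw XML text and parse it only when the tree is first requested."""

    def __init__(self, source):
        self._source = source
        self._root = None
        self.parse_count = 0  # how many times the source was actually parsed

    @property
    def root(self):
        if self._root is None:
            # First access: pay the parsing and object-construction cost now.
            self._root = ET.fromstring(self._source)
            self.parse_count += 1
        return self._root
```

Creating a LazyFragment costs almost nothing; the tree is built once, on the 
first read of .root, and reused afterwards.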

Part of the value of ARA is that it was explicitly designed to support parallel 
parsing of documents. I'm not sure that JITT can be used in quite the same 
way... or at least it'd be more complex because the implicit assumption 
is that you are operating in the context of a tree.
