xml-dev - Re: [xml-dev] Parsed Representation of an XML


To: "Chiusano Joseph" <chiusano_joseph@bah.com>,Ram Menon <menon_dev@hotmail.com>
Subject: Re: [xml-dev] Parsed Representation of an XML
From: Kevin Jones <kjouk@yahoo.co.uk>
Date: Thu, 6 May 2004 10:12:31 +0100
Cc: xml-dev@lists.xml.org
In-reply-to: <4098EB64.852ADDC6@bah.com>
References: <BAY15-F27JkUwWpSWer0000358f@hotmail.com> <4098EB64.852ADDC6@bah.com>
Reply-to: kjouk@yahoo.co.uk
User-agent: KMail/1.5.4


There is a practical problem with doing this for XSLT: because of the 
template and default processing rules, the access path would often be very 
general unless you also have a schema for the input. There has been some work 
on XQuery 'projections' of documents, which is roughly equivalent and appears 
to give good results. The other issue you might have, depending on 
environment, is that the processing to be applied to a document is not 
determined until after it has been parsed and inspected, say via XPath. 
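
To make the projection idea concrete, here is a minimal sketch, assuming the 
access pattern can be written as a list of simple slash-separated element 
paths. The function name project and the keep_paths parameter are made up for 
illustration and are not taken from any XQuery projection work. It ignores 
namespaces and mixed-content tail text, and it still hands every element to 
the parser; it only avoids keeping the unwanted ones.

```python
import io
import xml.etree.ElementTree as ET

def project(source, keep_paths):
    """Stream `source` and build a tree holding only nodes on or under keep_paths.

    keep_paths is a hypothetical access description: simple element paths
    from the root, e.g. "order/customer/name".
    """
    keep = [tuple(p.strip("/").split("/")) for p in keep_paths]
    path = []    # tag names from the root down to the current element
    built = []   # projected nodes, parallel to `path` (None means pruned)
    root = None
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            path.append(elem.tag)
            cur = tuple(path)
            # Needed if it is an ancestor of a kept path, or lies at/under one.
            needed = any(k[:len(cur)] == cur or cur[:len(k)] == k for k in keep)
            node = None
            if needed:
                if built:
                    node = ET.SubElement(built[-1], elem.tag, dict(elem.attrib))
                else:
                    node = root = ET.Element(elem.tag, dict(elem.attrib))
            built.append(node)
        else:
            node = built.pop()
            cur = tuple(path)
            path.pop()
            # Copy text only for nodes at or under a kept path.
            if node is not None and any(cur[:len(k)] == k for k in keep):
                node.text = elem.text
            elem.clear()  # drop the parser's own copy of this element's content
    return root
```

For example, project(io.BytesIO(data), ["order/customer/name"]) returns a tree 
that holds only the order, customer and name elements.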

On the packaging side there are many 'better' models than DOM. The difficult 
bit of the problem is finding a model, or a combination of models, that is 
better for the type of processing you want to perform. This is also a 
function of how hard you are prepared to work on your processing algorithms 
to make them work over specific models. A good example is C14N over streamed 
data: much harder to do than over an object model, but with a potentially 
good performance payback if you do it.
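
A much smaller illustration than C14N of the same trade-off, as a sketch 
only: summing the text of every price element, once over an object model and 
once over a SAX event stream. The element name price and the function names 
are made up for the example. The streamed version has to carry its own state 
between events, which is the extra work referred to above.

```python
import xml.sax
import xml.etree.ElementTree as ET

def total_tree(data):
    """Object-model version: build the whole tree, then query it."""
    root = ET.fromstring(data)
    return sum(float(p.text) for p in root.iter("price"))

class PriceHandler(xml.sax.ContentHandler):
    """Streamed version: no tree, so the handler tracks where it is."""

    def __init__(self):
        super().__init__()
        self.total = 0.0
        self._buf = None  # collects character data while inside <price>

    def startElement(self, name, attrs):
        if name == "price":
            self._buf = []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self._buf is not None:
            self._buf.append(content)

    def endElement(self, name):
        if name == "price":
            self.total += float("".join(self._buf))
            self._buf = None

def total_stream(data):
    handler = PriceHandler()
    xml.sax.parseString(data, handler)
    return handler.total
```

Both functions take the document as bytes and return the same total; only the 
streamed one avoids holding the whole document in memory.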

Kev.

On Wednesday 05 May 2004 14:25, Chiusano Joseph wrote:
> I'm certain that this is already being done in various products, by some
> vendors that are no doubt on this listserv but cannot admit to this due
> to the proprietary techniques of their products.
>
> Kind Regards,
> Joe Chiusano
> Booz | Allen | Hamilton
> Strategy and Technology Consultants to the World
>
> Ram Menon wrote:
> > Hi All,
> >
> >   At a high level, XML Processing could involve the following steps.
> >    1) Read the XML file
> >    2) Parse the XML to an in-memory representation
> >    3) Use the Parsed Representation to extract values, format values
> > through XSLT, etc.
> >
> > What I wanted to know is why we do not have a parsed
> > representation based on the access pattern and usage of the parsed
> > document?
> > For example, the XSLT might use the document to retrieve three values from
> > one particular subtree, or maybe process all the children at a particular
> > depth within a subtree.
> >
> > Why not have another input to the parser, namely an abstract
> > representation of the access pattern, and then have the in-memory document
> > optimized for that particular pattern? [i.e. optimal in terms of
> > access time and memory usage].
> >
> > i.e.
> > XML file + Access-Metadata -------**XML Parser** -->Optimal Internal
> > Representation
> >
> > The DOM internal representation is fundamentally a single instance of a
> > particular kind of "Packing" of the XML. This form of "Packing" may not
> > be beneficial for certain use-cases. Why not think out of the box and
> > come up with some different sort of packing that allows all the required
> > nodes to be "close" to each other, to facilitate fast traversal, and
> > maybe lower memory usage, since the parsing generates only a partial
> > document containing just what is required?
> > For example, one particular scenario might be the "inversion" of the XML
> > structure [I am just choosing this ad hoc]; i.e. the "supposed
> > to be" leaf nodes of the parsed tree appear as the top-level elements
> > within the parsed representation, and each of them has a reference [in
> > the form of some attribute or something along those lines] to its parent;
> > very similar to viewing an n-ary tree reversed. Another
> > form of packing could be a "cube" like packing, where we build a
> > "multi-dimensional data structure" based on the structure of the XML
> > content. The cube can be accessed from all six of its faces, which might
> > correspond to the principally accessed members within the document. All
> > these are a subset of the possible structures that could be generated as
> > a result of parsing the XML. Each of these structures has definite
> > traversal patterns and costs.
> >
> > This might seem a very vague idea, but it would be good if somebody could
> > build on it.
> >
> > rgds,
> > Ram Menon
> >
> > -----------------------------------------------------------------
> > The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
> > initiative of OASIS <http://www.oasis-open.org>
> >
> > The list archives are at http://lists.xml.org/archives/xml-dev/
> >
> > To subscribe or unsubscribe from this list use the subscription
> > manager: <http://www.oasis-open.org/mlmanage/index.php>
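
As a rough sketch of the "inverted" packing described in the quoted message, 
assuming it means a flat list of leaf nodes that each hold a reference up to 
their parent: the Node class and the invert and path_to_root functions below 
are invented for illustration and are not any existing parser's API. The tree 
is built with ElementTree first for brevity, so this shows the shape of the 
structure, not the memory saving.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    tag: str
    text: Optional[str]
    parent: Optional["Node"]  # upward link; None for the document root

def invert(xml_text):
    """Return the leaf nodes of the document, each linked upward to its ancestors."""
    leaves = []

    def walk(elem, parent):
        node = Node(elem.tag, elem.text, parent)
        children = list(elem)
        if not children:
            leaves.append(node)  # leaves become the top-level entries
        for child in children:
            walk(child, node)

    walk(ET.fromstring(xml_text), None)
    return leaves

def path_to_root(node):
    """Follow parent references from a leaf up to the root, collecting tags."""
    tags = []
    while node is not None:
        tags.append(node.tag)
        node = node.parent
    return tags
```

Access that starts from leaf values is then a scan of one flat list, and the 
cost of finding a leaf's context is the length of its parent chain.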
