Hi All,
At a high level, XML processing could involve the following steps:
1) Read the XML file
2) Parse the XML into an in-memory representation
3) Use the parsed representation to extract values, format values through
XSLT, and so on.
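The three steps above can be sketched roughly as follows. This is only a minimal illustration using Python's standard xml.etree.ElementTree; the sample document, the element names and the process() function are assumptions made up for the example.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for illustration.
XML_BYTES = b"""<orders>
  <order id="1"><item>pen</item><price>2.50</price></order>
  <order id="2"><item>ink</item><price>7.00</price></order>
</orders>"""


def process(source):
    # Steps 1 and 2: read the XML and parse it into an in-memory tree.
    tree = ET.parse(source)
    root = tree.getroot()
    # Step 3: use the parsed representation to extract values.
    return [
        (order.get("id"), order.findtext("item"), float(order.findtext("price")))
        for order in root.iter("order")
    ]


result = process(io.BytesIO(XML_BYTES))
```

Note that the whole tree is built in step 2 even though step 3 only touches a few values.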
What I wanted to know is why we do not have a parsed representation that is
based on the access pattern and usage of the parsed document.
For example, the XSLT might use the document to retrieve three values from
one particular subtree, or process all the children at a particular depth
within a subtree.
Why not have another input to the parser, namely an abstract representation
of the access pattern, so that the in-memory document can be optimized for
that particular pattern [i.e. optimal in terms of access time and memory
usage]?
i.e.
XML file + Access-Metadata --> **XML Parser** --> Optimal Internal
Representation
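One way to picture the pipeline above is a streaming parse that is handed a set of element paths and keeps only what matches. This is a rough sketch, not an existing parser feature: the function parse_with_access_pattern, the path-set form of the "access metadata" and the sample catalog are all assumptions; only ET.iterparse and Element.clear are real standard-library calls.

```python
import io
import xml.etree.ElementTree as ET


def parse_with_access_pattern(source, wanted_paths):
    """Stream the document and keep only the text of elements whose
    absolute path (e.g. "/catalog/book/title") is in wanted_paths."""
    kept = {path: [] for path in wanted_paths}
    stack = []  # tags of the elements that are currently open
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            stack.append(elem.tag)
            continue
        # "end" event: the element is complete, so its text is available.
        path = "/" + "/".join(stack)
        stack.pop()
        if path in kept:
            kept[path].append((elem.text or "").strip())
        # Drop the element's content once it has been inspected, so the
        # full tree is never held in memory.
        elem.clear()
    return kept


# Hypothetical document and access metadata, invented for illustration.
doc = (b"<catalog><book><title>A</title><price>10</price></book>"
       b"<book><title>B</title><price>12</price></book></catalog>")
titles = parse_with_access_pattern(io.BytesIO(doc), {"/catalog/book/title"})
```

Here the result holds only the titles; the price elements are parsed but never retained.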
The DOM internal representation is fundamentally just one particular way of
"packing" the XML. This form of packing may not be beneficial for certain
use cases. Why not think outside the box and come up with a different sort
of packing that places all the required nodes "close" to each other, to
facilitate fast traversal, and that perhaps lowers memory usage because the
parser generates only a partial document containing just what is required?
For example, one particular scenario might be the "inversion" of the XML
structure [I am just choosing this ad hoc]: the nodes that would normally be
the leaves of the parsed tree appear as the top-level elements of the parsed
representation, and each of them carries a reference [in the form of some
attribute or something along those lines] to its parent, very similar to
viewing an n-ary tree reversed. Another form of packing could be a
"cube"-like packing, where we build a multi-dimensional data structure based
on the structure of the XML content. The cube can be accessed from all six
of its faces, which might correspond to the principally accessed members
within the document. These are only a subset of the possible structures that
could be generated as a result of parsing the XML. Each of these structures
has its own definite traversal patterns and costs.
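To make the "inverted" packing a little more concrete, here is a small sketch under my own assumptions: the Node class and the invert() and path_to_root() helpers are hypothetical names, and the tree is first parsed normally with xml.etree.ElementTree before being inverted, which a real access-aware parser would presumably avoid.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    tag: str
    text: str
    parent: Optional["Node"]  # reference back towards the root


def invert(root):
    """Return the leaves as the top-level entries, each linked to its parent."""
    leaves = []

    def visit(elem, parent):
        node = Node(elem.tag, (elem.text or "").strip(), parent)
        children = list(elem)
        if not children:
            leaves.append(node)
        for child in children:
            visit(child, node)

    visit(root, None)
    return leaves


def path_to_root(node):
    """Walk the parent references from a leaf up to the root."""
    tags = []
    while node is not None:
        tags.append(node.tag)
        node = node.parent
    return tags


# Hypothetical document, invented for illustration.
leaves = invert(ET.fromstring("<a><b><c>1</c></b><d>2</d></a>"))
```

With this layout, reaching a leaf value costs one list lookup, while reaching an ancestor costs one step per level, which is the reverse of the usual top-down tree.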
This might seem a very vague idea, but it would be good if somebody could
build on it and improve it.
rgds,
Ram Menon