   Re: XML tools and big documents

  • From: "Michael Kay" <M.H.Kay@eng.icl.co.uk>
  • To: <xml-dev@ic.ac.uk>
  • Date: Fri, 4 Sep 1998 11:59:51 +0100


>To this end, I have been (in such spare time as i have) tinkering
>about with Mr. Clark's XP API (com.jclark.xml.tok, mostly) to write an
>application that will allow me to attach the logical element structure
>to offsets in the storage entity, so that I can consider the logical
>structure's relationship to points in the text without reparsing the
>document

I think we're all looking for a solution to the same problem: a
document larger than 1Mb is too big to parse every time we want
to look at it, but storing the fine-grained DOM representation
has the opposite problem: it takes too much space, and it takes
too long to reassemble a reasonable unit such as a page.
Indexing the original serial XML (say at "chapter" level) is
one solution; it's essentially equivalent to my approach, which
has been to split the original XML at the same level and store
the "chapters" as separate linked XML documents.
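
As a rough illustration of the indexing idea, here is a minimal sketch in
Python (not the XP API mentioned in the quoted message). It records the byte
offsets of every "chapter" element so that a single chapter can later be
sliced out of the stored file without reparsing the rest. The function name
index_elements and the element name "chapter" are assumptions made for the
example, and the input is assumed to be UTF-8 bytes.

```python
import xml.parsers.expat


def index_elements(data: bytes, tag: str = "chapter"):
    """Return (start, end) byte offsets of each `tag` element in `data`.

    Assumes `data` is UTF-8 encoded XML. `end` is exclusive, so
    data[start:end] is the complete serialized element.
    """
    parser = xml.parsers.expat.ParserCreate()
    spans = []        # completed (start, end) pairs, in closing order
    open_starts = []  # stack of start offsets for elements still open

    def on_start(name, attrs):
        if name == tag:
            # Inside the handler, CurrentByteIndex is the offset of the
            # '<' that opens this start tag.
            open_starts.append(parser.CurrentByteIndex)

    def on_end(name):
        if name == tag:
            begin = open_starts.pop()
            # CurrentByteIndex is the offset of the '<' of the end tag;
            # scan forward to its closing '>'. (For an empty-element tag
            # whose attribute value contains '>', this scan would stop
            # too early; the sketch ignores that case.)
            close = data.index(b">", parser.CurrentByteIndex) + 1
            spans.append((begin, close))

    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.Parse(data, True)
    return spans
```

The offsets could be stored alongside the document; fetching a chapter is
then a seek and a read of (end - start) bytes, followed by a parse of just
that fragment.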

What I mean by "chapter" is typically 1-10Kb, or
alternatively, a chunk of text such that the user doesn't
mind pressing "Next" when he's got to the end of it.
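
The splitting approach could be sketched along the following lines, again in
Python. This is only an illustration under assumptions: the "chapters" are
taken to be direct children of the root element, the file names
(chapter1.xml, ...) are made up, and the prev/next attributes are a
hypothetical linking convention, not something prescribed here.

```python
import xml.etree.ElementTree as ET


def split_chapters(source: str, tag: str = "chapter"):
    """Split `source` into one small XML document per `tag` element.

    Returns a dict mapping a (hypothetical) file name to the serialized
    chapter. Each chapter gets `prev`/`next` attributes naming its
    neighbours, so a viewer can offer "Next" without loading the rest.
    """
    root = ET.fromstring(source)
    chapters = root.findall(tag)  # direct children of the root only
    docs = {}
    for i, chapter in enumerate(chapters):
        if i > 0:
            chapter.set("prev", f"chapter{i}.xml")
        if i < len(chapters) - 1:
            chapter.set("next", f"chapter{i + 2}.xml")
        # Drop text that follows the element in the parent document,
        # otherwise tostring() would append it after the end tag.
        chapter.tail = None
        docs[f"chapter{i + 1}.xml"] = ET.tostring(chapter, encoding="unicode")
    return docs
```

A chunk that comes out much larger than the size range given above could be
split again at a lower level (a "section", say) using the same function with
a different tag.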

Mike Kay


xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)





 
