   Indexing of XML documents

  • From: Peter@ursus.demon.co.uk (Peter Murray-Rust)
  • To: xml-dev@ic.ac.uk
  • Date: Fri, 14 Mar 1997 23:19:46 GMT

I hope I can express this problem clearly - I'm sure that you are
familiar with it.

When we need to resolve a TEI pointer like (id a23) we may have to scan
the whole document.  In general we will wish to cache (index) IDs since
we don't wish to rescan for another search.  One obvious place to do this
is when the document is first read in (admittedly there may never be a need
to scan the whole document).
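The caching idea above could be sketched roughly as follows. This is only an illustration, not a proposal for any particular parser: it uses Python's standard xml.etree.ElementTree, which does not read attribute types from a DTD, so it simply assumes that an attribute named "id" plays the role of an ID. The function name build_id_index and the sample document are made up for the example.

```python
import xml.etree.ElementTree as ET

def build_id_index(root, id_attr="id"):
    """Walk the tree once and map each ID value to the first element carrying it.

    Assumption: the attribute called `id_attr` is the ID attribute; a real
    parser would take this from the DTD's attribute declarations.
    """
    index = {}
    for elem in root.iter():
        value = elem.get(id_attr)
        if value is not None and value not in index:
            index[value] = elem
    return index

# Hypothetical sample document.
doc = ET.fromstring('<text><p id="a22">one</p><p id="a23">two</p></text>')

ids = build_id_index(doc)   # one scan, done when the document is read in
target = ids.get("a23")     # later lookups such as (id a23) need no rescan
missing = ids.get("zz")     # unknown IDs simply come back as None
```

The cost is one full pass and a table whose size grows with the number of IDs, which is wasted if no pointer is ever resolved.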

When validating a document the IDs, GIs and ATTNAMEs all have to be scanned
since they occur in VCs.  Presumably, as a by-product of validation, we can
at least expect a hashtable of IDs (and possibly GIs).

The question is, should we do both of these by default (or even others
that I haven't thought of)?  Or should we do none and leave it to the app?
Or should the parser have a switch?
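One way to picture the "switch" option is a wrapper that takes the indexing policy as a parameter. The sketch below is hypothetical throughout: the class IndexedDocument, the parameter index_ids and its three values ("eager", "lazy", "off") are invented for illustration, and an attribute named "id" is again assumed to be the ID attribute.

```python
import xml.etree.ElementTree as ET

class IndexedDocument:
    """Hypothetical wrapper showing three ID-indexing policies.

    "eager": build the ID table when the document is read in
    "lazy":  build it on the first lookup, then reuse it
    "off":   keep no table; every lookup rescans (left to the application)
    """

    def __init__(self, xml_text, index_ids="lazy", id_attr="id"):
        if index_ids not in ("eager", "lazy", "off"):
            raise ValueError("index_ids must be 'eager', 'lazy' or 'off'")
        self.root = ET.fromstring(xml_text)
        self._mode = index_ids
        self._id_attr = id_attr
        self._index = self._scan() if index_ids == "eager" else None

    def _scan(self):
        # Single pass over the tree; the first element with a given ID wins.
        index = {}
        for elem in self.root.iter():
            value = elem.get(self._id_attr)
            if value is not None:
                index.setdefault(value, elem)
        return index

    def find_id(self, value):
        if self._mode == "off":
            # No cache: scan until the first match is found.
            for elem in self.root.iter():
                if elem.get(self._id_attr) == value:
                    return elem
            return None
        if self._index is None:      # lazy mode, first lookup
            self._index = self._scan()
        return self._index.get(value)

sample = '<text><p id="a22">one</p><p id="a23">two</p></text>'
results = {mode: IndexedDocument(sample, index_ids=mode).find_id("a23").text
           for mode in ("eager", "lazy", "off")}
```

All three policies return the same element; they differ only in when (and whether) the scan is paid for, which is the trade-off the question is about.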


[BTW a WF document can have multiple identical IDs, OK?  Presumably the
behaviour of an app that has to reference them is 'undefined'?]
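To make the duplicate-ID case concrete, here is a small sketch in which a non-validating parse accepts repeated IDs and the application is left to notice them. As before, it assumes an attribute named "id" stands in for a declared ID, and the helper names collect_ids and duplicate_ids are invented for the example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

def collect_ids(root, id_attr="id"):
    """Map each ID value to every element that carries it, in document order."""
    found = defaultdict(list)
    for elem in root.iter():
        value = elem.get(id_attr)
        if value is not None:
            found[value].append(elem)
    return found

def duplicate_ids(found):
    """Return the ID values that occur on more than one element."""
    return sorted(value for value, elems in found.items() if len(elems) > 1)

# Well-formed, but the same ID value appears twice (hypothetical sample).
doc = ET.fromstring(
    '<text><p id="a23">first</p><p id="a23">second</p><p id="b1"/></text>'
)

found = collect_ids(doc)
dups = duplicate_ids(found)
```

Keeping a list per ID, rather than a single element, at least lets the application choose its own rule (first match, last match, or report an error) instead of silently inheriting whatever the index happened to store.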

Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences

xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo@ic.ac.uk the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa@ic.ac.uk)

