OASIS Mailing List Archives


   Re: [xml-dev] Reality check needed ....


8/7/2002 8:49:48 AM, "Thomas B. Passin" <tpassin@comcast.net> wrote:


>But file extensions are __very helpful__ for humans, just like
>human-readable element names are.

Back to the original question, "why might Microsoft think that
XML database technologies could help people find things on their
personal hard drives more effectively," this discussion of filenames
suggests a few things.

- Filesystems are hierarchical, and XPath was both designed to
  query hierarchies and modelled on the filesystem paradigm.
  Take a query such as "the HTML file that is in a directory
  called 'samples' somewhere under 'Program Files' that I
  modified in June 2002."  XPath could handle the "'samples'
  directory somewhere under 'Program Files'" part much better
  than SQL could, AFAIK.

- File content is becoming more XML-like.  If the system indexed
  HTML after parsing it into a well-formed tree, you could use XPath
  to find content within tables, or div tags, or other structuring
  mechanisms, which would be difficult with SQL or full-text search.

- "Real" XML is becoming more pervasive.  Presumably the XML formats
  of OpenOffice/StarOffice and (maybe) Office 11 files
  could be exploited to find "the section labelled
  'Afghanistan' within the section labelled 'Wars' containing
  the phrase 'helicopter crash'" or whatever.
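
To make the first point concrete, here is a rough Python sketch. It
assumes the directory tree has already been exposed as XML; the
dir/file element names, the "name" and "modified" attributes, and the
sample tree are all made up for illustration. The standard library's
ElementTree only supports a subset of XPath, so the "any depth" part
is done in the path expression and the extension and date tests are
done in plain Python.

```python
import xml.etree.ElementTree as ET

# Hypothetical snapshot of a directory tree expressed as XML.
FS_XML = """
<dir name="C:">
  <dir name="Program Files">
    <dir name="Acme">
      <dir name="samples">
        <file name="intro.html" modified="2002-06-14"/>
        <file name="notes.txt" modified="2002-06-20"/>
        <file name="old.html" modified="2001-11-02"/>
      </dir>
    </dir>
  </dir>
  <dir name="samples">
    <file name="stray.html" modified="2002-06-01"/>
  </dir>
</dir>
"""

def find_samples_html(root, month_prefix):
    """HTML files in a 'samples' directory anywhere under 'Program Files',
    modified in the month given as a 'YYYY-MM' prefix."""
    # '//' means "at any depth", which is the part SQL finds awkward.
    path = ".//dir[@name='Program Files']//dir[@name='samples']/file"
    return [
        f.get("name")
        for f in root.findall(path)
        if f.get("name").endswith(".html")
        and f.get("modified").startswith(month_prefix)
    ]

root = ET.fromstring(FS_XML)
matches = find_samples_html(root, "2002-06")
print(matches)
```

The 'samples' directory that sits outside 'Program Files' is skipped
because the path expression anchors the search under that directory.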

Back to the file extensions: the OS could keep track of metadata to
"know" that a particular file is XHTML, or SVG, or XSLT, irrespective
of the extension (by keeping track of which application edited the
file last).  3rd-party indexers could do the same thing by
sniffing for namespaces or validating or whatever.
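
Namespace sniffing could look roughly like this in Python. It is only
a sketch: the function name and the lookup table are mine, and it
looks at nothing but the namespace of the root element, which is one
of several ways an indexer might guess the type.

```python
import io
import xml.etree.ElementTree as ET

# Namespace URIs defined by the W3C for these vocabularies.
KNOWN_NAMESPACES = {
    "http://www.w3.org/1999/xhtml": "XHTML",
    "http://www.w3.org/2000/svg": "SVG",
    "http://www.w3.org/1999/XSL/Transform": "XSLT",
}

def sniff_xml_type(stream):
    """Return a type label from the root element's namespace, or None
    if the input is not XML or the namespace is not recognised."""
    try:
        # Only the first start tag (the root element) is examined.
        for _event, elem in ET.iterparse(stream, events=("start",)):
            tag = elem.tag  # ElementTree spells it '{namespace-uri}localname'
            if tag.startswith("{"):
                uri = tag[1:].split("}", 1)[0]
                return KNOWN_NAMESPACES.get(uri)
            return None
    except ET.ParseError:
        return None
    return None

# The file name never enters into it, so a misleading extension is harmless.
svg_bytes = b'<svg xmlns="http://www.w3.org/2000/svg"/>'
print(sniff_xml_type(io.BytesIO(svg_bytes)))
```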

Anyway, this is starting to make sense ... the OS or a 3rd-party filesystem
indexer has a combination of information about a file: its metadata (mod
date, size, owner), its content (type inferred somehow, possibly its
hierarchical internal structure), and its position in the filesystem
hierarchy.  Querying all that hierarchical data and metadata simultaneously
DOES sound like a job for XQuery, or SQL+XPath, or XPath+a join mechanism,
or whatever.
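
As a toy version of the "XPath + a join mechanism" option, the Python
sketch below filters on position in the hierarchy, on modification
date, and on internal structure in a single pass. Everything in it is
hypothetical: the IndexedFile record, the section/title document
format, and the sample paths stand in for whatever a real indexer
would store.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from datetime import date
from pathlib import PurePosixPath

@dataclass
class IndexedFile:
    path: PurePosixPath   # position in the filesystem hierarchy
    modified: date        # metadata
    content: ET.Element   # parsed internal structure

def query(index, under, modified_since, content_xpath, phrase):
    """Paths of files below a directory named `under`, modified on or after
    `modified_since`, with a node matching `content_xpath` containing `phrase`."""
    hits = []
    for entry in index:
        if under not in entry.path.parts[:-1]:
            continue
        if entry.modified < modified_since:
            continue
        for node in entry.content.findall(content_xpath):
            if phrase in "".join(node.itertext()):
                hits.append(str(entry.path))
                break
    return hits

WARS = ET.fromstring(
    "<doc><section title='Wars'><section title='Afghanistan'>"
    "<p>Reports of a helicopter crash.</p></section></section></doc>"
)
TRAVEL = ET.fromstring(
    "<doc><section title='Travel'><section title='Afghanistan'>"
    "<p>Reports of a helicopter crash.</p></section></section></doc>"
)

index = [
    IndexedFile(PurePosixPath("/home/me/reports/wars.xml"), date(2002, 7, 30), WARS),
    IndexedFile(PurePosixPath("/home/me/reports/old.xml"), date(2001, 1, 5), WARS),
    IndexedFile(PurePosixPath("/home/me/other/wars.xml"), date(2002, 8, 1), WARS),
    IndexedFile(PurePosixPath("/home/me/reports/travel.xml"), date(2002, 8, 1), TRAVEL),
]

XPATH = ".//section[@title='Wars']//section[@title='Afghanistan']"
hits = query(index, "reports", date(2002, 6, 1), XPATH, "helicopter crash")
print(hits)
```

A real system would presumably push all three conditions into one
query language instead of looping in application code, which is where
something like XQuery comes in.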


  






 


Copyright 2001 XML.org. This site is hosted by OASIS