   RE: [xml-dev] Is HTML structured or unstructured information?


Namaste, Mukul ji.

I'm a fast typist, but otherwise, only a philosopher and clown.

HTML is structured by definition.  The POV of the 
unstructured search engine varies tag by tag, node 
by node.  

The content of text nodes inside hs, ths, etc., makes a lot 
of difference to an 'unstructured' search engine, or at 
least in the lion, gazelle and hunter model, those are 
the easiest lions to pick off before diving into the ps 
and the attributes of imgs.  Then there are the URIs 
themselves, but these can be controversial in some hunts.
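
To make the tag-by-tag point concrete, here is a minimal sketch of 
how a text harvester might favor headings and table headers over 
paragraph text and img attributes.  The weights and the names 
TAG_WEIGHTS, IMG_ALT_WEIGHT and rank_text are made up for 
illustration; no real search engine is claimed to work this way.

```python
from html.parser import HTMLParser

# Hypothetical weights: headings and table headers count most, paragraphs least.
TAG_WEIGHTS = {"h1": 5.0, "h2": 4.0, "h3": 3.0, "th": 3.0, "p": 1.0}
IMG_ALT_WEIGHT = 0.5  # assumed weight for the alt attribute of an img


class WeightedTextExtractor(HTMLParser):
    """Collects (weight, text) pairs according to the enclosing tag."""

    def __init__(self):
        super().__init__()
        self.stack = []  # currently open tags that carry a weight
        self.hits = []   # (weight, text) pairs in document order

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            alt = dict(attrs).get("alt")
            if alt:
                self.hits.append((IMG_ALT_WEIGHT, alt.strip()))
            return  # img is a void element, nothing to push
        if tag in TAG_WEIGHTS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Pop back to the matching open tag, tolerating sloppy nesting.
            while self.stack and self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text and self.stack:
            self.hits.append((TAG_WEIGHTS[self.stack[-1]], text))


def rank_text(html):
    """Return the harvested text, highest weight first (stable for ties)."""
    parser = WeightedTextExtractor()
    parser.feed(html)
    parser.close()
    return sorted(parser.hits, key=lambda hit: -hit[0])
```

Text outside the weighted tags is simply ignored here, which is one 
way of saying the structure decides what the 'unstructured' engine 
sees first.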

The metas are gold if used well, but otherwise, just 
annotations.   The more the author annotates, the 
easier it is to clarify intent but not necessarily to 
establish truth.  For that reason, unstructured 
search systems are problematic the more a decision 
relies on them in real time or in a quantum field 
(the problem of a particle in superposition).
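
As a rough illustration of metas as annotations, the sketch below 
just collects name/content pairs from meta elements.  The names 
MetaCollector and collect_meta are hypothetical, and nothing here 
checks whether what the author declared is true; it only records 
stated intent.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Gathers name/content pairs from meta elements as author annotations."""

    def __init__(self):
        super().__init__()
        self.annotations = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        # Skip metas without both parts, e.g. <meta charset="utf-8">.
        if name and content:
            self.annotations[name.lower()] = content


def collect_meta(html):
    """Return the declared annotations; they state intent, not truth."""
    collector = MetaCollector()
    collector.feed(html)
    collector.close()
    return collector.annotations
```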


-----Original Message-----
From: Mukul Gandhi [mailto:mukul_gandhi@yahoo.com]

Hi Len,
  I am not as knowledgeable as you are. But IMHO:

  1) HTML is a language for displaying the content of a
web page. As with any content, HTML can be treated as
unstructured information. 

  2) On the other hand, I think, since HTML is a
vocabulary for Web GUI, it can be treated as

  3) Since HTML is also a computer language, it can be
treated as such by many people.

  4) In my experience, relational data can be treated
as structured information since relational data is
handled by SQL.

Just my opinion!



Copyright 2001 XML.org. This site is hosted by OASIS