Christian Nentwich wrote:
> Let me further this off-topic discussion by pasting my definition of
> semi-structured and unstructured from my PhD thesis (no
> self-aggrandizing going on here... it just so happens that I was working
> in this area). HTML is disqualified from being structured under my
> "In our research we have focused on documents that contain structured
> or semi-structured data. That means that information in these documents
> is sufficiently grouped, or structured, to enable us to identify fragments
> of information and refer to them. This type of information encompasses
> the usual "text" forms of semi-structured, marked up documents like XML
> [Bray et al., 2000], database tables, source code files, and so on.
How does this disqualify HTML (or did you mis-type?)?
What is an HTML HEAD or BODY? I'd say they are part of the structure of
an HTML document (no matter what a browser may render).
and sure, people can mark up a list like:
<P>* item one</p>
<div>* item two</DIV>
but that is not how it is /supposed/ to be...
I'm sure many on this list will say so what, but so what - we are
talking about the defined structure of HTML.
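
To make the point concrete, here is a rough sketch (mine, not anything from the thesis) that uses Python's standard html.parser module to list the elements in each kind of markup. The names TagCollector, start_tags, has_list_structure, AD_HOC and INTENDED are made up for illustration. It only checks which start tags appear, so it is a crude test and not a validator.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Record the name of every start tag, in document order."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser hands us tag names already lowercased.
        self.tags.append(tag)


def start_tags(markup):
    """Return the list of start-tag names found in markup."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


def has_list_structure(markup):
    """Crude check: does the markup use real list elements?"""
    tags = start_tags(markup)
    return "li" in tags and ("ul" in tags or "ol" in tags)


# The ad hoc "list" from the message above.
AD_HOC = "<P>* item one</p>\n<div>* item two</DIV>"

# The same content using the list elements HTML defines.
INTENDED = "<ul><li>item one</li><li>item two</li></ul>"

if __name__ == "__main__":
    print(start_tags(AD_HOC), has_list_structure(AD_HOC))
    print(start_tags(INTENDED), has_list_structure(INTENDED))
```

Both fragments parse, but only the second one exposes the list as something a program can identify and refer to, which is the kind of structure the quoted definition asks for.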