OASIS Mailing List Archives

   Re: [xml-dev] Is HTML structured or unstructured information?


Bullard, Claude L (Len) wrote:
> ...
> HTML is the example many think they understand. 
> HTML is not just a presentational vocabulary.
> META tags, for example, are not presentational.
> FORM tags aren't strictly presentational.  Even 
> DIVs aren't strictly presentational.  In fact, 
> almost any tag has aspects of presentation and 
> content (note I am not using the term 'semantic' 
> here because presentation is a semantic). The 
> principle 'separation of presentation and content' 
> is flaky in practice.
> ...

hi, Claude

IMHO, presentation is not semantic: "semantic" applies to terms that
mean something. You will say that "<b>" and "<i>" mean "bold" and
"italic", but since the meaning of a tag applies to its content, you
can't say that:
"my name is <b>Philippe Poulard</b>" has a different meaning from:
"my name is Philippe Poulard";
thus, "<b>" and "<i>" are semantically transparent.
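To make "semantically transparent" concrete, here is a minimal sketch in Python using the standard html.parser module. The TextOnly class and the text_content function are hypothetical names introduced only for this illustration; the sketch assumes that "meaning" can be approximated by the character data left once all tags are ignored.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Hypothetical helper: keeps character data and ignores every tag."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Called for text between tags; the tags themselves are dropped.
        self.parts.append(data)


def text_content(markup):
    """Return the text of `markup` with all tags removed."""
    parser = TextOnly()
    parser.feed(markup)
    parser.close()
    return "".join(parser.parts)


with_bold = "my name is <b>Philippe Poulard</b>"
without_bold = "my name is Philippe Poulard"

# Both strings carry the same text once <b> is ignored.
same = text_content(with_bold) == text_content(without_bold)
```

This only shows that dropping the tag loses no text; whether it loses no meaning is the point being argued.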

<b> has no semantics; it is just stylistic information for a formatter
<i> has no semantics; it is just stylistic information for a formatter
<title> has semantics: a formatter also uses it as stylistic
information, but other tools (an indexer, for example) may give it more
weight than the rest (which is treated almost as plain text)
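As a rough sketch of how an indexer might treat <title> differently from the rest, the Python below counts words and gives those inside <title> a higher weight. The class name WeightedIndexer, the weight of 3 and the sample page are all assumptions made for the illustration, not a description of any real indexer.

```python
from collections import Counter
from html.parser import HTMLParser

TITLE_WEIGHT = 3  # assumption: arbitrary boost for words found in <title>


class WeightedIndexer(HTMLParser):
    """Hypothetical indexer: <title> words count more; <b>, <i> are ignored."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.scores = Counter()

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        # Every other tag is transparent: its text gets the default weight.
        weight = TITLE_WEIGHT if self.in_title else 1
        for word in data.lower().split():
            self.scores[word] += weight


page = (
    "<html><head><title>XML markup</title></head>"
    "<body><p>markup is <b>structured</b></p></body></html>"
)
indexer = WeightedIndexer()
indexer.feed(page)
indexer.close()
scores = indexer.scores
```

Here "structured" scores the same with or without the <b> around it, while "markup" is boosted because it also appears in the title.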

(X)HTML is 90% non-semantic

DocBook defines more abstract presentational elements that are not
semantic, but it also defines more semantic structures (for example,
things related to authors)

(X)HTML is structured, just as an RDBMS is

A table in an RDBMS is structured, and usually semantic, but that is not
required: one could design a table named "paragraph" that contains
one column "normal" and one column "bold", to express the same thing as
above:
|        paragraph          |
|normal    |bold            |
|my name is|Philippe Poulard|
but it is certainly a bad idea :)
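For illustration, the "paragraph" table can be sketched with Python's standard sqlite3 module and an in-memory database. The column types and the way the sentence is reassembled are assumptions; only the table name, column names and values come from the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Structured, but the columns name a style, not a meaning.
conn.execute("CREATE TABLE paragraph (normal TEXT, bold TEXT)")
conn.execute(
    "INSERT INTO paragraph (normal, bold) VALUES (?, ?)",
    ("my name is", "Philippe Poulard"),
)

normal, bold = conn.execute("SELECT normal, bold FROM paragraph").fetchone()

# Joining the columns gives back the sentence; nothing in the schema
# says that the second column holds a person's name.
sentence = f"{normal} {bold}"
conn.close()
```

The table is perfectly well structured, yet a query cannot ask for "the name" without knowing by convention that it happens to be stored in the "bold" column.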

the real question is not about "structured or unstructured" information,
because by definition markup languages ARE structured, but rather about
"semantic or not semantic": XML, as well as an RDBMS, can structure both
semantic and non-semantic information


           (. .)
|   Philippe Poulard    |



Copyright 2001 XML.org. This site is hosted by OASIS