OASIS Mailing List Archives





   Re: [xml-dev] Is HTML structured or unstructured information?


Peter Hunsberger wrote:
> On 8/11/05, Philippe Poulard <Philippe.Poulard@sophia.inria.fr> wrote:
>>Bullard, Claude L (Len) wrote:
>>>HTML is the example many think they understand.
>>>HTML is not just a presentational vocabulary.
>>>META tags, for example, are not presentational.
>>>FORM tags aren't strictly presentational.  Even
>>>DIVs aren't strictly presentational.  In fact,
>>>almost any tag has aspects of presentation and
>>>content (note I am not using the term 'semantic'
>>>here because presentation is a semantic). The
>>>principle 'separation of presentation and content'
>>>is flaky in practice.
>>hi, Claude
>>IMHO, presentation is not semantic: "semantic" is used for terms that
>>mean something. You will say that "<b>" and "<i>" mean "bold" and
>>"italic", but since the meaning of a tag applies to its content, you can't
>>say that
>>"my name is <b>Philippe Poulard</b>" does not have the same meaning as
>>"my name is Philippe Poulard";
> Sure you can: the code that decides how to display the two will most
> likely decide they have different meaning.... Semantics aren't just
> for humans any more!
> <snip>conclusions that follow from the assumption that only humans
> care about the meaning of tags</snip>

ok, let's add more information: instead of <b>, let's use <xhtml:b> (with the
right namespace declaration). one can decide that a plain <b> is used for
the name of a person (why not, even if it is certainly a bad choice), but
one can't decide the same for <xhtml:b>, because it really stands for
"bold" and nothing else

code that respects the standards can no longer decide for itself what
<xhtml:b> is for
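
To illustrate the namespace argument, here is a minimal sketch using Python's standard xml.etree.ElementTree. The sample documents and the helper name `describe` are made up for the illustration. It only shows that a parser reports the namespace-qualified element under a name tied to the XHTML namespace URI, while the bare <b> carries no such tie:

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical sample documents: one with a bare <b>, one with <xhtml:b>.
plain = ET.fromstring("<p>my name is <b>Philippe Poulard</b></p>")
qualified = ET.fromstring(
    '<p xmlns:xhtml="http://www.w3.org/1999/xhtml">'
    "my name is <xhtml:b>Philippe Poulard</xhtml:b></p>"
)


def describe(element):
    """Return (namespace URI or None, local name) for an element."""
    tag = element.tag
    if tag.startswith("{"):
        # ElementTree writes qualified names as "{namespace-uri}local".
        namespace, local = tag[1:].split("}", 1)
        return namespace, local
    return None, tag


plain_b = describe(plain[0])          # no namespace: meaning is up to the reader
qualified_b = describe(qualified[0])  # bound to the XHTML vocabulary

print(plain_b)
print(qualified_b)
```

Both elements have the local name "b", but only the second one is identified as belonging to the XHTML vocabulary, so an application that honours the namespace has no freedom to reinterpret it.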

>>the real question is not about "structured or unstructured" information,
>>because by definition markup languages ARE structured, but rather about
>>"semantic or not semantic": XML as well as an RDBMS may structure both
>>semantic and non-semantic information
> Respectfully disagree: structure and semantics are in the eye of the
> beholder: tell me is a blob of XML stored in a RDB structured or not? 
> Does the same blob have any semantic meaning? What if the RDB can
> parse the blob into a SOAP descriptor? What if it used a grammar
> stored in another blob to do so?

if the semantics attached to a blob in an RDBMS say that it is XML,
then it is structured (whether this structure is easily accessible or
not is another story)

-what about reading an XML file as binary data?
-what about reading the files in which your RDBMS stores its tables,
in a vendor-dependent binary format?

if you ignore the structure, you won't have structured data
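
A small sketch of that last point, again with Python's standard library and a made-up sample blob (the element names `person` and `name` are assumptions for the example): the same bytes are either an opaque sequence or a tree, depending only on whether the reader chooses to interpret them as XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical blob, as it might sit in a BLOB column or a file on disk.
blob = b"<person><name>Philippe Poulard</name></person>"

# Reader 1 ignores the structure: all it can do is count or slice bytes.
byte_count = len(blob)
first_bytes = blob[:8]

# Reader 2 knows the blob is XML and parses it: the structure is now usable.
root = ET.fromstring(blob)
name = root.find("name").text

print(byte_count, first_bytes)
print(root.tag, name)
```

Nothing in the data changed between the two readings; only the second reader gets structured data out of it.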


           (. .)
|   Philippe Poulard    |



Copyright 2001 XML.org. This site is hosted by OASIS