
   RE: [xml-dev] Is HTML structured or unstructured information?


Yes.  

All markup is a means to annotate implicit structure explicitly 
*for* the point of view of the semantic-laden processor, so unstructured 
to structured is a range of implicit to explicit labeling.   It is 
assumed that there is a POV processor that can use these labels in 
some meaningful way.  If you look at the UIMA architecture, that is 
precisely what it is designed to enable.  We'll leave POV evolution 
to another discussion which Roger Costello would enjoy (attractors).

Words don't have meaning.  People do. (old linguistics saying).
Tags don't have meaning.  Processors do.  Humans included.
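To make "Tags don't have meaning. Processors do." concrete, here is a minimal sketch in Python. It takes the example sentence quoted further down in this thread and hands it to two processors. The wrapping <p> element, the function names, and the choice of ANSI escape codes for "bold" are all assumptions made for illustration; nothing here comes from UIMA or any other real architecture.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the sentence from this thread, wrapped in <p> so it is well-formed.
MARKUP = "<p>my name is <b>Philippe Poulard</b></p>"


def plain_text_processor(markup: str) -> str:
    """A processor with no use for the labels: it keeps only the text."""
    return "".join(ET.fromstring(markup).itertext())


def terminal_processor(markup: str) -> str:
    """A processor whose point of view maps <b> to ANSI bold on a terminal."""
    root = ET.fromstring(markup)
    parts = [root.text or ""]
    for child in root:
        text = "".join(child.itertext())
        if child.tag == "b":
            # The label only "does" something because this processor says so.
            text = "\033[1m" + text + "\033[0m"
        parts.append(text)
        parts.append(child.tail or "")
    return "".join(parts)


plain = plain_text_processor(MARKUP)
styled = terminal_processor(MARKUP)
```

Both functions read the same bytes; only the second one acts on the <b> label, which is the sense in which the labeling is explicit "for" a particular processor.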

The tricky part is determining the lifecycle of a fact within 
a scope of processing.  OTOH, that's what programmers do.

BTW and off topic:  did you know that in Berkeley California, 
programmers are not professionals?  They are technicians. 
Well, maybe not off topic because their POV for labeling your 
role in a process determines your compensation relative to 
other labeled entities.

len


From: Peter Hunsberger [mailto:peter.hunsberger@gmail.com]

On 8/11/05, Philippe Poulard <Philippe.Poulard@sophia.inria.fr> wrote:

<snip/>

> all along, I take care to avoid using phrases such as "<xhtml:b> has meaning"
> because, as I was saying in a previous post, "semantic is used for terms
> that mean something"
> I argue that "my name is <xhtml:b>Philippe Poulard</xhtml:b>" has the
> same meaning as "my name is Philippe Poulard"
> 
> to be coherent, I won't say that "<xhtml:b> means bold", I will say that
> "<xhtml:b> just stands for bold", because <xhtml:b> carries no meaning
> to its text data
> 
> the semantic applies on the content, not on the container : <author>
> can't be an author, it can only contain a text that corresponds to a
> person name that is (should be) an author

Why can't the semantic be applied to the container?  Does an
address on an envelope have no semantic meaning?  Why the shift in
context to determine what has "meaning"?

I think you just proved Len's point: the separation of presentation
from content is "tricky". Distinguishing between "means" and "stands
for" is pointless in my book (they both have the same meaning ;-)...

<snip/>

> semantics and structure are just conventions ; it is also the case in
> any natural language (which is not as natural as it seems)
> 
> you may find conventions at a world-wide level (standards)
> you may find conventions at a corporate level
> you may find conventions at an application level
> 
> structured and semantic information are just where one decides to apply
> them... by convention !
> 
> if you ignore one level of convention, you may lose structure or
> semantics : if you give me an access point to your well-designed
> database, and you omit to tell me that a given column contains a blob
> that is XML, I will find binary data (if I'm curious, I could try to
> parse all the blobs and may find XML)
> 

Exactly.  As I was saying; measures of semantic content or structure
are in the eye of the beholder...
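The blob example above can be sketched in a few lines of Python: a curious reader who was never told the convention simply tries to parse every blob and sees which ones turn out to be well-formed XML. The rows, the column contents, and the helper name looks_like_xml are hypothetical; a real database would supply the bytes through its own driver.

```python
import xml.etree.ElementTree as ET


def looks_like_xml(blob: bytes) -> bool:
    """Return True if the blob parses as a well-formed XML document."""
    try:
        ET.fromstring(blob)
        return True
    except ET.ParseError:
        return False


# Hypothetical (id, blob) rows standing in for an undocumented blob column.
rows = [
    (1, b"<author>Philippe Poulard</author>"),
    (2, b"\x89PNG\r\n\x1a\n"),          # binary data, not XML
    (3, b"plain text, no markup"),      # text, but not XML either
]

# The structure was always there; it becomes visible only to a reader who tries to parse it.
xml_row_ids = [row_id for row_id, blob in rows if looks_like_xml(blob)]
```

Well-formedness is only the first convention recovered this way; what <author> is supposed to mean is still up to whoever processes it.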




 

