OASIS Mailing List Archives

   RE: [xml-dev] Is HTML structured or unstructured information?


I agree.  The blobs, the texts, the otherwise NOTATIONs, 
are the challenge to search systems that mine.  By 
releasing that code, IBM advances the practice.  

Markup, as the name alludes to, is not about normalization 
although from the computer's point of view, that is a 
good thing to achieve.  Should one mark up only to 
the advantage of the computer?  It's an easy trap to 
fall into.

From one point of view, it is about capturing the 
structure and meaningful names organically present.  
In our younger days when some of us talked about using 
markup for modeling, some thought it a means to create 
a model, others a means to capture one.  Practically, 
XML doesn't care, but historically, I think the latter 
approach dominated until XML made markup a favorite 
among computer scientists.  

But... if one tries to 'fix' an organically derived model, 
one may lose some of the meaning.  So even as we 
proselytize good practices for structuring, we come 
back to naming as the practice to cultivate because 
idioms are meaningful.

Yes?  No?


From: Peter Hunsberger [mailto:peter.hunsberger@gmail.com]


> So while a relational database may be a rigorous
> example for structure, I don't think that is what
> the article cited is really about.  If one is looking
> for patterns, relationships, intentions, meaning, is
> it easier to get that from a relational database or
> from an XML instance?  It depends, in my opinion,
> on how predigested the content is, not the structure
> although the structure is useful for finding an
> answer where one already knows the question and
> the structure.  Questions are harder.

It seems to me that since both relational DBs and XML allow "escape to
BLOB", either of them can have as much or as little structure as one
cares to enforce.  IMO the largest difference is the larger body of
best practices for building up good structure in relational DBs.  E.g.,
normalization theory for XML is still relatively immature by



Copyright 2001 XML.org. This site is hosted by OASIS