   RE: [xml-dev] Ode to the Sem-Web

 You want silliness? :)
[len]  Coals to Newcastle, Andrew.

The following "poetic gem" is said to have been discovered at the base of the piles on which the Tay Bridge was built.
[len]  An excellent bit of doggerel.  Compliments.  But seriously, why are you down on SemWeb?
1.  They are still working out the technology.  This will take a while to do right at scale.
2. It is difficult to construct reliable, applicable ontologies.   It is tedious work to sort out class relationships 
and get these defined in such a way that all, or even many, experts on the subject will accept them.  In HumanML, 
after a year of research, we ended up with a Primary Base of abstract types, most of which have definitions but 
few inheritable properties.  Why?  The domain of semiotics has lots of theories, two main camps (Saussurean 
and Peircean), and the differences range from mundane to profound, although overall they agree on the categories 
of properties that form the context for and affect human communication.   So as good markup weenies, on the 
first pass, all we can safely declare is categories.  After that, secondary derived schemas have to handle the 
details, each from the point of view of a particular camp.   What I am saying here is that it takes a lot
of expertise to do a good ontology, so someone has to become expert at sorting out the bad ones.
3.   It won't grow fast without automated harvesting tools. You see the problem, of course.   It takes lots 
of work to get good markup structures for any one category in one domain, and as the autopoiesis experts 
point out, the structure tends to preserve itself regardless of the environmental pressure.   So for the SemWeb,
until the harvesting tools are working and delivering acceptable results, the data grows at a snail's pace.  Metadata
built by hand doesn't have the same payoffs as HTML built by hand.   HTML doesn't require assertion consistency. 
Very different games.
Still, we know that inferencing engines work tolerably well as long as we are not religiously or superstitiously 
attached to the results.   As long as we know how to frame questions well and completely understand that 
our framing of the question determines the results, we can use these tools and the semweb data for productive 
tasks.   If we, on the other hand, use the results like political wonks use polling results, we are likely to get 
into trouble (The Golem Problem).  I don't say that won't happen; it will.   But when it comes time to take 
after the monster, pitchforks aloft and torches blazing, we have to remember that the user of the information 
is the bad guy, not the technology.   My experience is that for every bad guy, there are at least ten good guys 
doing productive and creative work with that technology and those ontologies.  And in the Spy Vs Spy world,
telling good from bad isn't as easy as one might think.
Remember the Third Reich archaeologists.  They were scientists expecting to be taken seriously.  Outside 
the Third Reich, they were mostly laughed at by their peers.   On the other hand, one has the Japanese 
researchers in bio-warfare who, after the war, went on the payrolls of many large nations and retired with honor 
and pensions.   Applications are usually as good or bad as the hands that apply them, the uses they are
put to, and whom they are used for.
Knowledge is defined as the ability to select the right choice given a set of equally probable choices. 
Seek to reduce Boltzmann entropy in the world of human understanding.  Choose wisely. 


