Interesting set of concepts. Back to the old Enterprise view (Star Trek, that
is) of trying to understand brave new worlds. Different worlds, cultures and
beings. When the world was a much more innocent place (not really, of course).
On Sat, 30 Apr 2005 4:52 am, Bullard, Claude L (Len) wrote:
> If the value of indexing is expressed as the function of the
> density of objects in addressable space so that performance
> is inversely proportional to the space density (actually, the
> address space itself), XML vocabularies increase the
> density of the space as well as introducing ambiguity
> and uncertainty through semantic loading and can actually
> hurt the performance of the system. (yes|no ?)
No. There should be no such thing as a performance bottleneck in an enterprise
xml system. This tends to happen only in larger organisations.
A few years ago I contracted to a telco and worked on integration of payphones
into their central system.
The big surprise to me was that there were actually five databases that held
information about payphones in the land and it simply wasn't possible to do a
"select count(*) from payphones where status = 'Active'". Just a simple thing
but absolutely not possible.
It was possible to know how many 20c pieces were collected nationwide in a
week, but not how many phones were in service at any one time.
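
To show why that count is harder than it looks, here is a minimal sketch. It is
not the telco's actual setup: the table layout, the phone ids and the two
in-memory databases (standing in for the five) are all made up for
illustration. The point is only that three reasonable ways of counting give
three different answers once the systems overlap and disagree.

```python
import sqlite3

def make_db(rows):
    """Build an in-memory database holding one system's view (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE payphones (phone_id TEXT, status TEXT)")
    db.executemany("INSERT INTO payphones VALUES (?, ?)", rows)
    return db

def statuses(db):
    """Return {phone_id: status} as seen by one system."""
    return dict(db.execute("SELECT phone_id, status FROM payphones"))

def count_active(systems):
    """Count active phones three ways: naive sum, active anywhere, active and undisputed."""
    views = [statuses(db) for db in systems]
    # Adding up each system's own count double-counts shared phones.
    naive = sum(1 for v in views for s in v.values() if s == "Active")
    # Phones that at least one system calls active.
    any_active = {pid for v in views for pid, s in v.items() if s == "Active"}
    # Phones that at least one system says are not active.
    disputed = {pid for v in views for pid, s in v.items() if s != "Active"}
    return naive, len(any_active), len(any_active - disputed)

# Made-up data: the two systems overlap on P1 and P2 and disagree about P2.
billing = make_db([("P1", "Active"), ("P2", "Active")])
maintenance = make_db([("P1", "Active"), ("P2", "Removed"), ("P3", "Active")])

print(count_active([billing, maintenance]))
```

Which of the three numbers is "the" answer depends on a reconciliation rule
that somebody has to decide on, and that is a business question rather than a
query.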
Anyway, that's just one experience I have of computers in an enterprise at a
very large scale. It can be a mess and there is no easy way to sort it out. I
think the majority of xml development has been governed by engineers with
this sort of experience.
Down at the small business, things are the opposite.
Accounting systems usually store everything and delays in processing are
usually physical. The time to run up the stairs to the office to check the
Ambiguity is always handled by a human mental process and resolved by either
speaking softly or yelling down the phone at the other party. There's also
the classic deferral strategy of "the cheque is in the mail".
So the two cultures, small organisation and large organisation, are
diametrically opposed. The larger ones are process driven whilst the smaller
ones are sales driven.
> That's why Bosworth's presentation has merit. The problem
> however, is that it simply moves the calculation of the similarity
> metric away from the apriori schema declaration into raw
> microparsed vector results.
Hmm.. I'll have to feed this one to computer....
> A schema is the declaration of a
> space where occurrence indicators are a determinant of frequency
> and therefore, similarity given a rule that frequent terms are
> less important than rare terms within a document (term vectors),
> and more important across documents (document vectors).
In Chinese, they have this expression called "Chicken and Duck talk" where the
chicken speaks in its language, and the duck in its. They are both happy.
Whilst I never saw this in Star Trek, I think it would make for an interesting
future episode. Actually "Chicken and Duck talk" describes what is happening
with xml between large and small enterprises. Neither side really gets what
the other is saying.
I hope in the future that these different cultures can be bridged and that xml
is the path.
Computergrid : The ones with the most connections win.