OASIS Mailing List Archives





   RE: [xml-dev] Something altogether different?


Thanks Ken.  Comments inline.

From: Ken North [mailto:kennorth@sbcglobal.net]

>> So where we do understand how the vector model
>> works for text analysis, do we understand how to apply
>> it to a *text* that includes video and audio as integral
>> parts of the *text* and can we combine these into a
>> higher level space vector term

>1. "Facilitating Video Access by Visualizing Automatic Analysis"

>"Metadata for video materials can be derived from the analysis of the audio
>video streams. For audio, we identify features such as silence, applause,
>speaker identity. For video, we find features such as shot boundaries,
>presentation slides, and close-ups of human faces."

This comes closest to what I would consider a good start as it uses gestural
significators.  It is in combination with other vocabularies that I think 
there is more bang for the buck.   If similarity metrics apply across the 
vocabularies (a gesture in any vocabulary gets the same vm signature), 
then the cues all reinforce each other and the interpretation probability 
goes up.  Of course, the problem of interpretation starts as soon as we 
assign metadata, begin to reason over that and it amplifies (self
assumptions - or GIGO).  The reason for comparison is to detect superstition
simply, garbage in the metadata induced by faulty observation.  Tough

>2. Yahoo has recently taken the RSS approach. Video RSS provides a text
>description such as height, width, bitrate and running time:

Which is ok for knowing something about the coffee cup but not the coffee.

>3. SQL implementations such as DB2 UDB support content-based querying over
>types. DB2 has an Image Extender and Audio Extender with corresponding
>(DB2IMAGE, DB2AUDIO). The Audio Extender analyses the content and stores
>such as whether it's 16-bit audio, samples per second, playing time, the
>of clock ticks per quarter note and so on. The Image Extender stores
>that enables you to provide an image and search for matches based on color
>texture (contrast, directionality, etc.).

>IBM's CueVideo software uses speech recognition technology to generate text
>the audio tracks of videos -- which could then be fed into an engine that
>the vector space model and textual similarity matching described in my

Yes.  A useful source.  I suspect aided by the human eye, this is good.  We
similar products (Video Analyst) that also enhance images.
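
To make the "feed the transcript into a vector space engine" step concrete, here is a minimal sketch, assuming the speech recognizer has already produced plain text. The clip names and transcripts are invented for illustration, and this is bare term-frequency cosine matching, not what CueVideo or any particular engine actually does.

```python
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words term-frequency vector for one transcript."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical transcripts, standing in for speech-recognition output.
transcripts = {
    "clip1": "the committee approved the budget",
    "clip2": "budget approved by the committee",
    "clip3": "rain expected over the weekend",
}

query = term_vector("committee budget")

# Rank clips by similarity of their transcript to the query.
ranked = sorted(
    transcripts,
    key=lambda name: cosine(query, term_vector(transcripts[name])),
    reverse=True,
)
```

Note that this only sees the words: it still knows the coffee cup better than the coffee.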

>4. This paper discusses analysis of digital music using similarity
>Media Segmentation using Self-Similarity Decomposition

>"....In this example, the sequence ABC is a significantly shorter summary
>containing essentially all the information in the song."

As a songwriter and producer, I can refute that.  Let's just 
say that we work our bunnies off to make that wrong, but sometimes don't. 
The self-similarity of music is a cosmic d'oh.  "To play with only this 
and that old hat is such a bore, but I sadly fear the love of the ear is 
to hear what it heard before."  Digital editing makes it easy to grind 
out a self-similar production and it seduces one into not doing it by 
eliminating serendipitous opportunities and the fecundity of breath.

>3.1. Clustering via similarity matrix decomposition
>To cluster the segments, we factor a segment-indexed similarity matrix to
>repeated or substantially similar groups of segments."

Which means the musician/songwriter succeeded and failed.
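
For anyone who wants to see the idea in miniature, here is a rough sketch of a segment-indexed similarity matrix and the "ABC" style summary it leads to. It assumes each segment has already been reduced to a feature vector; the vectors and the 0.95 threshold are made up. It also swaps the paper's matrix factorization for a plain greedy threshold grouping, so treat it as an illustration of the input and output, not of their method.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_matrix(segments):
    """Segment-indexed matrix: entry [i][j] compares segment i with segment j."""
    return [[cosine(a, b) for b in segments] for a in segments]


def group_segments(sim, threshold=0.95):
    """Greedy grouping: a segment joins the first group whose
    representative it resembles, otherwise it starts a new group."""
    labels = []
    representatives = []
    for i in range(len(sim)):
        for group, rep in enumerate(representatives):
            if sim[i][rep] >= threshold:
                labels.append(group)
                break
        else:
            representatives.append(i)
            labels.append(len(representatives) - 1)
    return labels


# Hypothetical per-segment features for a five-segment song.
segments = [
    [1.0, 0.1, 0.0],  # verse
    [0.0, 1.0, 0.2],  # chorus
    [1.0, 0.2, 0.0],  # verse again
    [0.0, 0.9, 0.2],  # chorus again
    [0.1, 0.0, 1.0],  # bridge
]

labels = group_segments(similarity_matrix(segments))
structure = "".join(chr(ord("A") + g) for g in labels)

# One letter per distinct group, in order of first appearance.
summary = "".join(dict.fromkeys(structure))
```

Here `structure` comes out as the full letter sequence for the song and `summary` keeps one instance of each group, which is exactly the compression the songwriter is objecting to.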

Much of what I've learned over the years does come from my night gig, and
similarity plays a big role.  On the other hand, so does 'new and different'
and there is the length and tension of the performer's tightrope.  We depend
on that for repeat business and it depresses us because it isn't just one thing
after another, but the same thing after another.  whinge....




Copyright 2001 XML.org. This site is hosted by OASIS