Let me explain the ISO/MPEG-7 context and the solution that MPEG
developed to handle this "compression" issue. I hope this is
of interest to you.
MPEG-7 THE CONTEXT
MPEG-7 is a very large XML language (700 XML Schema types) for defining
audiovisual metadata. It is the result of the fruitful effort of many
companies and national bodies all around the world.
MPEG-7 is composed of several parts:
Part 1 - Systems
Part 2 - DDL
Part 3 - Visual descriptor
Part 4 - Audio descriptor
Part 5 - Multimedia Description Scheme
Part 6 - Conformance
MPEG-7's main goal is to describe audiovisual content at different
levels of granularity, ranging from very low-level descriptions (mean
color, etc.) to high-level descriptions (semantic relationships, actor
names, copyright information, etc.). MPEG-7 adopted XML to represent
this metadata and chose XML Schema as its schema language. However,
because bandwidth is very expensive in the broadcast industry and
because MPEG-7 descriptions can be very large, MPEG-7 definitely
had to define a "compiled version of XML".
MPEG-7 BINARY FORMAT - BiM
Part 1 (Systems) of the standard defines a binary format for XML
documents called BiM.
BiM relies on the XML Schema definition of an XML language to
automatically generate a very compact binary format for that
language. Elements and attributes are encoded with a few bits,
while values (leaves) are encoded using dedicated encoders
(IEEE 754 for floats, UTF-8 for strings, ...). BiM supports
most XML Schema features, including subtyping (xsi:type),
substitution groups, and so on. BiM is generic: it can deal with any
XML language, not only MPEG-7.
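
To make the idea of schema-driven encoding concrete, here is a minimal Python sketch. It is not the BiM algorithm itself, only a toy illustration of the principle described above: when both sides share the schema, an element name can be replaced by a small index into the list of children the schema allows, and leaf values can use type-specific encoders. The SCHEMA table, the element names, and the one-byte length prefix for strings are all assumptions made up for this example.

```python
import struct

# Hypothetical toy schema (NOT the MPEG-7 schema): for each parent element,
# the ordered list of child element names that the schema allows.
SCHEMA = {
    "Video": ["Title", "Duration", "MeanColor", "Actor"],
}


def bits_needed(n_choices):
    """Number of bits required to distinguish n_choices alternatives."""
    return max(1, (n_choices - 1).bit_length())


def encode_children(parent, children):
    """Encode (name, value) children of `parent` as a string of '0'/'1'.

    Each child name becomes a fixed-width index into the schema's list of
    allowed children. Float values are written as IEEE 754 single precision;
    string values as UTF-8 with a one-byte length prefix (an assumption of
    this sketch, so strings must be shorter than 256 bytes).
    """
    allowed = SCHEMA[parent]
    width = bits_needed(len(allowed))
    out = []
    for name, value in children:
        # The tag costs only `width` bits instead of a full textual name.
        out.append(format(allowed.index(name), f"0{width}b"))
        if isinstance(value, float):
            payload = struct.pack(">f", value)
        else:
            data = value.encode("utf-8")
            payload = bytes([len(data)]) + data
        out.append("".join(format(byte, "08b") for byte in payload))
    return "".join(out)


bits = encode_children("Video", [("Title", "Hi"), ("MeanColor", 0.5)])
print(len(bits), "bits")
```

With four allowed children, each tag costs two bits in this toy model, whereas the textual form spends several bytes on every opening and closing tag. A real encoder also has to handle attributes, content models, and the other schema features mentioned above, none of which this sketch attempts.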
As its main features, BiM generates a very compact representation
of an XML document that includes information to considerably
speed up search or filtering. It is streamable, which means that
document deltas can be sent to update a remote version of an XML
document.
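
The streaming idea mentioned above, sending deltas rather than whole documents, can be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the flat path-to-value model of a document, the (op, path, value) delta tuples, and the function name apply_delta are all hypothetical and are not taken from the BiM specification.

```python
def apply_delta(document, delta):
    """Return a new document with one delta applied.

    `document` is a toy model: a dict mapping a path string to a leaf value.
    `delta` is a hypothetical (op, path, value) tuple, where op is either
    "set" or "delete".
    """
    op, path, value = delta
    updated = dict(document)  # leave the caller's copy untouched
    if op == "set":
        updated[path] = value
    elif op == "delete":
        updated.pop(path, None)
    else:
        raise ValueError(f"unknown delta op: {op!r}")
    return updated


# The receiver already holds an earlier version of the description.
remote = {"/Video/Title": "News", "/Video/Duration": "PT30M"}

# The sender transmits only what changed, not the whole document.
stream = [
    ("set", "/Video/Actor[1]", "A. Presenter"),
    ("delete", "/Video/Duration", None),
]
for delta in stream:
    remote = apply_delta(remote, delta)
print(remote)
```

The point of the design is that the cost of an update is proportional to the size of the change, not to the size of the description, which matters when descriptions are large and bandwidth is expensive.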
This simple encoding scheme has proved to be very efficient. In
recent tests, a BiM decoder was between 10 and 30 times faster than
the Xerces C SAX parser at producing SAX events. In the case of direct
parsing, it can be 20 to 100 times faster. File size can be
reduced by up to 80%. BiM performs as well on small files as on large
files, and it can be combined with zip to outperform zip compression
alone by a factor of 2 to 5.
In conclusion, BiM technology is very well suited to environments
where bandwidth is expensive or where large numbers of XML documents
have to be parsed. It is particularly well suited to the TV or the mobile
MPEG-7 (ISO 15938) will be published in a few weeks as an ISO
You can find more information on the official MPEG website:
Some information about BiM can be found on :
Michael Rys wrote:
> SQL Server 2000 uses a tokenized, binary XML format if it talks to an
> OLEDB 2.6 or higher provider that then turns it into XML (in the stream
> mode). So yes, binary XML formats do work and are being widely deployed.
> They save space (although I agree that using compression on the wire is
> normally better), they avoid to/from text serialization etc. Only
> problem is that any standardized format will most likely not be useful
> for most use cases since it will not cover the specific needs (it would
> be a compromise and thus basically useless).
> There are several papers at WWW9 and WWW10 on general XML compression
> and ATT did some research on XMill. Also some tools basically use the
> DOM API (some persistent DOMs), SAX event streams (push) or XMLReader
> (pull) interfaces to avoid the serialized form.
> Best regards
> > -----Original Message-----
> > From: Alaric Snell [mailto:email@example.com]
> > Sent: Wednesday, March 27, 2002 5:02 AM
> > To: Mike Champion; firstname.lastname@example.org
> > Subject: Re: [xml-dev] Compiled XML
> > On Wednesday 27 March 2002 12:53, you wrote:
> > > 3/27/2002 6:50:59 AM, Alaric Snell <email@example.com> wrote:
> > > >Hi, Mike! How's the weather? :-)
> > >
> > > Uhh, lousy, especially compared to Spain last week :~)
> > Shame, it's getting quite nice here in London now...
> > > The response on this list to the Binary XML discussions
> > > has typically been "sounds plausible in theory, I've
> > > never seen it work well enough in practice to adopt."
> > > I don't have an axe to grind in this discussion other than
> > > wanting to answer a very frequently asked question for
> > > a newcomer. Could you share some empirical, quantitative
> > > data from real success stories using these techniques?
> > > Did you really make significant speed improvements without
> > > memory bloat, or significant reduction of memory requirements
> > > without additional processor horsepower?
> > Well, there was some comment from the Coccoon2 people about a
> > SAX
> > event stream format, and the ASN.1 folks have been having fun
> > BER/PER (ASN.1 encodings) with something called Megaco, which is an
> > like
> > tree structured textual description too...
> > I'll implement what I described in an earlier post, since it'll be a
> > useful
> > open source tool anyway I'm sure, and do some file size / run time
> > at
> > the weekend.
> > ABS
> > --
> > Alaric B. Snell
> > http://www.alaric-snell.com/ http://RFC.net/
> > Any sufficiently advanced technology can be emulated in software
> > -----------------------------------------------------------------
> > The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
> > initiative of OASIS <http://www.oasis-open.org>
> > The list archives are at http://lists.xml.org/archives/xml-dev/
> > To subscribe or unsubscribe from this list use the subscription
> > manager: <http://lists.xml.org/ob/adm.pl>
** NEW ADDRESS **
- - - - - - - - -
17, rue du Pont aux Choux
75003 Paris, France
T: +33 1 44 54 29 28
M: +33 6 07 66 26 63
F: +33 1 44 54 90 49