
   Re: [xml-dev] Compiled XML



Dear All,

Let me explain the ISO/MPEG-7 context and the solution that MPEG
developed to handle this "compression" issue. I hope it is of
interest to you.


MPEG-7 THE CONTEXT

MPEG-7 is a very large XML language (700 XML Schema types) for defining
audiovisual metadata. It is the result of the fruitful effort of many
companies and national bodies all around the world.

MPEG-7 is composed of several parts:
 Part 1 - Systems
 Part 2 - DDL
 Part 3 - Visual descriptor
 Part 4 - Audio descriptor
 Part 5 - Multimedia Description Scheme
 Part 6 - Conformance

MPEG-7's main goal is to describe audiovisual content at different
levels of granularity, ranging from very low-level description (mean
color, and so on) to high-level description (semantic relationships, actor
names, copyright information, etc.). MPEG-7 adopted XML to represent
this metadata and chose XML Schema as its schema language. However,
because bandwidth is very expensive in the broadcast industry and
because MPEG-7 descriptions can be very large, MPEG-7 definitely
had to define a "compiled version of XML".


MPEG-7 BINARY FORMAT - BiM

Part 1 (Systems) of the standard defines a binary format for XML
documents called BiM.

BiM relies on the XML Schema definition of an XML language to
automatically generate a very compact binary format for that
language. Elements and attributes are encoded with a few bits,
while values (leaves) are encoded using dedicated encoders
(IEEE-754 for floats, UTF-8 for strings, ...). BiM supports
most XML Schema features, including sub-typing (xsi:type),
substitution groups, and so on. BiM is generic: it can deal with any
XML language, not only MPEG-7.
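
To make the idea concrete, here is a minimal Python sketch of schema-driven
encoding. It is not the BiM bitstream defined in MPEG-7 Part 1: the SCHEMA
table, the function names, the fixed-width tag codes and the 1-byte string
length prefix are all assumptions chosen for illustration. It only shows how
knowing the schema lets an encoder replace element names with a few bits and
hand each leaf to a typed encoder.

```python
import struct

# Hypothetical schema fragment (an assumption, not taken from MPEG-7):
# the children a <Color> element may contain, with the type of each leaf.
SCHEMA = {
    "Color": [("Red", "float"), ("Green", "float"),
              ("Blue", "float"), ("Name", "string")],
}


def code_width(n_choices):
    """Bits needed for a fixed-width code that distinguishes n_choices."""
    return max(1, (n_choices - 1).bit_length())


def encode_leaf(leaf_type, value):
    """Encode one leaf value as a string of '0'/'1' characters."""
    if leaf_type == "float":
        raw = struct.pack(">f", value)       # IEEE-754 single precision
    elif leaf_type == "string":
        data = value.encode("utf-8")
        raw = bytes([len(data)]) + data      # 1-byte length prefix (assumption)
    else:
        raise ValueError("unsupported leaf type: " + leaf_type)
    return "".join(format(byte, "08b") for byte in raw)


def encode_children(parent, children):
    """Encode (name, value) children of `parent` as a bit string."""
    declared = SCHEMA[parent]
    names = [name for name, _ in declared]
    width = code_width(len(names))
    out = []
    for name, value in children:
        index = names.index(name)
        out.append(format(index, "0{}b".format(width)))   # tag: a few bits
        out.append(encode_leaf(declared[index][1], value))
    return "".join(out)


bits = encode_children("Color", [("Red", 0.5), ("Name", "sky")])
```

With four declared children, each element tag costs 2 bits instead of the
characters of its start and end tags; a decoder holding the same schema can
map the codes back to names.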

As its main features, BiM generates a very compact representation
of an XML document that includes information to considerably
speed up searching or filtering. It is streamable, which means that
document deltas can be sent to update a remote version of an XML
document.
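
As a rough illustration of the delta idea, the Python sketch below applies
small update commands to a receiver-side copy of a document. The command
names ("set_text", "add", "delete"), the tuple layout and the apply_delta
function are hypothetical and are not the update format defined by BiM; the
point is only that the sender transmits the change rather than the whole
document.

```python
import xml.etree.ElementTree as ET


def apply_delta(root, delta):
    """Apply one (op, path, payload) update to the tree rooted at `root`.

    `path` is a simple ElementTree path relative to the root; "." means
    the root itself. All three operations are illustrative assumptions.
    """
    op, path, payload = delta
    if op == "set_text":
        root.find(path).text = payload
    elif op == "add":
        parent = root if path == "." else root.find(path)
        parent.append(ET.fromstring(payload))
    elif op == "delete":
        parent_path, _, child = path.rpartition("/")
        parent = root if parent_path in ("", ".") else root.find(parent_path)
        parent.remove(parent.find(child))
    else:
        raise ValueError("unknown operation: " + op)


# Receiver-side copy of a (made-up) description.
doc = ET.fromstring("<Description><Title>Draft</Title></Description>")

# Deltas arriving over the stream, applied in order.
apply_delta(doc, ("set_text", "Title", "Final"))
apply_delta(doc, ("add", ".", "<Actor>Jane</Actor>"))
```

After these two updates the receiver holds the new title and the added
Actor element without the full document having been resent.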

This simple encoding scheme has proved to be very efficient. In
recent tests a BiM decoder was between 10 and 30 times faster than
the Xerces-C SAX parser at producing SAX events. In the case of direct
parsing it can be between 20 and 100 times faster. File size can be
reduced by up to 80%. BiM performs as well on small files as on large
files, and it can be combined with zip to outperform zip compression
by a factor of 2 to 5.

In conclusion, BiM technology is very well suited to environments
where bandwidth is expensive or where large numbers of XML documents
have to be parsed. It is particularly well suited to the TV and mobile
industries.

MPEG-7 (ISO 15938) will be published in a few weeks as an ISO
International Standard.

You can find more information on the official MPEG website:

	http://mpeg.telecomitalialab.com/

Some information about BiM can be found at:

	http://www.expway.tv/bim/bim.html

Best regards,
Claude.

_________________

Michael Rys wrote:
> 
> SQL Server 2000 uses a tokenized, binary XML format if it talks to an
> OLEDB 2.6 or higher provider that then turns it into XML (in the stream
> mode). So yes, binary XML formats do work and are being widely deployed.
> They save space (although I agree that using compression on the wire is
> normally better), they avoid to/from text serialization etc. Only
> problem is that any standardized format will most likely not be useful
> for most use cases since it will not cover the specific needs (it would
> be a compromise and thus basically useless).
> 
> There are several papers at WWW9 and WWW10 on general XML compression
> and ATT did some research on XMill. Also some tools basically use the
> DOM API (some persistent DOMs), SAX event streams (push) or XMLReader
> (pull) interfaces to avoid the serialized form.
> 
> Best regards
> Michael
> 
> > -----Original Message-----
> > From: Alaric Snell [mailto:alaric@alaric-snell.com]
> > Sent: Wednesday, March 27, 2002 5:02 AM
> > To: Mike Champion; xml-dev@lists.xml.org
> > Subject: Re: [xml-dev] Compiled XML
> >
> > On Wednesday 27 March 2002 12:53, you wrote:
> > > 3/27/2002 6:50:59 AM, Alaric Snell <alaric@alaric-snell.com> wrote:
> > > >Hi, Mike! How's the weather? :-)
> > >
> > > Uhh, lousy, especially compared to Spain last week :~)
> >
> > Shame, it's getting quite nice here in London now...
> >
> > > The response on this list to the Binary XML discussions
> > > has typically been "sounds plausible in theory, I've
> > > never seen it work well enough in practice to adopt."
> > > I don't have an axe to grind in this discussion other than
> > > wanting to answer a very frequently asked question for
> > > a newcomer.  Could you share some empirical, quantitative
> > > data from real success stories using these techniques?
> > > Did you really make significant speed improvements without
> > > memory bloat, or significant reduction of memory requirements
> > > without additional processor horsepower?
> >
> > Well, there was some comment from the Cocoon2 people about a serialised
> > SAX event stream format, and the ASN.1 folks have been having fun
> > comparing BER/PER (ASN.1 encodings) with something called Megaco, which
> > is an XML-like tree structured textual description too...
> >
> > I'll implement what I described in an earlier post, since it'll be a
> > useful open source tool anyway I'm sure, and do some file size / run time
> > tests at the weekend.
> >
> > ABS
> >
> > --
> >                                Alaric B. Snell
> >  http://www.alaric-snell.com/  http://RFC.net/  http://www.warhead.org.uk/
> >    Any sufficiently advanced technology can be emulated in software
> >
> > -----------------------------------------------------------------
> > The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
> > initiative of OASIS <http://www.oasis-open.org>
> >
> > The list archives are at http://lists.xml.org/archives/xml-dev/
> >
> > To subscribe or unsubscribe from this list use the subscription
> > manager: <http://lists.xml.org/ob/adm.pl>

-- 

______________________________________________
** NEW ADDRESS **
- - - - - - - - - 

Claude Seyrat

EXPWAY
17, rue du Pont aux Choux
75003 Paris, France
T: +33 1 44 54 29 28
M: +33 6 07 66 26 63
F: +33 1 44 54 90 49
E: claude.seyrat@expway.fr

www.expway.tv




 
