OASIS Mailing List Archives

   Re: [xml-dev] Xqueeze: Compact XML Alternative


On Wednesday 05 February 2003 06:15, Tahir Hashmi wrote:
> On Tue, 04 Feb 2003 09:07:20 -0500
> Chiusano Joseph wrote:
> > Would you have a sense of how this binary representation compares with
> > that of ASN.1 [1]?
> ASN.1 is, from what I gather, a method of specifying data formats in
> communication protocols and there are facilities for interconversion
> between ASN.1 and XML but I'm not sure whether one can generate SAX
> events or construct a DOM tree directly from an ASN.1 format document.
> They say that no tree information is stored in the binary encoding[2].

No, the tree information is there explicitly in BER, CER, DER and XER; it is 
PER, the Packed Encoding Rules, that cannot be comprehended without reference 
to the schema. However, only XER puts node *names* in the encoded tree; with 
BER, CER and DER you still need the schema to get the names of things. The 
names are redundant information for transfer.

But given the ASN.1 schema, you have tree structure, type information, names, 
the lot.
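
To make the BER/DER point concrete, here is a minimal sketch of a 
tag-length-value walker. It is my own illustration, not code from any ASN.1 
toolkit: TlvNode, parse_tlv and dump are hypothetical names, and the sketch 
assumes single-byte tags and definite lengths only. It recovers the nesting 
from the bytes alone, and what comes out has tags but no field names.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// One node recovered from the bytes alone: a tag plus either raw content
// (primitive encoding) or child nodes (constructed encoding).
struct TlvNode {
    std::uint8_t tag = 0;
    std::string content;             // primitive encodings only
    std::vector<TlvNode> children;   // constructed encodings only
};

// Parse one tag-length-value triple starting at pos and advance pos past it.
// Assumption: single-byte tags and definite lengths (as DER requires).
TlvNode parse_tlv(const std::vector<std::uint8_t>& buf, std::size_t& pos) {
    if (pos > buf.size() || buf.size() - pos < 2)
        throw std::runtime_error("truncated header");
    TlvNode node;
    node.tag = buf[pos++];
    if ((node.tag & 0x1F) == 0x1F)
        throw std::runtime_error("multi-byte tags not handled in this sketch");
    std::size_t len = buf[pos++];
    if (len == 0x80)
        throw std::runtime_error("indefinite length not handled in this sketch");
    if (len & 0x80) {                // long form: low 7 bits count the length bytes
        const std::size_t n = len & 0x7F;
        if (n > sizeof(std::size_t) || n > buf.size() - pos)
            throw std::runtime_error("bad long-form length");
        len = 0;
        for (std::size_t i = 0; i < n; ++i) len = (len << 8) | buf[pos++];
    }
    if (len > buf.size() - pos) throw std::runtime_error("truncated content");
    const std::size_t end = pos + len;
    if (node.tag & 0x20) {           // constructed bit set: content is nested TLVs
        while (pos < end) node.children.push_back(parse_tlv(buf, pos));
        if (pos != end) throw std::runtime_error("child overruns parent");
    } else {
        node.content.assign(reinterpret_cast<const char*>(buf.data()) + pos, len);
        pos = end;
    }
    return node;
}

// Render the recovered structure: tags and nesting, but no field names.
std::string dump(const TlvNode& node, int depth = 0) {
    std::string out(static_cast<std::size_t>(depth) * 2, ' ');
    out += "tag " + std::to_string(node.tag);
    if (!(node.tag & 0x20)) out += " \"" + node.content + "\"";
    out += "\n";
    for (const TlvNode& child : node.children) out += dump(child, depth + 1);
    return out;
}
```

Fed a DER-encoded SEQUENCE of two UTF8Strings with an empty SEQUENCE between 
them, dump shows an outer node with three children and their tag numbers. 
Mapping those positions to names such as firstName is the part that needs the 
schema.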

> Apparently, you can generate an ASN.1 binary encoding for your
> language specification


> and there are tools that will generate parsers
> in several languages for that format[3]. The problem here is that
> you'd have to modify your parser every time you make changes to your
> specification. This is a major headache, IMHO, since you'll have to
> bundle a parser with your applications and off-the-shelf parsers would
> be unusable.

You only need to rebuild the parser if you want to take advantage of the new 
features in the spec. If somebody has extended the spec and you don't care, 
the old parsers will still work; all of the encodings are backwards 
compatible to deal with changes. Indeed, ASN.1 is more extensible than XML in 
this respect; ASN.1 was designed for distributed loosely-coupled systems 
where specs can change versions without breaking everything, whereas XML has 
problems with specs changing under running systems. XML goes for a tighter 
coupling to versions; either you are very careful to write your own 
extensibility into your schemas, and use the same namespace for all versions, 
or you go for a new namespace for a new version of something and abandon 
backwards compatibility. Hack, spit!

Anyway, yes - you speak of "modify your parser" - you don't mod the parser, 
you just rebuild it. The tool will turn your ASN.1 specification into the 

> With Xqueeze, all that you need to port an existing XML-based
> application to xqML based one is an off-the-shelf xqML parser at the
> consumer end and an xqML generator at the source end.

The ASN.1 toolkits are off-the-shelf too.

I think you're conflating two points here - when you're using ASN.1, you 
normally use the off-the-shelf toolkit to automatically make your parser, the 
same way flex and bison are used. There's no extra parser-developing effort 
involved, you just write the ASN.1 spec:

PersonName ::= SEQUENCE {
	firstName UTF8String,
	middleNames SEQUENCE OF UTF8String,
	lastName UTF8String,
	...
}

...then in your makefile...

person_name.c: person_name.asn1
	asn1c person_name.asn1 -o person_name.c

...then in your code...

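   // decode_/encode_/destroy_PersonName come from the generated person_name.c
   PersonName *p = decode_PersonName (instream);
   cout << p->firstName << endl;   // fields are plain struct members
   cin >> p->firstName;
   encode_PersonName (p, outstream);
   destroy_PersonName (p);         // free what decode allocated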

However, before somebody goes "Nyah! That means ASN.1 is tightly bound and 
XML is better because you can parse a DTD at runtime", I'll point out that 
the concept of a *data format* forcing you to do stuff at 
runtime/compiletime/linktime/any other time is obviously stupid. You can also 
use an ASN.1 toolkit to parse ASN.1 schemas at run time and then decode values 
against them, getting the result not as native language data structures but as 
something like a DOM tree. This API is used by, for example, arbitrary ASN.1 
data viewers/editors, protocol debuggers, and so on.
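
As a rough sketch of that run-time style, assume the schema arrives as plain 
data instead of generated code. FieldSpec, GenericNode and decode_generic are 
names I have made up for illustration, not any toolkit's API; a real toolkit 
would parse the ASN.1 module text itself and handle nesting. Here the 
"schema" is a flat list of (name, tag) pairs and the input is a run of 
short-form tag-length-value fields.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// One field of a schema that is only known at run time (hypothetical shape).
struct FieldSpec {
    std::string name;
    std::uint8_t tag;
};

// Generic, DOM-like result: named nodes instead of a generated struct.
struct GenericNode {
    std::string name;
    std::uint8_t tag;
    std::string value;
};

// Decode a flat run of short-form TLV fields, taking names from the schema.
// Assumption: primitive fields only, each shorter than 128 bytes.
std::vector<GenericNode> decode_generic(const std::vector<FieldSpec>& schema,
                                        const std::vector<std::uint8_t>& buf) {
    std::vector<GenericNode> out;
    std::size_t pos = 0;
    for (const FieldSpec& field : schema) {
        if (buf.size() - pos < 2) throw std::runtime_error("truncated header");
        const std::uint8_t tag = buf[pos++];
        const std::size_t len = buf[pos++];
        if (tag != field.tag)
            throw std::runtime_error("unexpected tag for " + field.name);
        if (len & 0x80)
            throw std::runtime_error("long-form length not handled in this sketch");
        if (len > buf.size() - pos) throw std::runtime_error("truncated content");
        // The name comes from the schema; only tag, length and bytes were on the wire.
        out.push_back({field.name, tag,
                       std::string(reinterpret_cast<const char*>(buf.data()) + pos, len)});
        pos += len;
    }
    return out;
}
```

A viewer or protocol debugger could build the FieldSpec list from whatever 
schema the user loads and display the resulting nodes, with no per-schema 
compile step.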

In my example ASN.1 I've included an ellipsis as the last element in the 
PersonName type. That means 'expect future versions to add stuff here'. Even 
in PER, the packed rules, the resulting binary encoding then carries a 
single-bit flag indicating the presence of extensions; if it's set, there's a 
length count in bytes followed by the extension data. Existing decoders can 
just skip over the extra stuff, while next-generation decoders can use the 
extension bit to check for the existence of the optional version-2 fields. One 
of the version-2 fields is a bit to indicate the presence of version-3 fields, 
too, producing an arbitrarily long chain of backwards-compatible extensions to 
the message format.
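
The skip-what-you-don't-know idea can be sketched with a toy byte-aligned 
layout. To be clear, this is NOT real PER, which is bit-packed and has more 
elaborate rules for extension additions. The layout, Reader, NameV1/NameV2 and 
the "title" field are all assumptions of mine, chosen only to mirror the 
flag-then-length scheme described above.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Toy layout (an assumption for illustration, NOT real PER):
//   byte 0      : bit 7 set => an extension block follows the root fields
//   root fields : firstName, lastName, each as <1-byte length><bytes>
//   if flagged  : <1-byte extension length><extension bytes>
class Reader {
public:
    explicit Reader(const std::vector<std::uint8_t>& buf) : buf_(buf) {}
    std::uint8_t byte() { need(1); return buf_[pos_++]; }
    std::string text(std::size_t n) {
        need(n);
        std::string s(reinterpret_cast<const char*>(buf_.data()) + pos_, n);
        pos_ += n;
        return s;
    }
    void skip(std::size_t n) { need(n); pos_ += n; }
    std::size_t pos() const { return pos_; }
private:
    void need(std::size_t n) const {
        if (n > buf_.size() - pos_) throw std::runtime_error("truncated message");
    }
    const std::vector<std::uint8_t>& buf_;   // caller keeps the buffer alive
    std::size_t pos_ = 0;
};

struct NameV1 { std::string first, last; };
struct NameV2 { std::string first, last, title; bool has_title = false; };

// Version-1 decoder: knows nothing about extensions except how to skip them.
NameV1 decode_v1(Reader& r) {
    const bool extended = (r.byte() & 0x80) != 0;
    NameV1 n;
    std::size_t len = r.byte();
    n.first = r.text(len);
    len = r.byte();
    n.last = r.text(len);
    if (extended) {
        const std::size_t ext_len = r.byte();
        r.skip(ext_len);                 // step over additions it cannot interpret
    }
    return n;
}

// Version-2 decoder: reads its own addition, skips anything appended later.
NameV2 decode_v2(Reader& r) {
    const bool extended = (r.byte() & 0x80) != 0;
    NameV2 n;
    std::size_t len = r.byte();
    n.first = r.text(len);
    len = r.byte();
    n.last = r.text(len);
    if (extended) {
        const std::size_t ext_len = r.byte();
        const std::size_t start = r.pos();
        if (ext_len > 0) {
            len = r.byte();
            n.title = r.text(len);
            n.has_title = true;
        }
        const std::size_t used = r.pos() - start;
        if (used > ext_len) throw std::runtime_error("extension overrun");
        r.skip(ext_len - used);          // bytes a later version may have added
    }
    return n;
}
```

Both decoders consume a version-2 message completely; the version-1 decoder 
simply never looks inside the extension block, which is the whole point of 
carrying the length count.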


A city is like a large, complex, rabbit
 - ARP

