   Re: [xml-dev] ASN.1 is an XML Schema Language (Fix those lists!)and Bina

Robin Berjon wrote:

>> I think that some of the sophisticated encodings like PER are very
>> hard to get right and complete, too (I have never looked into these
>> encodings, so have no first-hand experience here).
> I don't think PER qualifies as "very hard", but yes it certainly 
> qualifies as much harder than BER.

Figuring out the code that generates PER encodings of things is tricky 
due to all the rules about getting rid of redundant things, but once 
you've done it all, the result is quite simple code, you'll be glad to 
know. PER encodings aren't all that complex - they're just finely tuned :-)

Code that reads PER might look something like:

Person PERReadPerson (InputStream in) {
	Person p = new Person ();
	p.name = PERReadArbitraryLengthString (in);

	p.age = PERRead8BitUnsignedInteger (in, 0, 200);
	// age was constrained to the range 0..200, so
	// an 8 bit unsigned integer is used.
	// Note that if it was given the range 100..300
	// then we would read in an 8 bit unsigned int
	// from 0 to 200, and add 100 to it.

	p.phone = PERReadShortString (in, 10);
	// The phone number field is a string of up to 10
	// chars, so uses a 'short string' encoding with a length
	// byte or something like that

	p.geneticFingerprint = PERReadFixedLengthLimitedAlphabetString
		(in, 16, "ATGC");
	// the genetic fingerprint is precisely 16 chars, with the
	// character set just being A, T, G, and C. The string is
	// encoded with two bits per character, so four chars per
	// byte, so four bytes.

	// Later versions of the spec may add more fields than we know
	// about, so we have a trailing extension flag
	boolean hasExtensionData = PERReadBoolean (in);

	if (hasExtensionData) {
		// skip it
		int length = PERReadInteger (in);
		in.skip (length);
	}

	return p;
}
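
To make the two space-saving tricks in those comments concrete, here is a minimal sketch of the offset integer and the two-bits-per-character string. The class and method names (PerTricks, pack, unpack, encodeOffset, decodeOffset) are made up for illustration and are not from any ASN.1 toolkit, and the bit layout (codes taken from the order of "ATGC", first character in the top bits) is an assumption rather than a claim about the exact bits PER puts on the wire.

```java
// Hypothetical helpers, not part of any real ASN.1 library.
final class PerTricks {
    // Assumption: a character's 2-bit code is its index in this string.
    static final String ALPHABET = "ATGC";

    // Pack a string over the four-letter alphabet at two bits per character.
    static byte[] pack(String s) {
        byte[] out = new byte[(s.length() + 3) / 4];
        for (int i = 0; i < s.length(); i++) {
            int code = ALPHABET.indexOf(s.charAt(i));
            if (code < 0) {
                throw new IllegalArgumentException("not in alphabet: " + s.charAt(i));
            }
            // The first character of each group of four lands in the top two bits.
            out[i / 4] |= (byte) (code << (6 - 2 * (i % 4)));
        }
        return out;
    }

    // Reverse of pack; the length is known from the schema, so it is not stored.
    static String unpack(byte[] data, int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            int code = (data[i / 4] >> (6 - 2 * (i % 4))) & 0x3;
            sb.append(ALPHABET.charAt(code));
        }
        return sb.toString();
    }

    // A value constrained to lo..hi is stored as its distance from lo.
    static int encodeOffset(int value, int lo, int hi) {
        if (value < lo || value > hi) {
            throw new IllegalArgumentException("value out of range: " + value);
        }
        return value - lo;
    }

    static int decodeOffset(int raw, int lo) {
        return raw + lo;
    }

    public static void main(String[] args) {
        String fingerprint = "GATTACAGATTACAGA";   // 16 characters
        byte[] packed = pack(fingerprint);
        System.out.println(packed.length);          // 4
        System.out.println(unpack(packed, 16).equals(fingerprint));
        System.out.println(encodeOffset(250, 100, 300)); // 150, fits in 8 bits
    }
}
```

The reader never needs a length or a type code for the fingerprint, because both sides already know from the schema that it is 16 characters over a four-letter alphabet.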

Writing a general BER parser is, I reckon, probably a shade simpler than 
writing an equivalent XML parser, since you don't have entities and 
declarations and the differences between attributes and elements to worry 
about. All the children of your node are laid out in order in the same 
format, and every string is prefixed with its length (albeit possibly 
broken into chunks, each with a length prefix and a "more follows" flag, 
if the string was generated by streaming).
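
As a rough illustration of how little there is to a length-prefixed read, here is a minimal sketch that reads one primitive, definite-length string: a tag byte, a length (short or long form), then that many bytes of content. The names BerSketch, readLength and readString are hypothetical, it only accepts the UTF8String tag (0x0C), and it deliberately does not handle the chunked or indefinite-length forms mentioned above.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not a complete BER decoder.
final class BerSketch {
    static final int TAG_UTF8_STRING = 0x0C; // universal tag 12, primitive

    // Definite-form length: one byte below 0x80 is the length itself;
    // otherwise the low 7 bits say how many big-endian length bytes follow.
    static int readLength(InputStream in) throws IOException {
        int first = in.read();
        if (first < 0) throw new IOException("unexpected end of stream");
        if (first < 0x80) return first;
        int count = first & 0x7F;
        if (count == 0) throw new IOException("indefinite length not handled in this sketch");
        if (count > 3) throw new IOException("length too large for this sketch");
        int length = 0;
        for (int i = 0; i < count; i++) {
            int b = in.read();
            if (b < 0) throw new IOException("unexpected end of stream");
            length = (length << 8) | b;
        }
        return length;
    }

    // Checks the type code before reading, rather than trusting the caller.
    static String readString(InputStream in) throws IOException {
        int tag = in.read();
        if (tag != TAG_UTF8_STRING) {
            throw new IOException("expected UTF8String, found tag " + tag);
        }
        int length = readLength(in);
        byte[] value = new byte[length];
        new DataInputStream(in).readFully(value); // throws EOFException if truncated
        return new String(value, StandardCharsets.UTF_8);
    }
}
```

Feeding it the bytes 0x0C 0x03 'B' 'o' 'b' gives back "Bob"; feeding it something that starts with 0x02 (an INTEGER) raises an exception instead of misreading the data.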

A decoder for our Person record in BER might look more like:

Person BERReadPerson (InputStream in) {
	Person p = new Person ();

	p.name = BERReadString (in);

	p.age = BERReadInteger (in);
	if (p.age < 0 || p.age > 200)
		throw new ValueConstraintException (...);

	p.phone = BERReadString (in);
	if (p.phone.length() > 10)
		throw new ValueConstraintException (...);

	p.geneticFingerprint = BERReadString (in);
	if (!p.geneticFingerprint.matches ("^[GATC]{16}$"))
		throw new ValueConstraintException (...);

	// Later versions of the spec may add extra fields that we don't
	// know about. Skip past them until the end of sequence marker
	// is found.
	BERSkipToEndSequenceMarker (in);

	return p;
}

Note that BERReadString etc. do not trust you that there will be a 
string due from the stream, as the PER ones do; they read the type code 
and if there's actually an integer or a sequence there, they throw an 
exception.

As I gather it, PER was devised at a time when ASN.1 was coming under 
fire because the most widely used encoding, BER, was "wasteful and 
bloated" since it contained the type and length codes on everything. PER 
is ASN.1's equivalent of "binary XML" ;-)




Copyright 2001 XML.org. This site is hosted by OASIS