xml-dev - RE: [xml-dev] Validation vs performance - was Re: [xml-dev] Fast text ou

To: "'Stephen D. Williams'" <sdw@lig.net>
Subject: RE: [xml-dev] Validation vs performance - was Re: [xml-dev] Fast text output from SAX?
From: "Alessandro Triglia" <sandro@mclink.it>
Date: Mon, 19 Apr 2004 18:20:02 -0400
Cc: "'Bullard, Claude L \(Len\)'" <clbullar@ingr.com>, "'XML DEV'" <xml-dev@lists.xml.org>
Importance: Normal
In-reply-to: <40843DE0.6070107@lig.net>



> -----Original Message-----
> From: Stephen D. Williams [mailto:sdw@lig.net]
> Sent: Monday, April 19, 2004 17:00
> To: Alessandro Triglia
> Cc: 'Bullard, Claude L (Len)'; 'XML DEV'
> Subject: Re: [xml-dev] Validation vs performance - was Re: 
> [xml-dev] Fast text output from SAX?
> 
> 
> >
> >However, I suspect that many applications are being built around a
> >schema (now often XML Schema) in such a way that they will not tolerate
> >any variations to the form of XML document that does not conform to the
> >schema.
> >  
> >
> I mentioned XML consisting of idioms along with syntax and other
> explicit standards.  One powerful idiom that has become accepted and 
> expected with XML is that, whenever at all possible, you produce 
> precisely but accept loosely.  


I think you are being too vague here.  There must certainly be **rules** on
how and when you can be loose.  

If you expect an attribute "age", will you accept an attribute "Age"
instead?

If you expect an attribute "age", will you accept a child element <age>
instead?

If you expect an attribute "abc:age", will you accept an attribute "age"
(unqualified) instead?

If you expect two child elements <a> and <b>, will you accept a <c> in
between?  What if you have a <d> before the <a> but the mandatory element
<b> is missing altogether?  Will you accept and know how to handle this
situation?

Will you accept a namespace name "abcde" when a schema specifies the
namespace name "abcdef", or "abcdef/", or "ABCDE"?

Are you saying that it has become a common idiom in application development
to expect and accept one or more of the above?  I cannot believe you really
mean that.
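
To make the questions above concrete, here is a minimal sketch in Python using the standard xml.etree.ElementTree module.  The element name "p", the prefix "abc" and the namespace name "abcdef" are just the hypothetical names from my examples, and the two function names are mine.  The point is only that a reader written against the expected form finds nothing in any of the "loose" variants unless someone writes an explicit rule for each of them.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace name, taken from the questions above.
EXPECTED_NS = "abcdef"

def read_age(xml_text):
    """Return the unqualified attribute "age" of the root element, or None."""
    elem = ET.fromstring(xml_text)
    # XML names are case-sensitive, so "Age" is a different attribute,
    # and a child element <age> is not an attribute at all.
    return elem.get("age")

def read_qualified_age(xml_text):
    """Return the attribute "age" in the expected namespace, or None."""
    elem = ET.fromstring(xml_text)
    # ElementTree spells a qualified name as "{namespace-name}local-name".
    # Namespace names are compared as plain strings, character by character.
    return elem.get("{%s}age" % EXPECTED_NS)
```

With this reader, `read_age('<p Age="7"/>')` and `read_age('<p><age>7</age></p>')` both return None, and so does `read_qualified_age` when the namespace name is "abcde" or "ABCDEF".  Accepting any of those would be a deliberate design decision, not something that follows from "accept loosely".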

I do agree that you may want to tolerate an addition to an element (say, an
extra child element, usually at the end of the "expected" content, or an
extra attribute).  So what?  ASN.1 supports this **formally** in the type
definitions.  It has done so for many years.  There is even a clause
(EXTENSIBILITY IMPLIED) that you can place at the beginning of an ASN.1
module, which means that one must expect additions anywhere.


> This is a direct expansion of the IETF meta-rule that states a similar
> principle.  Furthermore, when accepting loosely and re-emitting an
> existing document/object, you attempt to preserve anything originally
> present, even if you didn't expect it.  For instance, an 'object', i.e. a
> complex data type, may have grown a new field.  You should not die when
> encountering this field and if you are modifying and exporting that
> object, you should preserve the field.


Exactly.  This is what the ASN.1 notion of extensibility was invented for.
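
The behaviour described in the quoted paragraph can be sketched as follows, again in Python with xml.etree.ElementTree.  This is only an illustration on the XML side, not how an ASN.1 tool implements extensibility; the element names (person, name, age, email) and the function names are hypothetical.  The application knows two fields, modifies one, and re-emits the whole element so that a field it did not expect survives the round trip.

```python
import xml.etree.ElementTree as ET

# Hypothetical set of child elements this application understands.
KNOWN_CHILDREN = {"name", "age"}

def bump_age(xml_text):
    """Increment <age> and re-emit the element, keeping everything else."""
    person = ET.fromstring(xml_text)
    age = person.find("age")
    if age is None:
        # Tolerating additions is not the same as tolerating omissions.
        raise ValueError("mandatory element <age> is missing")
    age.text = str(int(age.text) + 1)
    # Unknown children and attributes are still in the tree, so they are
    # written back out unchanged.
    return ET.tostring(person, encoding="unicode")

def unexpected_children(xml_text):
    """List the tags of child elements the application did not expect."""
    person = ET.fromstring(xml_text)
    return [child.tag for child in person if child.tag not in KNOWN_CHILDREN]
```

Note that the only looseness here is "unknown additions are carried along".  A missing mandatory element is still rejected.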

  
> This is a very powerful way to support schema evolution, router or
> separation of concerns patterns, and many other cases where you do not
> have a fixed or completely shared schema/IDL.  It is for these and other
> reasons that IDL-based development, fixed schemas, and native-language
> object representation of what could be called "network business objects"
> is suboptimal in terms of development and maintenance requirements.  If
> it can be further proved to be suboptimal in terms of processing
> efficiency, one of my goals, a paradigm shift is in order.
> 
> >...
> >
> >Does it make any sense to compare ASN.1 with XML Schema?  Probably yes.
> >  
> >
> The evolution of thinking that led to ASN.1 and later to XML Schema et
> al is very interesting, but it's not that helpful to deconstruct it all.
> 
> ASN.1 is, of course, a logical data/API definition syntax, 


ASN.1 defines no API whatsoever, nor do ASN.1 modules define APIs.  ASN.1
modules define **data types**.


> normally compiled into code and data structures.


Normally, but not always.

However:

1) I can see this being done very often with XML Schema as well.

2) Other interfaces to ASN.1, such as SAX2, are perfectly possible.  From
each XML document that conforms to an ASN.1 schema, you can obviously
generate a stream of SAX2 events.  You can decode the XML instance into a
"value" and re-encode the "value" in BER or PER, without the application
being aware.  If an ASN.1 tool supports this, you can write an application
based on SAX2.  The application will be mostly agnostic of which encoding
rules are in use, and you can even change the encoding rules dynamically at
runtime.  The application will receive the **same** SAX2 events regardless
of the actual encoding rules being used, be they XER, PER, BER, etc.
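
A rough sketch of the idea, using Python's xml.sax module as the event interface.  The binary layout below is a toy format invented for this illustration (one length byte, then the name, one length byte, then the value); it is NOT BER or PER, and the function and class names are hypothetical.  What it shows is that two decoders for two different encodings can drive the same ContentHandler with the same events.

```python
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import AttributesImpl

class Recorder(ContentHandler):
    """Application-side handler: records events, unaware of the encoding."""
    def __init__(self):
        super().__init__()
        self.events = []
        self._text = []

    def _flush(self):
        # A parser may deliver text in several chunks, so coalesce them.
        if self._text:
            self.events.append(("text", "".join(self._text)))
            self._text = []

    def startElement(self, name, attrs):
        self._flush()
        self.events.append(("start", name))

    def endElement(self, name):
        self._flush()
        self.events.append(("end", name))

    def characters(self, content):
        self._text.append(content)

def events_from_xml(data, handler):
    """Textual encoding: let the SAX parser generate the events."""
    xml.sax.parseString(data, handler)

def events_from_toy_binary(data, root, handler):
    """Toy binary encoding (an assumption): repeated
    (name-length, name, value-length, value) records, lengths in one byte."""
    handler.startDocument()
    handler.startElement(root, AttributesImpl({}))
    pos = 0
    while pos < len(data):
        n = data[pos]
        name = data[pos + 1:pos + 1 + n].decode("utf-8")
        pos += 1 + n
        m = data[pos]
        value = data[pos + 1:pos + 1 + m].decode("utf-8")
        pos += 1 + m
        handler.startElement(name, AttributesImpl({}))
        handler.characters(value)
        handler.endElement(name)
    handler.endElement(root)
    handler.endDocument()
```

Feeding `b"<person><name>Ann</name><age>41</age></person>"` to the first function and the equivalent toy records to the second leaves the same list in `Recorder.events`, so the application code behind the handler does not change when the encoding does.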

Remember that one of the basic principles of ASN.1 is that applications
should not be affected by the on-the-wire representation.  This is a great
separation of concerns!!!


> >XML Schema has a dualism between value and lexical representation,
> >which is not very far from the ASN.1 dualism between value and 
> >encoding.  The main differences are:
> >
> >1) In XML Schema, the concept of value only exists for simple types, 
> >whereas in ASN.1, the concept of value exists both for complex types 
> >and for simple types.
> >  
> >
> Any 'subtree' in an XML Schema is equivalent to a 'complex type'.  Same
> structure, same thing.
> I posit that there are many instances where you want a partial schema or
> no schema at all.


This is unrelated to what I was saying here.  I was saying that XML Schema's
concept of "value" is limited to simple types.  In XML Schema, you don't
talk about the "value space" and "lexical representation" of a complex type.
In ASN.1, instead, there is a concept of "value" that applies both to simple
types and to complex types.  In ASN.1 there is a "value set" associated with
each type (simple or complex), and a "value" is a member of that set.  You
can meaningfully talk about a "value of a simple type" as well as a "value
of a complex type".
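
As a loose analogy only (Python objects standing in for abstract values, which is my assumption and not anything defined by either standard): for a simple type, "1.0" and "1.00" are two lexical representations of one decimal value.  For a complex type, the ASN.1 way of thinking says the same thing about the whole structure: two different encodings decode to one and the same value.  The Person type and the comma-separated encoding below are hypothetical.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from decimal import Decimal

@dataclass(frozen=True)
class Person:
    """Hypothetical complex type; an instance plays the role of a 'value'."""
    name: str
    age: int

def person_from_xml(text):
    """Decode a Person from an XML encoding."""
    elem = ET.fromstring(text)
    return Person(name=elem.findtext("name"), age=int(elem.findtext("age")))

def person_from_csv(text):
    """Decode a Person from a toy comma-separated encoding (an assumption)."""
    name, age = text.split(",")
    return Person(name=name, age=int(age))

# Simple type: two lexical representations, one value.
same_simple_value = Decimal("1.0") == Decimal("1.00")

# Complex type: two encodings, one value.
same_complex_value = (
    person_from_xml("<person><name>Ann</name><age>41</age></person>")
    == person_from_csv("Ann,41")
)
```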

I was simply pointing out one difference in terminology and concepts.  I was
not implying that one is superior to the other because of this.

Alessandro Triglia
OSS Nokalva
