xml-dev - RE: [xml-dev] Validation vs performance - was Re: [xml-dev] Fast text ou


To: "'Stephen D. Williams'" <sdw@lig.net>
Subject: RE: [xml-dev] Validation vs performance - was Re: [xml-dev] Fast text output from SAX?
From: "Alessandro Triglia" <sandro@mclink.it>
Date: Tue, 20 Apr 2004 01:34:14 -0400
Cc: "'Bullard, Claude L \(Len\)'" <clbullar@ingr.com>, "'XML DEV'" <xml-dev@lists.xml.org>
Importance: Normal
In-reply-to: <4084A525.1070400@lig.net>

> -----Original Message-----
> From: Stephen D. Williams [mailto:sdw@lig.net] 
> Sent: Tuesday, April 20, 2004 00:21
> To: Alessandro Triglia
> Cc: 'Bullard, Claude L (Len)'; 'XML DEV'
> Subject: Re: [xml-dev] Validation vs performance - was Re: 
> [xml-dev] Fast text output from SAX?
> 
> 
> Alessandro Triglia wrote:
> 
> ... 
> If you have the following type definition:
> 
> --------------------------------------------------
> PersonInformation ::= SEQUENCE {
> 	name UTF8String,
> 	age [ATTRIBUTE] INTEGER OPTIONAL,
> 	age-elem [NAME AS "age"] INTEGER OPTIONAL,
> 	address UTF8String,
> 	...
> }
> --------------------------------------------------
> 
> the ellipsis ("...") indicates a position in the sequence 
> (usually but not necessarily at the end) where an "addition" 
> may occur.
> 
> A writer may add stuff after the "address" field.  PER wraps 
> the stuff and adds a length prefix right before it.  A reader 
> will be able to either decode the stuff (if it thinks it
> understands it) or skip it.  If a reader decides to relay the
> message, it can re-include the wrapped stuff in the message, 
> whether or not it understands the stuff.
> 
> Without the ellipsis, it is forbidden to add anything after 
> the address. With the ellipsis, the wrapping mechanism 
> ensures that any addition will be recognized as such, whether 
> or not it is understood.
>   
> That's cool, but I still don't understand how fields are 
> identified in a way that is usable and supportable in other 
> than tightly coupled environments.  (Where tightly coupled 
> includes standards like H.323, SS7, cell networks, etc.)

I haven't told the whole story.  I am not sure that everybody on this list
is willing to hear it.

Very briefly:  The ellipsis can be used in two different contexts.  The
example I gave illustrates extensibility over time (versioning of data type
specs).  ASN.1 also supports extensibility "across space" or across
applications, domains, etc., with a different (and more complex) syntax.
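
The wrapping mechanism for additions, described in the quoted text above,
can be sketched roughly as follows.  This is a toy framing, not the real
PER encoding: it assumes each chunk (the known root fields, then each
addition) is preceded by a one-byte length.  The names wrap, read_chunks
and relay are hypothetical.

```python
from typing import List


def wrap(chunk: bytes) -> bytes:
    """Prefix a chunk with a one-byte length (toy framing, not PER)."""
    if len(chunk) > 255:
        raise ValueError("chunk too long for a one-byte length prefix")
    return bytes([len(chunk)]) + chunk


def read_chunks(data: bytes) -> List[bytes]:
    """Split a message into its length-prefixed chunks."""
    chunks, pos = [], 0
    while pos < len(data):
        length = data[pos]
        chunk = data[pos + 1:pos + 1 + length]
        if len(chunk) != length:
            raise ValueError("truncated chunk")
        chunks.append(chunk)
        pos += 1 + length
    return chunks


def relay(data: bytes) -> bytes:
    """Re-emit a message, passing unknown additions through untouched."""
    root, *additions = read_chunks(data)
    # An old reader interprets only `root`; the additions stay opaque,
    # but the length prefixes still let it skip or forward each one.
    return wrap(root) + b"".join(wrap(a) for a in additions)
```

A reader that knows only the root fields can still find the end of every
addition, so it can skip the additions or relay them unchanged.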

ASN.1 can specify not only types, but also a set of abstractions whose main
purpose is to support the so-called "table constraint".  In short, a table
constraint relates two or more fields of a data structure based on their
contents.  It is a co-constraint, but does not constrain only the *values*
but also the *types* of those fields (starting from an "unspecified"
variable type called "open type" and constraining it to be a specific type).

A simple example of use of a table constraint is a SEQUENCE containing an
integer field and an open-type field (among others).  The table constraint
could link the open-type field to the integer field, so that, for example:

- if the integer field has the value 6, the open-type field must contain a
value of type MyTypeA;

- if the integer field has the value 47, the open-type field must not be
present in the instance;

- if the integer field has the value 17, the open-type field must contain a
value of type MyTypeB;

- if the integer field has the value 2, the open-type field must contain a
value of type SEQUENCE {a REAL, b REAL};

and so on.
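
The four rows above can be modelled, very roughly, as a lookup table from
the integer field to a check on the open-type field.  This is only a
sketch in Python, not ASN.1 notation: the predicates standing in for
MyTypeA and MyTypeB are assumptions (a string and a byte string), since
their definitions are not given here.

```python
ABSENT = object()  # row marker: the open-type field must be omitted


def is_my_type_a(value):
    # Stand-in for MyTypeA (assumed to be a string for illustration).
    return isinstance(value, str)


def is_my_type_b(value):
    # Stand-in for MyTypeB (assumed to be a byte string for illustration).
    return isinstance(value, bytes)


def is_real_pair(value):
    # Stand-in for SEQUENCE {a REAL, b REAL}.
    return (isinstance(value, dict)
            and set(value) == {"a", "b"}
            and all(isinstance(x, float) for x in value.values()))


# One row per value of the integer field.
TABLE = {6: is_my_type_a, 47: ABSENT, 17: is_my_type_b, 2: is_real_pair}


def satisfies_table_constraint(index, open_value=None, table=TABLE):
    """Check the integer field and the open-type field against the table."""
    if index not in table:
        return False  # no row for this index in a non-extensible table
    row = table[index]
    if row is ABSENT:
        return open_value is None
    return open_value is not None and row(open_value)
```

For example, satisfies_table_constraint(6, "x") holds, while
satisfies_table_constraint(47, "x") does not, because the row for 47
requires the open-type field to be absent.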

Note that this is just an example, and there may be more than 2 "columns" in
the table, and there may be fixed types other than INTEGER, and the variable
types may be of any complexity (there is still more).

The ellipsis can be used to indicate that the table (of the table
constraint) is dynamically extensible.  This means that any party can add
rows to the table or remove them, even at runtime.  Or you can write an application
that uses a different set of rows depending on the domain or on other
things.  Index fields (most commonly integers, or object identifiers, or
relative object identifiers, or strings) indicate what type is being used in
another field of the data structure.  If a party does not understand a
particular value of the index field, it can do whatever it finds appropriate
(or whatever a standard says).  Delaying the encoding or decoding of an open
type is a common event.  

It is usual for ASN.1 tools to support this mechanism directly.  A tool will
usually be able to resolve the value of an index field of a table constraint
to a row of the table and then automatically encode or decode the variable
types based on the selected row.
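
As an illustration of that behaviour (not the API of any particular ASN.1
tool), the sketch below resolves an index value to a row of a table that
can be extended at runtime, and delays decoding when no row is known yet.
The names Opaque, decode_open_field and resolve, and the byte layouts used
by the two sample decoders, are assumptions made for the example.

```python
import struct
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class Opaque:
    """An open-type value whose decoding has been delayed."""
    index: int
    raw: bytes


# Rows known to this party: index value -> decoder for the open type.
DECODERS: Dict[int, Callable[[bytes], Any]] = {
    6: lambda raw: raw.decode("utf-8"),
    2: lambda raw: dict(zip(("a", "b"), struct.unpack(">dd", raw))),
}


def decode_open_field(index: int, raw: bytes,
                      table: Dict[int, Callable[[bytes], Any]]) -> Any:
    """Decode the open-type field using the row selected by the index."""
    decoder = table.get(index)
    if decoder is None:
        # Unknown index: keep the bytes so they can be decoded or relayed later.
        return Opaque(index, raw)
    return decoder(raw)


def resolve(value: Any, table: Dict[int, Callable[[bytes], Any]]) -> Any:
    """Retry decoding of a delayed value, e.g. after a row has been added."""
    if isinstance(value, Opaque):
        return decode_open_field(value.index, value.raw, table)
    return value
```

A party that receives index 17 before it has a row for it gets an Opaque
value back; once it adds a row (table[17] = some_decoder), resolve()
decodes the retained bytes.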

> 
> If there exists an application as above and it is receiving 
> messages from multiple parties, say many different schools 
> submitting graduation records.  What if one school adds a 
> height field and another adds a languages field?  This 
> interface can see those fields without confusion and pass 
> them on to another tier?  Is there a way for this tier to 
> interpretively process those fields in a coherent way?
> 
> I think that you've shown that a receiver can handle 
> extensions from the one changer of formats which helps with 
> schema evolution to some degree, but it still doesn't seem 
> like loose coupling.
> 
> ... 
> <Sigh>  Make that "ASN.1 + xER + existing implementations".
> Message format + semantic sugar = API
> or
> Message format + protocol + semantic sugar = API
> Standard message format  implies  standard API
> Standard API  does not imply standard API.
>     
> Sorry, I still don't get it.  An API is not the same thing as 
> a data type. Why do you think they are the same thing?
> 
> Alessandro
>   
> A "remote procedure call" is the transfer of a message with 
> associated semantics that map that message to a method or 
> method stub and return the results with an associated 
> semantic.  A "protocol" is the transfer of messages with a 
> certain semantic where the processor of messages may
> eventually call a method.  Although distinct models, they 
> aren't necessarily that far apart semantically.  Even a local 
> method invocation can be represented as a pair of messages.  
> Some paradigms extend this: SQL servers generally receive a 
> single request and may respond with any number of responses.  
> SOA methods like message passing/oriented point to point and 
> publish/subscribe have much more rich possible communication patterns.
> 
> To be clear, you can have data types that are just data types 
> such as "Person" and data types that represent an RPC/message 
> of some type such as: NewPerson{ Person np }, 
> NewPersonReponse{ ID newid, Success ok }.
> 
> It's better to define the message traffic, ok to define the 
> protocol also, and marginally useful to describe the 2GL API.

I didn't ask you what a protocol is; I asked you why you think that an
API is the same thing as a data type.  Now you seem to think that a protocol
and a remote procedure call are also the same thing as an API and as a data
type?  Well, that's your opinion, but I profoundly disagree with that
equivalence.

Alessandro

> 
> sdw
> 
> -- 
> swilliams@hpti.com http://www.hpti.com Per: sdw@lig.net 
> http://sdw.st Stephen D. Williams 703-724-0118W 
> 703-995-0407Fax 20147-4622 AIM: sdw
>
