Michael Champion wrote:
> This gets into another point -- there is a widely-held belief
> by many (most?) XML developers/analysts that XML *is* a text
> markup format, so "XML objects" is an oxymoron. See
> http://lists.w3.org/Archives/Public/www-tag/2003Oct/0036.html
> for the most recent outbreak of the long running debate on
> this point.
I've read this discussion, started by Tim Bray, and I
must say that just as Tim Bray has disagreed "profoundly"
with some who dispute his contentions, I disagree profoundly
with Bray's position. While much of the history he cites is
accurate, his conclusions are questionable. He argues from
the specific to the general and ascribes cause to what may be
only anecdotal circumstance.
Bray observes that Internet standards, many of which have
been very successful, have relied almost exclusively on
specifications of concrete rather than abstract syntax. He
argues that there is a lesson to be learned here: that since
past successes have been based on concrete syntax, future
efforts should focus on concrete, not abstract, syntax. I
believe he is wrong.
The process of defining Internet standards has been very
ad hoc. When someone decides they have a need, they start a new
effort and put together a spec. These things are usually
defined largely in isolation from one another (there is no
real "Internet Architecture") and are typically done with a
goal of economizing specification effort. Since the goal of
such protocols is to permit interoperation, the specs
inevitably are heavily focused on concrete syntax -- the
minimum needed to ensure interoperability given the
environment.
Many developers of Internet protocols actually understand
their services to have very precise and well thought out
abstract models. However, a problem appears in translating
those models to concrete syntax. The problem is that there
are an essentially infinite number of ways to translate an
abstract model into concrete syntax. What you see in an RFC
or similar document is just one selection from the infinite
range of options.
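
To illustrate the point, here is a minimal sketch in Python. The
record, its field names, and the three encodings are hypothetical
and chosen only for illustration; none of them comes from an actual
RFC. The same abstract value (a host name and a port number) is
written out in three equally defensible concrete syntaxes:

```python
import json
import struct

# Hypothetical abstract value: a host name plus a port number.
record = {"host": "example.org", "port": 8080}

def as_header_lines(r):
    # Text lines in the "Key: value" style common to many Internet protocols.
    return f"Host: {r['host']}\r\nPort: {r['port']}\r\n".encode("ascii")

def as_json(r):
    # Compact JSON with sorted keys so the output is reproducible.
    return json.dumps(r, sort_keys=True, separators=(",", ":")).encode("utf-8")

def as_binary(r):
    # 1-byte length, the host bytes, then the port as 2 bytes, big-endian.
    host = r["host"].encode("ascii")
    return struct.pack("!B", len(host)) + host + struct.pack("!H", r["port"])

encodings = {
    "header": as_header_lines(record),
    "json": as_json(record),
    "binary": as_binary(record),
}

for name, data in encodings.items():
    print(name, data)
```

All three carry the same information, yet a parser written for one
is useless for the other two. Each choice is arbitrary from the
point of view of the abstract model.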
Because most protocols are defined in isolation from one
another and because many implementation efforts are focused on
a single protocol -- rather than a suite of protocols -- what
you end up with is a tower of Babel. Each protocol developer
defines concrete syntax based on personal experience and
preferences, and nobody really minds too much, since reuse
between protocols is more a cost to a new protocol definer
than something that reduces cost. Worse, each protocol
usually uses a slightly different method for defining the
concrete syntax -- thus, to understand a new protocol, you
must also learn the syntax and conventions used to
describe it.
Well, the groups that defined ASN.1 had a different view
of the world -- a view that has often been attacked as
overbroad. Nonetheless, it resulted in a different approach
to the problems of protocol design and implementation.
ASN.1 was defined as part of the ISO process and was
explicitly intended to be used and reused in a broad range of
protocols. It was expected that implementors would be
required to work on a broad range of protocols and that work
done on one protocol shouldn't be duplicated on another.
Since the peculiarities related to having massive numbers of
divergent concrete syntaxes were already a problem back in the
1980s, ASN.1 was chosen as a mechanism to largely remove the
concrete syntax problem from the domain of protocol design.
Much of the complexity of ASN.1, when compared to an
application specific protocol, thus results from this desire
to optimize the broad realm of protocol design rather than
some specific and short-term protocol development problem.
ASN.1 provides completely unambiguous translations
between concrete and abstract syntax and thus allows protocol
developers to focus, as they should, on their abstract models
rather than on the problem of how to express those models on
the wire or on disk. A suite of deterministic encoders and
decoders is provided, each optimized for one or another
environment and each specified to the point where there is
little risk of failure to interoperate between
implementations. The result is that an "ASN.1 developer"
hardly ever even thinks about concrete syntax -- because they
don't need to. Concrete syntax is something that can be
chosen by the implementor as a configuration option, either
globally or on an exchange-by-exchange basis. Concrete
syntax is not an essential part of the protocol design
process. One of the very nice side effects of this
deterministic translation is that developers can move from
one protocol to another and find that they aren't surprised by
or required to learn protocol-specific concrete syntaxes.
Once you learn ASN.1 and a bit about the encoder/decoders,
you don't need to learn anything new in order to understand a
new protocol. You could have a hundred protocols, each
designed by a different person, yet each would encode a
sequence or integer in exactly the same way...
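
As a rough sketch of what "exactly the same way" means, the Python
below hand-codes the DER rules for INTEGER and SEQUENCE (DER being
one of the standard ASN.1 encoding rules). The function names are
my own and hypothetical; a real project would normally use an ASN.1
compiler or library rather than code like this:

```python
def der_length(n):
    # Short form for lengths below 128, long form otherwise.
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body

def der_integer(value):
    # Tag 0x02; content is the shortest two's-complement representation.
    size = 1
    while True:
        try:
            content = value.to_bytes(size, "big", signed=True)
            break
        except OverflowError:
            size += 1
    return b"\x02" + der_length(len(content)) + content

def der_sequence(*elements):
    # Tag 0x30 (constructed SEQUENCE); content is the encoded elements in order.
    body = b"".join(elements)
    return b"\x30" + der_length(len(body)) + body

# Two unrelated (hypothetical) protocols that both happen to carry the value 300.
message_a = der_sequence(der_integer(5), der_integer(300))
message_b = der_sequence(der_integer(300))

print(message_a.hex())  # 30070201050202012c
print(message_b.hex())  # 30040202012c
```

The integer 300 appears as the octets 02 02 01 2c in both messages,
regardless of who designed the surrounding protocol.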
Tim Bray's conclusions might be valid if an alternative
didn't exist. As long as protocols are being specified in
isolation and reuse between protocols isn't valued, as long
as there is no way to translate deterministically from
abstract to concrete, then we're stuck having to concentrate
on the concrete syntax. But there is another way -- the
ASN.1 way. Agree on deterministic conversions from abstract
to concrete and remove most concerns about concrete encoding
from the design equation.
It should be noted that even though ASN.1 allows one to
define protocols abstractly, this doesn't mean that
implementors can't focus on the concrete syntax. There are,
for instance, many "quick and dirty" or simple
implementations of services originally defined in ASN.1 where
the developers haven't made use of reusable encoder/decoders
or ASN.1 compilers. What they've done is simply write code
that mimics what these tools would have generated. The point
is that with a system like ASN.1 you get the choice. You can
implement at the concrete level and use the ASN.1 definitions
in concert with an understanding of encoding rules to tell
you what code to write. Or, you can use a tool that does the
work for you. (Personally, I prefer letting tools do work for
me...)
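
For a flavor of the "quick and dirty" route, here is a small
hand-written Python sketch. It assumes a hypothetical definition
Point ::= SEQUENCE { x INTEGER, y INTEGER } carried in DER; the
type and the function names are invented for illustration, and the
reader handles only single-byte tags:

```python
def read_tlv(data, offset):
    # Read one tag-length-value triple; return (tag, value, next offset).
    tag = data[offset]
    length = data[offset + 1]
    offset += 2
    if length & 0x80:
        # Long form: the low 7 bits give the number of length octets.
        n = length & 0x7F
        length = int.from_bytes(data[offset:offset + n], "big")
        offset += n
    return tag, data[offset:offset + length], offset + length

def parse_point(data):
    # Hand-coded equivalent of what a generated decoder would do for Point.
    tag, body, end = read_tlv(data, 0)
    if tag != 0x30 or end != len(data):
        raise ValueError("expected a single SEQUENCE")
    tag_x, x_bytes, pos = read_tlv(body, 0)
    tag_y, y_bytes, pos = read_tlv(body, pos)
    if tag_x != 0x02 or tag_y != 0x02 or pos != len(body):
        raise ValueError("expected exactly two INTEGERs")
    return (int.from_bytes(x_bytes, "big", signed=True),
            int.from_bytes(y_bytes, "big", signed=True))

print(parse_point(bytes.fromhex("30070201050202012c")))  # (5, 300)
```

Nothing here required a tool, only the type definition and a
working knowledge of the encoding rules.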
I'm sure this too-long note won't put an end to the
concrete vs abstract debate that Tim Bray has reopened...
Such issues are so much fun to debate that people won't let
them go away.
bob wyman