W. E. Perry wrote:
> draw a distinction between XML and any platonic
> abstract syntax such as ASN.1 and the like.
Yes, I'm a "platonist" as far as this discussion goes and I
suspect that many of the ASN.1 supporters are as well. I assume we're
talking about the "cave" here... In the cave, the schema is the form
and the shadows on the wall are the XML, BER, PER, etc. I may block
the light with my fingers and cause what appears to be the shadow of a
rabbit to appear on the wall. You may choose to see it as a rabbit and
that might be my intent, but then again it might not. You may see a
rabbit when what I wanted you to see was something different --
perhaps just a demonstration that even after 50 years, my fingers are
still flexible enough to bend as shown in the shadow. But whatever my
intent, and whatever you believe you see, I will always know that it
is merely the shadow of my fingers.
But, I'm also Aristotelian in that I think that "The Rhetoric"
is relevant reading whenever one is dealing with communications
systems. As Aristotle teaches, it is important for the speaker to
understand the warrants of his audience. Both speaker and audience
must share a common understanding of the speech and the context in
which it is made -- otherwise, what is heard may not be what was said.
If I wish to show flexible fingers, yet you see rabbits, we have not
communicated... Schemas help us communicate and share warrants -- they
are the ground, implicit or explicit, upon which all communications
are built. Of course, by communications, I mean "utterances" that are
made with "intent." I do not believe it useful to consider all
processes which cause modifications in the entropy of remote systems
to be "communications".
> the 50 years or longer struggle in the 20th century
> that was required for classical philology to
> understand the nature of oral poetry demonstrates
> why the physical, rather than any abstract nature of
> a text is worth insisting upon.
You declare here the end of a battle that still rages. Perhaps
*you* have accepted this view, but many others -- including myself --
have not. But, the subject here is interprocess communication -- not
poetry. Thus, I will resist the temptation to flame about what passes
as teaching in the literature departments of today's universities...
> ASN.1 and abstract syntax generally are incapable
> of a precise and unambiguous encoding of inherent
> fundamental textual properties without resorting
> to a priori agreements between the creator and the
> consumer of a document, and from the very nature of
> document processing such agreements are unreliable
> and negligible.
This is simply not true. Technical inaccuracies do little to
advance your cause... There is no question that an ASN.1 schema can be
defined that encodes, without loss, all of the immanent qualities of
any particular piece of text. And, it need not do it with any more
ambiguity than exists in a textual encoding where questions of
spelling, lexicography, the formation of letters, etc. all introduce
ambiguity into most texts. The difference here may be between the
"explicit" schemas which are presented with languages like ASN.1 and
the "textual" encodings which typically rely on "implicit" schemas
that are the sum of massive amounts of schema-like material in the
form of shared experiences, agreement on language, the semantics of
words, context, etc.
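
To make the "without loss" claim concrete, here is a minimal sketch in Python. It assumes the text is carried as an ASN.1 UTF8String (universal tag 12) in a BER-style tag-length-value encoding. The function names are invented for illustration, the code is hand-rolled rather than taken from any ASN.1 toolkit, and it handles only short-form lengths (payloads under 128 bytes).

```python
UTF8STRING_TAG = 0x0C  # universal class, primitive, tag number 12


def encode_utf8string(text: str) -> bytes:
    """Encode text as tag + short-form length + UTF-8 payload (hypothetical helper)."""
    payload = text.encode("utf-8")
    if len(payload) > 127:
        raise ValueError("this sketch only handles short-form lengths (< 128 bytes)")
    return bytes([UTF8STRING_TAG, len(payload)]) + payload


def decode_utf8string(data: bytes) -> str:
    """Decode a value produced by encode_utf8string (hypothetical helper)."""
    if len(data) < 2 or data[0] != UTF8STRING_TAG:
        raise ValueError("not a UTF8String")
    length = data[1]
    if length > 127 or len(data) != 2 + length:
        raise ValueError("unsupported or inconsistent length")
    return data[2:2 + length].decode("utf-8")


# Leading spaces, leading zeros, a non-ASCII letter, and a trailing newline
# all survive the round trip unchanged.
original = "  00123 na\u00efve text\n"
encoded = encode_utf8string(original)
assert decode_utf8string(encoded) == original
```

Nothing in the binary form forces trimming or normalization. Whether such things happen is a matter of what the schema declares, not of the encoding.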
> the fundamental distinction of document and
> data to which all permathreads return
XML, ASN.1, BER, etc. are all used, in one application or
another, to deal with both documents and data. You seem to argue for
rules that are mostly applicable to these things you call documents --
to the exclusion of solutions that make data interchange as easy and
accurate as it might be. It seems that to maintain this position, you
would have to insist that XML is an inappropriate tool for "data
applications." Yet, you appear to be taking the opposite position:
insisting that all XML-based data systems be forced to accept the
"fundamentally" different constraints of a document-oriented system.
How can you justify this? Why do you care what the "data people" do if
your concern is so clearly focused on non-data things?
If data and document are as fundamentally different as you say
they are, then we, and you, should allow and encourage distinct sets
of rules and tools for dealing with each. (But, no more distinct than
necessary! Reuse is good...) Perhaps with "documents" one would not
use schemas -- but with data it is often an exceptionally good idea to
do so.
Given that XML is used for both "documents" and for data, I
suggest that the folk concerned with documents would be well advised
to define schemas for their work -- if only to prevent the "data"
people from making the mistake of thinking that they understand some
"implied" schema in the documents. For instance, if you are one of
those who insist that an element whose value is "00123" should be
passed *with* the leading zeros intact, then define a schema that
explicitly defines the element as a character string or use a schema
language that lets you say "don't trim leading zeros." At the same
time, if I am working with data which a schema says is an INTEGER --
whether or not it has leading zeros -- then I should be free to ignore
these pesky wasted bits. I.e., "The Schema shall set us free!" by
allowing us to make the document/data distinction when necessary.
(Note: I do agree that by default, in the absence of an explicit
statement in a schema, the values in an arbitrary chunk of XML should
be treated as character strings.)
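
A minimal sketch of that rule in Python, using the standard xml.etree.ElementTree module. The element names (partNumber, quantity), the SCHEMA mapping, and the read_value helper are all invented for illustration. A real system would take its types from an actual schema language rather than a dictionary.

```python
import xml.etree.ElementTree as ET

# Hypothetical "schema": only elements listed here have a declared type.
SCHEMA = {"quantity": int}


def read_value(element, schema):
    """Return the element's value using its declared type, or str by default."""
    text = element.text or ""
    declared_type = schema.get(element.tag, str)
    return declared_type(text)


doc = ET.fromstring(
    "<order><partNumber>00123</partNumber><quantity>00123</quantity></order>"
)
values = {child.tag: read_value(child, SCHEMA) for child in doc}

# partNumber has no declared type, so its leading zeros are preserved.
assert values["partNumber"] == "00123"
# quantity is declared as an integer, so the leading zeros carry no meaning.
assert values["quantity"] == 123
```

The same five characters yield two different values, and it is the schema, not the markup, that says which one was meant.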
Perhaps in this conflict between "document" and "data" we are
like the English and American peoples were in the eyes of Winston
Churchill -- Churchill said we are: "two peoples separated by a common
language." If so, let us come together.