On Nov 22, 2003, at 6:54 PM, Alaric B Snell wrote:
> Jonathan Borden wrote:
>> On the contrary, I'd say that only those folks who are concerned with
>> binary goo are concerned with encodings. Folks that deal with text,
>> i.e. XML, don't generally have to be concerned (to a significant
>> extent) with encodings, which is the *big win* of XML.
> Hmm? XML is just an encoding.
That's the crux of this thread. Is the message that is transmitted
*what is transmitted*, or rather *something else* that is encoded in the
message? If you are primarily looking from the vantage point of an
application which communicates with another application *as if via
RPC*, then the interface is primary and the bits on the wire are
secondary. On the other hand if you are sending a document from one
place to the other, then the document is primary. XML was not designed
to be the "perfect" RPC protocol. XML remains a great way to "encode"
documents -- saying this seems like a tautology -- because to a large
extent the XML *is* the document.
>> Right, so you couldn't care less whether a number is encoded as any of:
>> 1) big endian floating point
>> 2) little endian double
>> 3) big endian 64 bit integer
>> XML couldn't care less about these binary details. XML could just
>> as easily deal with 203bit integers as 23 bit integers. These binary
>> details just don't matter.
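For concreteness, here is a minimal sketch of the three binary layouts listed above next to a plain textual form. The value 1234 and the variable names are arbitrary choices for illustration, not anything from the thread.

```python
import struct

# Three binary layouts for the same number, matching the list above.
big_endian_float = struct.pack(">f", 1234.0)      # 4 bytes, IEEE 754 single
little_endian_double = struct.pack("<d", 1234.0)  # 8 bytes, IEEE 754 double
big_endian_int64 = struct.pack(">q", 1234)        # 8 bytes, signed integer

# The textual form that would appear as character data in a document.
as_text = str(1234).encode("ascii")

layouts = [
    ("big-endian float", big_endian_float),
    ("little-endian double", little_endian_double),
    ("big-endian int64", big_endian_int64),
    ("text", as_text),
]
for label, data in layouts:
    # Print each layout as hex so the byte-level differences are visible.
    print(f"{label:22} {data.hex()}")
```

A reader of the binary forms has to know which layout was used before the bytes mean anything; the textual form shifts that question to how the characters are encoded and which lexical forms the receiver accepts.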
> Ah! You're saying that there are several representations of a number
> in binary but only one in text, right? Wrong. Some textual number
> parsers will accept 2.3E5 as a number, some won't. There are different
> encodings for numbers like infinity.
> And consider how one encodes dates in XML. And how do you encode a
> person's address book entry? With <person><name>...
> <email>...</person> or with <addressBookEntry name="..." email="..."
Now you are dealing with so-called XML _datatypes_, which only exist in
terms of applications layered on XML such as XML Schema or RDF. What I
am saying is that _for XML_ any so-called datatype is just another
piece of XML.
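The remark about 2.3E5 can be checked with a small sketch. It compares Python's built-in float() against a hypothetical decimal-only parser (parse_decimal_only and its regular expression are made up for this example, loosely in the spirit of a decimal type that has no exponent form). The sample strings are also illustrative.

```python
import re

# Hypothetical strict parser: plain decimals only, no exponent, no INF.
DECIMAL_ONLY = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)")

def parse_decimal_only(text):
    """Parse text as a number, rejecting anything but plain decimals."""
    if DECIMAL_ONLY.fullmatch(text) is None:
        raise ValueError(f"not a plain decimal: {text!r}")
    return float(text)

def accepts(parser, text):
    """Return True if the parser takes the text without raising ValueError."""
    try:
        parser(text)
        return True
    except ValueError:
        return False

for sample in ["230000", "2.3E5", "INF", "Infinity"]:
    # Show which lexical forms each parser is willing to read.
    print(sample, accepts(float, sample), accepts(parse_decimal_only, sample))
```

Both parsers agree on "230000", but only float() accepts the exponent form and the spellings of infinity. Which lexical forms count as a number is decided by the layer above XML, not by XML itself.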
> It's wrong wrong wrong to say that in XML there is never any *choice*
> of how you 'encode' some abstract information. XML mandates a single
> way of encoding tree structure, but that's it. How you represent
> integers and dates is constrained a bit by schema languages, sure,
> but the same is true of binary encodings like BER.
> And don't forget that in XML you have to care how your characters are
> encoded - big endian or little endian UTF-16? :-)
>> Right, so just deal with text and be done with all these trivial
>> concerns about byte order etc.
> Sorry, the XML spec mandates that parsers be able to read UTF-16
> encoded XML, which means you *do* need to be concerned with trivial
> concerns about byte order.
well yes ... sigh ... I have somewhat tracked these character issues
from time to time, but frankly leave these issues to the XML
cognoscenti as well as my parser.
To a very large extent *I* don't have to be concerned with these
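As a rough illustration of leaving byte order to the parser, the sketch below serializes one small document as big-endian and as little-endian UTF-16, each with a byte order mark, and hands both to Python's xml.etree.ElementTree. The document content is made up, and this only shows one parser's behavior.

```python
import codecs
import xml.etree.ElementTree as ET

text = "<greeting>hi</greeting>"

# Same characters, two byte orders, each prefixed with its byte order mark.
big = codecs.BOM_UTF16_BE + text.encode("utf-16-be")
little = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

# The bytes on the wire differ...
print(big[:6].hex(), little[:6].hex())

# ...but the parser uses the byte order mark to pick the right decoding,
# so both inputs yield the same tree.
root_big = ET.fromstring(big)
root_little = ET.fromstring(little)
print(root_big.tag, root_big.text, root_little.tag, root_little.text)
```

The application code after parsing is identical in both cases, which is the sense in which the byte order question is handled by the parser rather than by the person using it.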
> And if you want to use an off-the-shelf XML parser to prevent you from
> having to worry about those details - then use an off-the-shelf BER
> parser so you don't need to care about endianness in binary, either.
Off the shelf BER parser ... where do I get one of those ... do I have
to install it on my machine? Will the person receiving the message
understand BER? Frankly I am sure that ASN.1++ could have solved all
the technical issues ... this discussion seems oddly analogous to
TCP/IP vs. OSI as a network protocol. The real issue is *mindshare* for
which ASN.1 doesn't compare with XML ... I assume you et al. are trying
to correct that.
If you are trying to convince me (and I *am* someone who might be
convinced) you are going to need to:
a) make ASN.1 as easy as XML to work with
b) make it as easy for the mythical "grad student" to write an ASN.1
parser as it is to write an XML parser
c) talk to me in my language
>> The crux of the issue is whether you are more concerned with being:
>> a) abstract "high level"
>> b) bitwise efficient
>> The only reason to be concerned with binary goo is if you have an
>> overriding concern regarding "efficiency"
> Well, no, most people develop binary formats because they're simpler
> than XML, and they'd rather be getting on with writing their
> application than bothering with DTDs and SAX and DOM and stuff, in my
> experience. The only reason to be concerned with XML goo is if you have
> an overriding concern regarding 'being able to view and edit the files
> in a text editor'!
Well fine, I expect that the market for binary applications will
continue to exist -- how does this so-called "binary XML" help with any
> [you write apps in XSLT]
>> What does this have to do
>> with binary encodings etc? What does "encoding syntax" have to do
>> with any of this?
> Nothing - because XSLT is actually hiding the implementation of that
> XML data from you; you just access the tree with it. The XSLT engine
> you use could quite easily operate on a binary encoding syntax or
> something and you wouldn't need to rewrite the XSLT. This is what Bob
> is saying is a Good Thing. Don't you agree with him? :-)
Theoretically, sure... I've made that argument at least since 1998 when
I wrote an "XML parser" for DICOM ... I think I lost it (really!)
because although it makes a great theoretical point, it isn't practical
... you generally need to rewrite your XSLT for any significant change
in document format. Another example of this encoding transform trick is
XMTP http://www.openhealth.org/xmtp . This discussion is really a
rehash of the SGML "GROVE" concept.
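The "same tree, different encoding syntax" idea can be sketched in a few lines. The binary format below is invented for this example (length-prefixed tag and text, then a child count), it ignores attributes and tail text, and it bears no relation to DICOM, BER, or XMTP. The names() function stands in for a stylesheet that only ever sees the tree.

```python
import struct
import xml.etree.ElementTree as ET

def encode(elem):
    """Hypothetical binary form: length-prefixed tag and text, then children."""
    tag = elem.tag.encode("utf-8")
    text = (elem.text or "").encode("utf-8")
    out = struct.pack(">H", len(tag)) + tag
    out += struct.pack(">H", len(text)) + text
    out += struct.pack(">H", len(elem))  # number of child elements
    for child in elem:
        out += encode(child)
    return out

def decode(data, pos=0):
    """Rebuild an Element from the hypothetical binary form."""
    def read_str(at):
        (n,) = struct.unpack_from(">H", data, at)
        start = at + 2
        return data[start:start + n].decode("utf-8"), start + n

    tag, pos = read_str(pos)
    text, pos = read_str(pos)
    (count,) = struct.unpack_from(">H", data, pos)
    pos += 2
    elem = ET.Element(tag)
    elem.text = text
    for _ in range(count):
        child, pos = decode(data, pos)
        elem.append(child)
    return elem, pos

def names(root):
    """Tree-level query; it never sees which syntax the tree came from."""
    return [e.text for e in root.iter("name")]

xml_text = ("<people><person><name>Ada</name></person>"
            "<person><name>Grace</name></person></people>")
from_xml = ET.fromstring(xml_text)
from_binary, _ = decode(encode(from_xml))
print(names(from_xml), names(from_binary))
```

The query is untouched by the change of syntax, which is the theoretical point. It still has to be rewritten if the element names or nesting change, which is the practical objection raised above.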