> The interoperability is partly due to the fact that the content
> consists of Unicode characters, which have widely agreed on
> semantics as documented in Unicode and ISO 10646. However, the
> C0 controls do *not* have such widely agreed on semantics (what
> do ETX and EOD mean to you today?). And in general binary data
> is less interoperable than textual data. Thus it has no place
> in XML.
We often use XML to transport information whose semantics we do not
understand.
<a></a>
may well be gibberish, but so might
<a>oaaosiuc</a>
Both might mean something to somebody. It's not my job to judge; I'm only
the messenger.
>
> If you need to interchange binary data (and we all do) that's fine,
> but don't claim doing so is interoperable and don't try to dress
> it up in XML clothes unless you're willing to base64 it or otherwise
> clearly mark it as an opaque blob.
For occasional C0 characters appearing in the middle of printable text, the
XML character reference mechanism seems to be a good way of doing just that.
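
A minimal sketch of that approach in Python, for illustration only. It assumes the receiving parser accepts character references to C0 controls: XML 1.0 rejects them even as references (apart from tab, LF and CR), while XML 1.1 allows all of them except NUL. The function name escape_c0 is hypothetical.

```python
def escape_c0(text: str) -> str:
    """Escape markup characters and write C0 controls as character references.

    Assumption: the receiver accepts references such as &#x3; (true for
    XML 1.1, not for XML 1.0).
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if ch == "&":
            out.append("&amp;")
        elif ch == "<":
            out.append("&lt;")
        elif ch == ">":
            out.append("&gt;")
        elif cp == 0:
            # NUL is not allowed in any XML version, even as a reference.
            raise ValueError("U+0000 cannot be represented in XML")
        elif cp < 0x20 and ch not in "\t\n\r":
            # C0 control other than tab, LF, CR: emit a hex character reference.
            out.append(f"&#x{cp:X};")
        else:
            out.append(ch)
    return "".join(out)
```

Ordinary printable text passes through unchanged, so the content stays readable and only the occasional control character is marked up.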
> Wouldn't it be about the
> same amount of work, and a lot cleaner, just to throw this
> stuff into base64?
If C0 characters only occur in 0.01% of the character strings that you
actually transmit, then base64 encoding is a heavy price to pay.
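
To put rough numbers on this: base64 expands every string by about a third, whereas a character reference costs a few extra bytes only where a control character actually occurs. A small sketch, using a made-up sample string (the payload and variable names are purely illustrative):

```python
import base64

# Hypothetical payload: printable text containing a single ETX (U+0003).
sample = "status report" + "\x03" + " end of record"
raw = sample.encode("utf-8")

# Cost of base64-encoding the whole string.
b64_len = len(base64.b64encode(raw))

# Cost of replacing only the control character with a character reference.
ref_len = len(sample.replace("\x03", "&#x3;").encode("utf-8"))

print(len(raw), b64_len, ref_len)
```

For a string with no control characters at all, the character reference approach adds nothing, while base64 still pays the full expansion and makes the text unreadable.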
Michael Kay
Software AG
home: Michael.H.Kay@ntlworld.com
work: Michael.Kay@softwareag.com