On Tue, Dec 18, 2001 at 04:54:39PM -0800, Michael Rys wrote:
> Tim, with all due respect, but allowing #x0-#x1F inside element and
> attribute content would tremendously help users of XML that use non-XML
> string sources for their data and map it into XML without losing
> fidelity and without having to base64-encode otherwise normal strings.
> Most of these applications do not care about the semantics of ETX or
> EOM, but just that they are being preserved over the XML serialization.
> Example applications are: SOAP, XML database serializations etc.
> Best regards
I absolutely agree. Here is another concrete example, one that bit me
recently. We have an application with a Unicode string type.
It's a string; it's not XML. We then want to do things with that string.
We want to store it in a database using XML as our object format.
We want to ship it across a SOAP API (which uses XML to encode packets).
We want to store it in a WebDAV property (again, uses XML).
The problem is we have to say "Yes, it's a Unicode string. Yes, your string
has nothing to do with XML. But you cannot use all of Unicode, because
all these other protocols are built on XML and XML does not allow all
Unicode values. So your strings must not contain control characters."
This would all go away if XML did not explicitly define specific numeric
values of characters (code points, whatever) as being invalid.
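
To make the restriction concrete, here is a minimal Python sketch of the
check an application ends up running before it can hand a string to an XML
layer. It assumes the XML 1.0 Char production (#x9 | #xA | #xD |
[#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]); the function name
find_xml_forbidden is hypothetical and not part of any library.

```python
import re

# Matches any character outside the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_XML10_FORBIDDEN = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def find_xml_forbidden(text):
    """Return (index, code point) pairs for characters XML 1.0 cannot carry.

    Hypothetical helper for illustration only.
    """
    return [(m.start(), ord(m.group())) for m in _XML10_FORBIDDEN.finditer(text)]


if __name__ == "__main__":
    # ETX (U+0003) is a perfectly good Unicode character, but not an XML one.
    for index, code_point in find_xml_forbidden("ok\x03text"):
        print(f"index {index}: U+{code_point:04X} is not allowed in XML 1.0")
```

Any string for which this returns a non-empty list has to be rejected,
altered, or encoded some other way before it can be stored or shipped.
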
For some of these protocols (WebDAV for example), base64 encoding is
not always an available option for every place you can insert a string.
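
For places where base64 is an option, the workaround looks roughly like the
Python sketch below. It is an illustration under assumptions, not how any
particular protocol does it: the "encoding" attribute and the helper names
to_element and from_element are made up for this example, and the receiver
has to know the convention in order to decode the value.

```python
import base64
import xml.etree.ElementTree as ET


def to_element(name, value):
    """Wrap a Unicode string in an element, base64-encoding its UTF-8 bytes.

    The "encoding" attribute is a hypothetical marker for this example.
    """
    elem = ET.Element(name, {"encoding": "base64"})
    elem.text = base64.b64encode(value.encode("utf-8")).decode("ascii")
    return elem


def from_element(elem):
    """Reverse to_element: decode the base64 text back to the original string."""
    return base64.b64decode(elem.text).decode("utf-8")


if __name__ == "__main__":
    original = "a\x03b"  # contains ETX, which XML 1.0 forbids as a literal
    wire = ET.tostring(to_element("s", original))
    restored = from_element(ET.fromstring(wire))
    print(restored == original)
```

The cost is that every "otherwise normal" string becomes opaque to anything
reading the XML directly, which is the loss the quoted message objects to.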