   Re: [xml-dev] Some comments on the 1.1 draft

On Wednesday 19 December 2001 02:24 am, Alan Kent wrote:
> To separate the two issues - I have no opinion on name characters.
> PCDATA however is different. I read through your entire post twice
> and must admit I still don't quite understand what your point is
> exactly. I *think* you might be saying "it's good to specify the
> encoding because that way it's possible to make sure characters
> not valid in that encoding are rejected." (My reading of the XML spec
> is that 0x85 is legal in the Unicode character set - that is, it's
> not marked as UNUSED in the good old SGML jargon.)
> If this is your point, then would it be possible to define a new
> encoding which permitted the full range of Unicode characters
> (including control characters which are valid in Unicode)?
> Would that address your issues?

The point is that characters != bytes != encoding. If you start allowing
control characters (which are somewhat debatable *as* characters in the first
place), it becomes very easy to abuse that freedom for application-specific
embedded encodings. This is effectively what Mr. Rhys from MS wanted: the
ability to store arbitrary binary streams inside XML-encoded data.

The problem is that XML is *text*. It is made from *characters*, and 
arbitrary binary strings have no place in it. Once you change that, you have 
essentially ruined XML as a textual markup language.

People could say that NUL et al. are still *characters* and so would be fine, 
even in UTF-8 encoded documents, but I bet they'd be rather unhappy to find 
their binary streams changing if I saved the document as UTF-16.
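
A rough Python sketch of that last point. The payload bytes and the
byte-to-character mapping (each byte becomes the character with the same
code point) are my own assumptions for illustration, not anything proposed
in this thread:

```python
# Hypothetical payload: four raw bytes someone wants to embed in a document.
payload = bytes([0x00, 0x01, 0x85, 0xFF])

# Assumed mapping: treat each byte as the code point of the same number
# (U+0000 through U+00FF), so the payload becomes a string of characters.
text = "".join(chr(b) for b in payload)

# Serialise the same characters in two different encodings.
utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")

print(payload.hex())  # 000185ff
print(utf8.hex())     # 0001c285c3bf
print(utf16.hex())    # 000001008500ff00
```

The characters are the same in both cases and decode back to the same
string, but neither byte sequence matches the original payload, and the two
differ from each other. Anything that reads the bytes of the file directly
sees something different after a re-save.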

The point here is that binary data smuggled in as control characters is
unreliable: it does not survive a change of encoding.



Copyright 2001 XML.org. This site is hosted by OASIS