OASIS Mailing List Archives

   Re: [xml-dev] Why would MS want to make XML break on UNIX, Perl, Python


> Or we find an interoperable way to transport/encode the control
> characters (agree on entities or char references or PIs).

I would very much prefer to do this than to allow those naked codes to appear 
in text. I support the idea of finding a way to encode such data, rather than 
include it per se (as per Derek's suggested change in focus).

Numeric character references (of the form &#nnnn; or &#xhhhh;) are essentially 
the same as the literal data (once parsed, the distinction is lost), so I would 
not support their use.
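
To illustrate the point about the distinction being lost, here is a small sketch using Python's standard xml.etree.ElementTree parser. The element name and sample text are made up for the example, and a tab is used because it is one of the few control characters XML 1.0 already permits.

```python
import xml.etree.ElementTree as ET

# The same content written two ways: a literal tab, and a numeric
# character reference to the tab (code point 9).
literal = ET.fromstring("<a>x\ty</a>").text
referenced = ET.fromstring("<a>x&#9;y</a>").text

# After parsing, the application sees identical strings and cannot tell
# which form the author used.
same = literal == referenced
print(same)
```

So an application that needs to know a control character was deliberately encoded gets no help from a character reference.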

PIs, while being one mechanism, are application-specific, so are probably 
not ideal.

That leaves us with entities. Perhaps something along the lines of creating a 
"virtual" entity set in the &Unnnn; space? This was suggested in the ERCS:

  "an XML 1.1 processor may interpret entity references beginning with the
   letter 'U', followed by 4 hexadecimal characters as representing an
   entity holding the representation of the Unicode Scalar equivalent of 
   the number."

This would provide a standard naming scheme for entities representing code 
points, but leave the exact resolved value undefined. No value is necessary 
anyway, as the entity reference provides all the needed information.
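
As a rough sketch of how an application might read such references without ever resolving them to a value, assuming the &Unnnn; naming from the quoted text (the function name and regular expression below are hypothetical, not part of any XML processor):

```python
import re

# Matches a reference whose name is 'U' followed by exactly 4 hex digits,
# as in the quoted wording. Purely illustrative; no parser defines this.
U_ENTITY = re.compile(r"&(U([0-9A-Fa-f]{4}));")

def find_code_point_refs(text):
    """Return (entity name, code point) pairs for each &Unnnn; in text."""
    return [(m.group(1), int(m.group(2), 16)) for m in U_ENTITY.finditer(text)]

refs = find_code_point_refs("bell &U0007; then escape &U001B; and &amp;")
print(refs)
```

The name alone carries the code point, so nothing here depends on what the entity would expand to. Ordinary references such as &amp; are left alone.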



Copyright 2001 XML.org. This site is hosted by OASIS