John Cowan <email@example.com> wrote:
> Rick Jelliffe scripsit:
> > That makes it clear that control characters are unlike other characters,
> > for which Unicode provides "semantics". The only C0 or C1 characters for
> > which Unicode provides "semantics" are TAB, CR, LF and NEL.
> XML already, however, allows the use of undefined codepoints, which have
> far less semantics than the C0 controls. And a good thing too, or
> Ethiopic and Thaana and Canadian Aboriginal Syllabics would be totally
> locked out of XML (they are post-Unicode-2.0) instead of merely
> banned in XML names.
Undefined codepoints have the semantic of "potential site for a future
Unicode character codepoint". It seems to me unlikely that Unicode will
assign any additional character semantics to the C0 and C1 blocks, making
the allowance for C0 controls in XML of dubious value as a "future-proofing"
-Peter S. Housel- firstname.lastname@example.org http://members.home.com/housel/
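
For readers who want to poke at the distinction being argued here, a minimal sketch follows. It assumes the XML 1.0 Char production (#x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]) and uses Python's unicodedata module, whose answers depend on the Unicode version bundled with the interpreter. The function names is_xml10_char and describe are illustrative only and appear nowhere in the thread.

```python
import unicodedata


def is_xml10_char(cp: int) -> bool:
    """Return True if the codepoint matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def describe(cp: int) -> tuple:
    """Pair XML 1.0 admissibility with the Unicode general category.

    Category "Cc" marks a C0/C1 control; "Cn" marks a codepoint that is
    unassigned in the Unicode version this interpreter ships with.
    """
    return (is_xml10_char(cp), unicodedata.category(chr(cp)))


if __name__ == "__main__":
    # U+0001: C0 control, U+0009: TAB, U+0085: NEL, U+1200: an Ethiopic syllable
    for cp in (0x01, 0x09, 0x85, 0x1200):
        allowed, category = describe(cp)
        print(f"U+{cp:04X} xml10_char={allowed} category={category}")
```

Under these assumptions the check is purely range-based: XML 1.0 admits any codepoint inside the listed ranges whether or not Unicode has assigned it, which is why blocks added after Unicode 2.0 are usable as character data, while most C0 controls fall outside the ranges altogether.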