> John Cowan wrote
>
> > This is a request for comment from this mailing list (or anyone else)
> > on a proposal by Shigemichi Yazawa for a standard representation for
> > the Unicode control characters that are not legal in XML 1.0. See
> >
> > http://lists.w3.org/Archives/Public/www-xml-blueberry-comments/2002May/0000.html
A control character may be a character or an embedded signal (i.e. a PI) but it is certainly
not an element.
There is a long history of discussion of using elements to represent characters,
in particular for allowing user-defined characters, and these proposals have always
foundered because elements are too poor a fit where character data is required. There
have been many discussions on the W3C I18n IG, for example.
Furthermore, would we then have pre-control-expansion infosets and post-control-expansion
infosets, on top of the current pre-|post-[validation|namespace processing|XInclude|XML Schema
augmentation] mess?
It would be better to treat all the C0 and C1 controls as reserved special characters
which, like <, are not allowed as literals and are written as named references instead.
Or to allow numeric character references, but that is less tidy, because then people
would be tempted to mark up in code points rather than in characters. For example,
&BEL;
&NEL;
&EOT;
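
To make the current situation concrete, here is a minimal sketch in Python. It assumes only the Char production from the XML 1.0 Recommendation and the standard library's expat-based parser; the helper names is_xml10_char and parses are made up for illustration. It checks which of the three controls above are legal XML 1.0 characters, and shows that neither a numeric reference to an illegal control nor an undeclared name such as &BEL; is accepted today.

```python
import xml.etree.ElementTree as ET


def is_xml10_char(cp: int) -> bool:
    """Return True if code point cp matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def parses(doc: str) -> bool:
    """Return True if the standard-library parser accepts doc as well-formed."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


# The names are the ones used in the examples above; they are not
# predefined entities in XML 1.0.
CONTROLS = {"BEL": 0x07, "NEL": 0x85, "EOT": 0x04}

for name, cp in CONTROLS.items():
    legal = is_xml10_char(cp)
    by_number = parses(f"<a>&#{cp};</a>")   # numeric character reference
    by_name = parses(f"<a>&{name};</a>")    # undeclared named reference
    print(f"{name} U+{cp:04X} legal={legal} numeric={by_number} named={by_name}")
```

Note that most C1 controls, NEL included, already fall inside the XML 1.0 Char range, so the gap is really the C0 controls other than tab, LF and CR.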
Cheers
Rick Jelliffe